August 2008 Archives

The Twitter Unicode Test: A


Just for kicks, I decided to run Twitter through some Unicode tests, and I give it an A. For the record, I pretty much knew from Twittervision that it supported a lot of encodings, but I threw in a few more exotic tests...just to see.

The first was my standard phonetic character test...from a Mac. As far as I'm concerned, you have to pass this to be a serious global Unicode contender. I also threw in a long vowel (ā) and the one Hebrew word I can type (שבלת), or "shibboleth," to confirm right-to-left support.
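As a rough illustration of what these test characters exercise, here is a small Python sketch that looks up each character's Unicode name and bidirectional class. The helper name `describe` is made up for this example; the `unicodedata` calls are from the Python standard library.

```python
import unicodedata

def describe(text):
    """Return (character, Unicode name, bidi class) for each character."""
    return [
        (ch, unicodedata.name(ch), unicodedata.bidirectional(ch))
        for ch in text
    ]

# The long vowel and the Hebrew test word from the post.
for ch, name, bidi in describe("\u0101\u05e9\u05d1\u05dc\u05ea"):
    print(f"U+{ord(ch):04X} {name} bidi={bidi}")
```

A bidi class of "R" marks a right-to-left character, while "L" marks a left-to-right one, so the Hebrew letters and the Latin long vowel fall into different classes.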

What impressed me, oddly, was the support for entity codes like &eacute; (é) and &#x0909; (उ). Twitter can accept raw é, or it can take &eacute; and convert it to é. This differs from other modern tools like Facebook or XML, which can only accept raw Unicode input (entity codes break).

Accepting either format is probably a pain to program, but very nice for the user. Having to remember when to enter entity codes and when to enter raw Unicode is confusing, but still an all-too-common reality. I appreciate Twitter for making the transition a little easier...even if it's only for 140 characters.
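A minimal sketch of the "accept either format" idea in Python. This is not Twitter's actual code, just one way a tool might normalize input. The function name `normalize_entities` is hypothetical; `html.unescape` is part of the Python standard library.

```python
import html

def normalize_entities(text):
    """Convert HTML entity codes to raw Unicode.

    Raw Unicode characters pass through unchanged, so the same
    function handles both styles of input.
    """
    return html.unescape(text)

raw_input = "caf\u00e9 \u0909"           # raw é and उ
entity_input = "caf&eacute; &#x0909;"    # the same text as entity codes

print(normalize_entities(raw_input) == normalize_entities(entity_input))
```

One caveat with this approach: a user who really wants to display the literal text of an entity code would need some way to escape it, which may be part of why many tools accept only one format.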

Screen capture of Twitter messages with Unicode Characters in test messages


Hebrew Computing Listserv


If you are working with Hebrew, a helpful list may be the Hebrew Computing User Group on Yahoo. You have to join the list to see the messages, but they do cover a wide range of topics.

For other resources, you can check the Penn State Hebrew Computing Information Page (which, by pure coincidence, I edit).


Chinese Olympic Pictograms


One of the more interesting "color" pieces on the U.S. Olympics coverage on NBC was a piece on how the icons for the different sports were inspired by early Chinese pictograms, which were the precursors of the modern Chinese characters.

You can read a bit more about the design process in this article from the People's Daily Online.

The use of ancient art for modern Olympic pictograms is not new (see the entries from Athens, Sydney, Lillehammer and Salt Lake City), but I think this was the first time it made it to television.


Yale Chinese Support Site


Despite some of my previous entries, it's a fact that I really know very little about Chinese writing (I think I can recognize the characters for 1, 2 and 3). But if I really had to figure out what was going on, the first place I would probably go is Yale Chinese Mac, which started back in the Mac Classic days.

Ironically though, the site is no longer just Chinese on a Mac, but includes information on Chinese on Windows, Chinese on Palm Pilot, encodings, free fonts and more. Many mysteries can be resolved here. If only I could find one of these for every script!



Math Magic Equation Editor & Unicode Fonts


One challenge for math is laying out the actual equations like this integral below.

Integral of C sub v ( T ) d T from T to T sub ref

The tool of choice for many in the math/science industry is the equation editor, which allows you to insert text and symbols into different "layouts" (e.g. an integral, fraction, matrix, etc). See the image at the bottom. It's a lot quicker than Illustrator. An equation editor can usually export the output in different graphics formats, and some can export LaTeX and MathML (Ooooh!). I chose Math Magic primarily because it works on a Mac as well as Windows, but it's similar to other tools I have seen, including the one bundled with Microsoft Office.
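For comparison, here is roughly how the integral above could be typed by hand in LaTeX. This is a sketch based on the image description (limits from T to T sub ref), not actual output from Math Magic or any other equation editor.

```latex
\[
  \int_{T}^{T_{\mathrm{ref}}} C_v(T)\, dT
\]
```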

The one quirk is that I had previously developed methods to insert Unicode symbols via the Character Palette or a custom math symbol keyboard. Another time you might need non-Math Magic character insertion is when you are using an especially exotic character (this happened to me once).

However, when I tried the Character Palette on MathMagic, the result was the square box of death, meaning the character did not "exist." Fortunately...I realized that it was a font issue. As soon as I switched to a dedicated math Unicode font like Unicode Symbols, all was well. But now I wonder about the default font.

The quirky fonts are not a problem if you're exporting an image or working with text, but if it's MathML it could be problematic (but maybe I'm being paranoid). In any case, I sense a future MathML test coming.

Typical Equation Editor Interface

Math Magic Interface. Tool bar includes templates with squares where text can be inserted

Postscript: The MathML Test

The good news was that I was able to export a MathML file and get the result to work in another HTML page. I should note that the <?xml...?> declaration does not specify UTF-8 encoding. In theory, this shouldn't be a problem, since UTF-8 is the XML default, but I might add the encoding="UTF-8" part to make sure nothing weird is happening. The file also includes a custom <annotation encoding="MathMagic"> tag, which is filled with vendor-generated code. I'm not sure what this does, but I will probably leave it in...just in case.
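To check the "in theory" part, here is a small Python sketch that parses a MathML document whose XML declaration has no encoding, then lists the symbols and any annotation tags. The sample document is made up for illustration (it is not real MathMagic output), and the name `read_symbols` is hypothetical; the parser is the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical export: no encoding in the XML declaration,
# but the bytes themselves are UTF-8.
doc = (
    '<?xml version="1.0"?>'
    f'<math xmlns="{MATHML_NS}">'
    '<semantics>'
    '<mrow><mo>\u222b</mo><mi>\u03b1</mi></mrow>'
    '<annotation encoding="MathMagic">vendor data</annotation>'
    '</semantics>'
    '</math>'
).encode("utf-8")

def read_symbols(xml_bytes):
    """Return the text of each child of <mrow> and the encoding
    attribute of each <annotation> element."""
    root = ET.fromstring(xml_bytes)
    ns = {"m": MATHML_NS}
    symbols = [el.text for el in root.iterfind(".//m:mrow/*", ns)]
    annotations = [el.get("encoding") for el in root.iterfind(".//m:annotation", ns)]
    return symbols, annotations

symbols, annotations = read_symbols(doc)
print(symbols, annotations)
```

If the integral sign and the Greek letter come back intact, the parser treated the undeclared bytes as UTF-8. The annotation shows up as ordinary data alongside the math, so leaving it in should not affect how the symbols are read.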


About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage for a profile.


The standard commenting utility has been disabled due to hungry spam. If you have a comment, please feel free to drop me a line.
