The Twitter Unicode Test: A


Just for kicks, I decided to run Twitter through some Unicode tests, and I give it an A. For the record, I pretty much knew from Twittervision that it supported a lot of encodings, but I threw in a few more exotic tests, just to see.

The first was my standard phonetic character test, run from a Mac. As far as I'm concerned, a service has to pass this to be a serious global Unicode contender. I also threw in a long vowel (ā) and the one Hebrew word I can type, שבלת ("shibboleth"), to confirm right-to-left support.
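For anyone who wants to build a similar test message, here is a minimal Python sketch of how such a string could be inspected before pasting it into a service. The schwa (ə) stands in for the phonetic characters, which is my assumption since the exact test set is not listed here, and the `describe` helper is a hypothetical name. It reports each character's code point, Unicode name, and bidirectional class, where `R` marks a right-to-left character.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name, bidi class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), unicodedata.bidirectional(ch))
        for ch in text
    ]

# Assumed sample: a schwa for the phonetic test, the long vowel ā,
# and the Hebrew word from the post.
phonetic = "\u0259"                  # ə
long_vowel = "\u0101"                # ā
hebrew = "\u05E9\u05D1\u05DC\u05EA"  # שבלת

sample = phonetic + long_vowel + hebrew

# A string that survives a UTF-8 round trip unchanged is safe to send as raw Unicode.
round_trip_ok = sample.encode("utf-8").decode("utf-8") == sample

for row in describe(sample):
    print(row)
```

If the Hebrew letters come back with class `R` and the round trip holds, the test string itself is sound, so any garbling that shows up afterwards is the service's doing.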

What impressed me, oddly, was the support for entity codes like &eacute; (é) and &#x0909; (उ). Twitter can accept raw é, or it can take &eacute; and convert it to é. This differs from other modern tools like Facebook or XML, which can only accept raw Unicode input (entity codes break).

Accepting either format is probably a pain to program, but very nice for the user. Having to remember when to enter entity codes and when to enter raw Unicode is confusing, but still an all-too-common reality. I appreciate Twitter for making the transition a little easier, even if it's only for 140 characters.
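To make the "accept either format" idea concrete, here is a minimal Python sketch. It is not Twitter's actual implementation, which I have not seen; it only assumes that a service normalizes input by decoding HTML character references and leaving raw Unicode alone. The `normalize_entities` name is hypothetical, while `html.unescape` is part of the Python standard library.

```python
import html

def normalize_entities(text: str) -> str:
    """Decode HTML character references; raw Unicode passes through unchanged."""
    return html.unescape(text)

raw = "caf\u00e9 \u0909"          # raw Unicode: café उ
encoded = "caf&eacute; &#x0909;"  # the same text typed as entity codes

# Both spellings normalize to the same string.
from_raw = normalize_entities(raw)
from_encoded = normalize_entities(encoded)

print(from_raw == from_encoded)
```

One design caveat with this approach: a user who really wants to display the literal text of an entity code would have to escape the ampersand, which may be part of why some tools refuse to convert at all.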

Screen capture of Twitter messages with Unicode Characters in test messages

About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage for a profile.
