ELIZABETH J PYATT: October 2009 Archives

Ancient Egyptian & Other Additions in Unicode 5.2


The latest Unicode Standard, Version 5.2, was released at the beginning of October 2009. A lot is added in each version, but I confess that the most noteworthy addition for me was the Egyptian Hieroglyphs block (U+13000 to U+1342E). It was certainly the largest block added, at 1071 code points.
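As a quick sanity check on that figure, the arithmetic can be sketched in a few lines of Python. This is only an illustration using the two endpoints quoted above; the variable names are my own.

```python
# Endpoints of the Egyptian Hieroglyphs range quoted in the post.
first, last = 0x13000, 0x1342E

# Both endpoints are included in the range, hence the + 1.
count = last - first + 1
print(count)  # 1071

# The whole range sits above U+FFFF, i.e. outside the BMP (Plane 0).
print(first > 0xFFFF)  # True
```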

Other additions included blocks for Avestan, Old South Arabian, Samaritan, Imperial Aramaic, Inscriptional Parthian and Old Turkic. In addition, supporting characters were added for the Coptic, Devanagari (esp. Vedic support), Hangul (Old Korean), Phoenician and other ancient script blocks.

In South and Southeast Asia, support was added for Javanese, Tai Tham, Lisu, Kaithi, Meetei Mayek, Myanmar (new points), New Tai Lue (new points) and others. In other regions, a new Canadian Aboriginal Syllabics Extended block was created with 80 additional code points. Some African scripts were also encoded, including the Bamum script and Rumi numerals. Additions were also made to various math and symbol blocks.

For a complete list of changes, see the DerivedAge.txt file (scroll to the end) and the revised Unicode 5.2 charts. In terms of support, there may be freeware (or commercial) fonts available, but time will be needed to develop the input utilities and then for these glyphs to be incorporated into major operating systems.
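For anyone who would rather filter DerivedAge.txt than scroll through it, here is a minimal Python sketch. It assumes the usual Unicode data file layout of a code point or range, a semicolon, a version number, and an optional comment after `#`. The function name `parse_age_line` is my own, and the sample line is an illustrative stand-in built from the range mentioned above, not a line copied from the real file.

```python
def parse_age_line(line):
    """Return (first, last, version) for a data line, or None for comment/blank lines."""
    # Drop anything after '#', which is treated as a comment.
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    code_range, version = [part.strip() for part in data.split(";")]
    # A field is either a single code point ("0041") or a range ("13000..1342E").
    if ".." in code_range:
        first, last = code_range.split("..")
    else:
        first = last = code_range
    return int(first, 16), int(last, 16), version


# Illustrative line only (assumed layout), using the hieroglyph range from this post.
sample = "13000..1342E  ; 5.2 # illustrative line for the hieroglyph range"
first, last, version = parse_age_line(sample)
print(version, last - first + 1)
```

Running the same function over every line of the real file and keeping only the entries whose version is "5.2" should give the list of ranges added in this release.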

Until then...there's always Unicode 6.0.


Emoji at Unicode 33


Defining Emoji

There were lots of interesting sessions at last week's Unicode conference, but the one that I think non-experts can relate to the most was the one about emoji, those tiny icons popular in Japanese e-mail messages.

A rough translation of emoji might be emoticon, but the range of images goes way beyond smiley faces to include weather symbols, hearts, beer steins, sports icons, high heels, fast food, astrological signs, warnings, hand gestures and bikinis.

Why Unicode?

It's good to catalog and standardize any symbol set, but in this case economic necessity is driving the campaign. Specifically, Google and Apple (with its iPhone) want to expand further into the Japanese market.

According to our presenters, the three major Japanese cell phone carriers all support emoji, and these images are popular with most adults (even the ones over 30). It's an important enough feature that iPhone (and iChat), Gmail and even Twitter support emoji.

But really it would be good to support one encoded set of emoji, not a hack of three emoji encodings from the Japanese cell phone carriers...hence the need for a unified encoding which combines those items already encoded (e.g. zodiac symbols) with symbols not currently in Unicode.

Remaining Issues

Because no Unicode script block is free of quirks, I document the issues overheard at the conference and on the Web. Namely:

  1. Color - Real emoji have colors (really bright ones), but the spec is in black and white. This makes sense because the rest of Unicode is also in black and white. Plus you will have more options to add the colors you want!

  2. 5-Digit Code Points - Or more technically, the new glyphs will be assigned numbers above U+FFFF (i.e. outside the BMP, or Plane 0). Not surprisingly, many mobile devices are limited to U+FFFF and below. The committee's comment was that they expected that mobile developers would learn to overcome this restriction...because they really are running out of room in the U+0000-FFFF range. That may be good news for anyone wanting to transmit the ancient scripts over cell phones. You never know when you need to access a Mycenaean Greek text away from the office or when the next Linear B revival may happen.

  3. There's a Jailbreak App for that - When researching this article I encountered articles about tricks for enabling emoji on non-Japanese iPhones, not all of which were legit. For a while, Apple was discouraging use of emoji outside of Japan so it was hiding the emoji. Fortunately, there is a legal way to enable emoji now (both a trick and an app).
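
To make the "5-digit" issue in item 2 a little more concrete, here is a small Python sketch of why code points above U+FFFF are harder for software built around 16-bit units. It assumes the device stores text as UTF-16, which is my assumption for illustration and not something stated at the conference; `utf16_units` is a hypothetical helper name.

```python
def utf16_units(ch):
    """Return the 16-bit code units that UTF-16 uses to store the string ch."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


bmp_char = "\u2648"         # ARIES, a zodiac symbol already encoded inside the BMP
astral_char = "\U00010000"  # first code point beyond the BMP (start of the Linear B Syllabary block)

print([hex(u) for u in utf16_units(bmp_char)])     # ['0x2648']
print([hex(u) for u in utf16_units(astral_char)])  # ['0xd800', '0xdc00']
```

A character inside the BMP fits in one 16-bit unit, while anything above U+FFFF needs a surrogate pair of two units. Software that assumes one unit per character will split or drop the second kind, whether it is an emoji or a Linear B syllable.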

So there you have it - thanks to the great folks at Google and Apple, we will all be able to standardize the addition of cute icons in our online communication...or at least we will have a documented explanation, for future generations, of what they were. Trust me, in about 500 years, we will need it.


Unicode 33 Presentation Files


About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage (ejp10@psu.edu) for a profile.


The standard commenting utility has been disabled due to hungry spam. If you have a comment, please feel free to drop me a line at (ejp10@psu.edu).
