Where Have All the Escape Codes Gone?


I'm currently preparing a seminar on Unicode, and I was struck by how far Unicode implementation, especially for raw Unicode text, has come in the past four years. Some of the warnings I used to present in 2000 or even in 2004 seem almost quaint now.

For instance, when Mac OS X first came out, older applications were not set up to take advantage of the Mac Unicode utilities, such as the U.S. Extended keyboards. I used to have to specify which applications could work with Unicode and which couldn't. But yesterday I realized that I couldn't find any old applications on my machine that didn't work correctly. What a difference that makes.

The same is true on the Windows side. If you get the latest version of most applications, chances are that Unicode support is there, even in raw text editors.

Similarly, I recall when many HTML editors converted any non-English character to a numeric HTML entity, but now most applications are set to work with real UTF-8 text embedded in HTML tags. This is much easier to edit and crucial for being able to transfer data between the Web and other XML resources.
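
To make the difference concrete, here is a minimal Python sketch of the two representations. The function names are hypothetical and chosen only for illustration; the sketch does not reproduce what any particular HTML editor did, it only shows the same text as numeric entities and as raw text.

```python
import html


def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric entity.

    This mimics, in spirit, the escaped output older HTML editors produced.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_numeric_entities(escaped: str) -> str:
    """Turn numeric (and named) entities back into real Unicode characters."""
    return html.unescape(escaped)


if __name__ == "__main__":
    raw = "café Да"
    escaped = to_numeric_entities(raw)

    print(escaped)                          # the entity form, hard to read and edit
    print(from_numeric_entities(escaped))   # the original text, recovered
    print(raw.encode("utf-8"))              # the bytes a UTF-8 file would actually store
```

The escaped form is pure ASCII, which is why it was a safe choice when Unicode support was unreliable, but a word in Russian or Greek becomes an unreadable run of numbers. The raw UTF-8 form stays legible in any Unicode-aware editor.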

Russian, Chinese and Greek data are being treated as just "text" and not as a special case that programmers need to agonize over. There are still plenty of issues to be worked out, but it's good to appreciate progress when it's made.

About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage (ejp10@psu.edu) for a profile.


The standard commenting utility has been disabled due to hungry spam. If you have a comment, please feel free to drop me a line at (ejp10@psu.edu).
