Encoding Theory: April 2009 Archives

Some Recommended Books


Although the vast majority of my Unicode knowledge has come courtesy of the Internet, there are some print resources that I am beginning to find very useful, so I thought I would add some quick notes. I would add that the audience for these Unicode books is generally programmers who need to implement Unicode support. These books really don't tell you how to type an accented letter.

Unicode Demystified

If you're a programmer who's been handed a foreign language project and really aren't sure where to go next, I think this is a good place to start. This book by Richard Gillam dates from 2002, but is still a valuable resource because it explains the basic concepts behind Unicode in fairly straightforward language.

This book covers the major world scripts including Latin, Cyrillic, Greek, Arabic, Hebrew, East Asian scripts, major South Asian scripts, Cherokee, Canadian Aboriginal and so forth. Between them, these scripts raise most of the major typographical and sorting issues you are likely to encounter, so the book remains very handy for the newcomer. However, if your script is a little more exotic (or newer to Unicode), you will probably need to find alternate resources.


Unicode Demystified: A Practical Programmer's Guide to the Encoding Standard (Paperback)

Author: Richard Gillam
Year: 2002
ISBN (10/13): 0201700522 / 978-0201700527

Unicode Explained

This is from 2006, so it's more recent, and it's by Jukka Korpela, who is good at explaining the concepts behind encoding (as well as accessibility). Unfortunately, I haven't had a chance to acquire it yet, but I am looking forward to taking a look at it.


Unicode Explained (Paperback)

Author: Jukka Korpela
Year: 2006
ISBN (10/13): 059610121X / 978-0596101213

The Unicode Standard

For each major version of the Unicode Standard (e.g. 4.0, 5.0), the Unicode Consortium releases a hard-bound reference of the actual standard. If you're semi-serious about Unicode programming, I would recommend picking up at least one version of the standard and then updating over time. It does gather everything in one place... at least for the moment.

The first part explains the standard, including issues of direction (LTR/RTL), casing, ligatures, different flavors and so forth. There is also an explanation for each script. The last section prints the character list block by block, including the East Asian CJK characters, which are normally referenced only through an online database.
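As a rough illustration of the kind of properties that first part discusses, here is a minimal sketch using Python's standard unicodedata module. The helper name describe and the sample characters are chosen only for illustration and are not taken from the book.

```python
import unicodedata


def describe(ch):
    """Return a few Unicode properties for a single character (hypothetical helper)."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "bidi": unicodedata.bidirectional(ch),   # directional class, e.g. L, R, AL
        "category": unicodedata.category(ch),    # general category, e.g. Lu, Lo
    }


# Direction: Latin is left-to-right, Hebrew and Arabic are right-to-left.
latin_a = describe("A")
hebrew_alef = describe("\u05d0")
arabic_beh = describe("\u0628")

# Casing: uppercasing is not always one character to one character.
sharp_s_upper = "\u00df".upper()

# Ligatures: the "fi" ligature has a compatibility mapping to two letters.
fi_expanded = unicodedata.normalize("NFKC", "\ufb01")

for label, info in [("A", latin_a), ("alef", hebrew_alef), ("beh", arabic_beh)]:
    print(label, info)
print(repr(sharp_s_upper), repr(fi_expanded))
```

The point is only that direction, case and ligature behavior are recorded as character properties, which is why the standard spends so much text on them.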

I think the reference aspect is the most important benefit of this book. Although there are sections for each script, this work tends to assume that you are fairly familiar with whatever script you are working with, and so devotes most of the text to technical explanations. Fortunately, I think the technical explanations and examples are the core ones that a programmer would need.

Although most of the content is replicated in PDF on the Web site, it can be handy to have the actual book as a baseline reference. For one thing, the charts are printed at high quality, allowing you to see minute typographic details. For another thing, you never know when you will need to work on a project without the Internet...


Unicode Standard, Version 5.0, The (5th Edition) (Hardcover)

Author: Unicode Consortium
Year: 2006
ISBN (10/13): 0321480910 / 978-0321480910


About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage (ejp10@psu.edu) for a profile.


The standard commenting utility has been disabled due to hungry spam. If you have a comment, please feel free to drop me a line at ejp10@psu.edu.
