October 2010 Archives

ConScript Unicode Registry (CSUR): Got Klingon or Tolkien?


Contrary to rumors posted on Unicode discussion boards, Klingon is NOT part of the Unicode standard. It is, however, part of the ConScript Unicode Registry maintained by John Cowan and Michael Everson.

The CSUR is a place where "artificial scripts" (including those created for fictional languages) can be quasi-encoded into the Private Use Area (PUA) blocks of Unicode. The PUA blocks were created for the exact purpose of allowing different user communities to agree on a standard for exchanging data among themselves using characters not currently in Unicode. The registry allows a community to assign code points in the PUA to a script and then publicize the assignment. The nice thing is that you could build something like a Tolkien or Star Trek font containing all the scripts of that fictional universe.

Scripts in the registry include Klingon and Ferengi from Star Trek, Tengwar and Cirth from Tolkien's Lord of the Rings trilogy, scripts from the Ultima games and even Seussian Latin Extensions...plus a whole lot more scripts that, quite frankly, I have never heard of.

Anyone using the Private Use Area should note that these assignments are not officially recognized by Unicode and that many groups use the PUA for their own purposes, so the same code point can mean different things in different fonts. For instance, Microsoft fonts include special math symbols in the PUA (some of which also have Unicode code points outside the PUA). Academic consortia also use the PUA to include extra characters (e.g. extra phonetic symbols, extra medieval abbreviations and so forth) not in Unicode.

Still, it's a way to get some sort of standard established.
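
Because PUA characters only make sense with the right font and the right community agreement, it can be handy to know when a piece of text contains them. Here is a minimal Python sketch that flags them. The function names and the sample string are hypothetical, made up for illustration; the three ranges are the Private Use ranges defined by Unicode. The sample uses U+F8D0 and U+F8D1, which, as far as I know, fall in the range the CSUR lists for Klingon.

```python
# The three Private Use ranges defined by Unicode.
PUA_RANGES = [
    (0xE000, 0xF8FF),      # Private Use Area in the Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
]


def is_private_use(ch):
    """Return True if the single character ch lies in a Private Use range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def find_private_use(text):
    """List (position, code point label) for every PUA character in text."""
    return [
        (i, "U+%04X" % ord(ch))
        for i, ch in enumerate(text)
        if is_private_use(ch)
    ]


# Hypothetical sample: Latin letters followed by two PUA code points.
sample = "tlhIngan \uf8d0\uf8d1"
print(find_private_use(sample))
```

Note that this only tells you a character is private use. It cannot tell you whose assignment (CSUR, Microsoft, an academic consortium) the author had in mind.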

Unicode 6.0 Released


The revised Unicode standard version 6.0.0 has been officially released by the Unicode Consortium. In addition to changes in the specification, new characters have been added, including a block of emoji (emoticon) symbols, the new rupee sign of India and new blocks for Mandaic (Iran), Batak (Indonesia) and Brahmi (ancestral form of most scripts of India). Characters have also been added for alchemical symbols and playing cards, and there have been additions for CJK ideographs, Ethiopic, Tifinagh and Bamum (Cameroon) scripts.

Hopefully the mainstream fonts and operating systems will catch up with some of these additions soon.
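
One rough way to see whether a piece of software has caught up is to ask its Unicode character database about a few of the new code points. The sketch below does this with Python's standard unicodedata module. The helper names are hypothetical, and the three sample characters are ones I am assuming were first assigned in 6.0. This only checks the character database, not whether any installed font can actually display the characters.

```python
import unicodedata

# A few code points assumed to be new in Unicode 6.0, with their names.
NEW_IN_6_0 = {
    0x20B9: "INDIAN RUPEE SIGN",
    0x1F0A1: "PLAYING CARD ACE OF SPADES",
    0x1F700: "ALCHEMICAL SYMBOL FOR QUINTESSENCE",
}


def knows_character(code_point, expected_name):
    """Return True if the local Unicode database names this code point as expected."""
    # The second argument is the default returned for unassigned code points.
    return unicodedata.name(chr(code_point), None) == expected_name


def report():
    """Map each sample code point label to whether the database knows it."""
    return {
        "U+%04X" % cp: knows_character(cp, name)
        for cp, name in NEW_IN_6_0.items()
    }


print("Unicode database version:", unicodedata.unidata_version)
print(report())
```

A False in the report would suggest the database predates Unicode 6.0.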


Correction: No Hangul in Indonesia


In August 2009, it was reported that Cia Cia, a minority language of Indonesia, was planning to adopt Korean Hangul as its writing system (Korean Script Heads to Indonesia). Unfortunately, it turns out that nothing official had happened yet, although specialists in Korea had been talking with the Cia Cia community. It also appears likely that the Indonesian government would require any indigenous language to adopt the Roman alphabet (also used for modern Bahasa Indonesia, the official language of Indonesia).

Too bad - it was an interesting concept. There are other historic scripts in Indonesia besides the Roman alphabet, but Hangul may not be joining them in the near future.


About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage (ejp10@psu.edu) for a profile.


The standard commenting utility has been disabled due to hungry spam. If you have a comment, please feel free to drop me a line at (ejp10@psu.edu).
