Recently in Central European Category

Quivira Unicode Font


I just discovered a large new TrueType Unicode font called Quivira from a German developer. It is based somewhat on Garamond and covers a lot of useful character sets, including Latin, Phonetics, Math, Greek, Coptic, Cyrillic, Cherokee, Currency, Box/Geometrics/Arrows, Old Italic, Gothic, Braille, Armenian, Hebrew and so forth.

The site is in German, but there's enough information for a user to get by using "Internet German", and as the author says "Quivira ist Freeware."

List of Characters (PDF):


Language Tag "mo" for Moldovan Deprecated


As of November 3, 2008, both the ISO 639-1 language code mo (Moldovan) and the ISO 639-2 code mol (Moldovan) were deprecated in favor of Romanian.

In other words, the encoding standards authorities have embodied the notion that Moldovan, as spoken in the Republic of Moldova, is so closely related to Romanian that the two are dialects of a single language. This has been the stance of the linguistic community and of many people in both the Romanian and Moldovan communities.

From now on, the code ro (Romanian) will refer to the language forms used in both Romania and Moldova. The tags to distinguish the linguistic forms of Romania from those of Moldova will be ro-RO (Romanian of Romania) and ro-MD (Romanian of Moldova).
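For anyone who has old content tagged with the deprecated codes, here is a minimal Python sketch of how the replacement might be applied. The mapping table and the function name `canonical_tag` are my own illustration, not part of any standard library, and the table covers only the two codes discussed in this post.

```python
# Hypothetical lookup table: only the deprecated Moldovan codes from this post.
DEPRECATED_TAGS = {"mo": "ro", "mol": "ro"}


def canonical_tag(tag: str) -> str:
    """Replace a deprecated Moldovan language subtag with ro.

    Any region subtag (for example MD) is kept, so mo-MD becomes ro-MD.
    """
    parts = tag.replace("_", "-").split("-")
    primary = parts[0].lower()
    parts[0] = DEPRECATED_TAGS.get(primary, primary)
    # Two-letter region subtags are conventionally written in uppercase.
    if len(parts) > 1 and len(parts[1]) == 2 and parts[1].isalpha():
        parts[1] = parts[1].upper()
    return "-".join(parts)


if __name__ == "__main__":
    for old in ("mo", "mol", "mo-MD", "ro-RO"):
        print(old, "->", canonical_tag(old))
```

Tags that are not in the table pass through unchanged apart from case normalization, so the function should be safe to run over a mixed list of language tags.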

This may seem to be a trivial change, but it's heartening from my point of view. In recent years, there had been a trend in language code assignments to favor political expedience over linguistic reality.

The most similar case was the elimination of the code sh for Serbo-Croatian, as spoken in the former Yugoslavia, in favor of three "separate" language codes for Serbian (sr), Croatian (hr) and Bosnian (bs). Although there are genuine regional differences between the forms (especially for Croatian), linguists still debate whether these forms are separate languages or dialects.

Although I do not expect the three codes for Serbian, Croatian and Bosnian to be eliminated anytime soon, I do think it's a good sign that speakers in Moldova and Romania were willing to re-evaluate their linguistic identity.


Getting the ř of Dvořák


If you've been visiting accent code pages looking for the hachek R (ř) found in the famous Czech composer's name... chances are it's not there. That's because the tables only cover the accented letters found in Western European languages or, in Unicode terms, accented characters with code points 0-255. You can see the Penn State Encoding Tutorial if you want the full details.

If the code point is over 255, you have to switch to a new method of inputting things. As it happens the "exotic" accented letters in Central European languages like Czech, Polish, Hungarian, Croatian, Serbian and Slovenian are all over 255. So this means...
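If you want to check which side of the 255 line a letter falls on, a short Python sketch like the following will do it. The helper name `describe` is my own; `ord` and `unicodedata.name` are from the Python standard library. The name is written with escapes so that the precomposed letters are guaranteed.

```python
import unicodedata

# "Dvořák", with ř (U+0159) and á (U+00E1) written as escapes.
NAME = "Dvo\u0159\u00e1k"


def describe(ch: str) -> str:
    """Report a character's code point and whether it is within 0-255."""
    cp = ord(ch)
    where = "within 0-255" if cp <= 255 else "above 255"
    return f"{ch}  U+{cp:04X} ({cp})  {unicodedata.name(ch)}: {where}"


if __name__ == "__main__":
    for ch in NAME:
        print(describe(ch))
```

The á comes out at 225, inside the Western European range, while the ř comes out at 345, which is why it is missing from the usual accent tables.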


If you're just typing in a few names like Dvořák, then the Windows Character Map will let you insert the characters above 255. The ř character is in the Latin Extended A range. Note: There are numeric ALT codes, but they don't work in all applications.
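To get a feel for what is in that range before hunting through the Character Map, here is a rough Python sketch that lists the letters with a hachek (called "caron" in Unicode character names) in Latin Extended-A, which spans U+0100 to U+017F. The function name `caron_letters` is hypothetical.

```python
import unicodedata

# Latin Extended-A block: U+0100 through U+017F inclusive.
LATIN_EXTENDED_A = range(0x0100, 0x0180)


def caron_letters() -> list[str]:
    """Return the Latin Extended-A characters whose Unicode name mentions CARON."""
    return [
        chr(cp)
        for cp in LATIN_EXTENDED_A
        if "CARON" in unicodedata.name(chr(cp), "")
    ]


if __name__ == "__main__":
    for ch in caron_letters():
        print(f"U+{ord(ch):04X}  {ch}  {unicodedata.name(ch)}")
```

The hex value printed next to each letter is the same code point the Character Map shows when you select that character.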

On the other hand, if you're typing text in Czech or another Central European language, then it's probably better to activate the appropriate language keyboard, which lets you type the accented letters directly.


On the Mac, I personally recommend the Extended Keyboard because you can type a wider range of accented letters. I wish Windows had one of these... (maybe on Vista?)

However, you could also activate the Character Palette or the specific language keyboard depending on your needs.

As you can see, not all Unicode code points are equal (especially in the U.S. market).


About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage for a profile.


The standard commenting utility has been disabled due to hungry spam. If you have a comment, please feel free to drop me a line at (
