Language Codes: November 2008 Archives

Language Tag "mo" for Moldovan Deprecated

As of November 3, 2008, both the ISO 639-1 language code mo (Moldovan) and the ISO 639-2 code mol (Moldovan) were deprecated in favor of Romanian.

In other words, the encoding standards authorities have accepted the notion that Moldovan, as spoken in the Republic of Moldova, is so closely related to Romanian that the two are varieties of a single language. This has been the stance of the linguistic community and of many people in both the Romanian and Moldovan communities.

From now on, the code ro (Romanian) will refer to the language forms used in both Romania and Moldova. The tags to distinguish linguistic forms in Romania from those in Moldova will be ro-RO (Romanian of Romania) and ro-MD (Romanian of Moldova).
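For software that still encounters the old codes, the change amounts to swapping the primary language subtag. Here is a minimal Python sketch of that idea. The mapping table and the function name normalize_tag are hypothetical and cover only the two codes mentioned in this post; a real application would presumably load its replacements from the IANA registry rather than hard-code them.

```python
# Hypothetical lookup table, limited to the codes discussed in this post.
DEPRECATED_LANGUAGE_CODES = {
    "mo": "ro",   # two-letter Moldovan code, now Romanian
    "mol": "ro",  # three-letter Moldovan code, now Romanian
}


def normalize_tag(tag: str) -> str:
    """Replace a deprecated primary language subtag, leaving the rest of the tag alone."""
    subtags = tag.split("-")
    primary = subtags[0].lower()
    subtags[0] = DEPRECATED_LANGUAGE_CODES.get(primary, primary)
    return "-".join(subtags)


print(normalize_tag("mo"))     # ro
print(normalize_tag("mo-MD"))  # ro-MD
print(normalize_tag("ro-RO"))  # ro-RO (already current, unchanged)
```

Note that a bare mo becomes a bare ro, so any "as used in Moldova" meaning has to be carried by an explicit region subtag such as ro-MD.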

This may seem to be a trivial change, but it's heartening from my point of view. In recent years, there had been a trend in language code assignments to favor political expedience over linguistic reality.

The most similar case was the elimination of the code sh for Serbo-Croatian, as spoken in the former Yugoslavia, in favor of three "separate" language codes for Serbian (sr), Croatian (hr) and Bosnian (bs). Although there are genuine regional differences between the forms (especially for Croatian), linguists still debate whether they are separate languages or dialects of one language.

Although I do not expect the three codes for Serbian, Croatian and Bosnian to be eliminated anytime soon, I do think it's a good sign that speakers in Moldova and Romania were willing to re-evaluate their linguistic identity.

Some Recent Language Tagging News (incl. Pinyin/Wade-Giles)

Codes for language varieties are constantly being updated, but here is a list of some important changes that have happened in recent months.

The most up-to-date list is available at:
http://www.iana.org/assignments/language-subtag-registry

Chinese Romanizations

  • zh-Latn-pinyin for Pinyin Latin romanization (Mandarin)
  • zh-Latn-wadegile for Wade-Giles romanization (Mandarin)

Note that the assumption here is that zh means Mandarin Chinese. From the discussion, it appears that more precise codes for Mandarin could not be used because they had not yet been fully approved (sigh). If you are working with a "dialect", you may need to include an appropriate dialect/language extension.
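To see how a tag like zh-Latn-pinyin breaks down, here is a rough Python sketch that splits a tag into language, script, region and variant subtags using the length conventions of BCP 47 (four letters for a script, two letters or three digits for a region, five to eight characters for a variant). The names LanguageTag and parse_tag are made up for this example, and the parser is deliberately simplified: it ignores extended language subtags, extensions, private-use subtags and grandfathered tags, and it does not check subtags against the registry.

```python
from typing import List, NamedTuple, Optional


class LanguageTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]
    variants: List[str]


def parse_tag(tag: str) -> LanguageTag:
    """Split a simple language tag into its parts (simplified, see assumptions above)."""
    parts = tag.split("-")
    language = parts[0].lower()
    script: Optional[str] = None
    region: Optional[str] = None
    variants: List[str] = []

    for part in parts[1:]:
        is_script = len(part) == 4 and part.isalpha()
        is_region = (len(part) == 2 and part.isalpha()) or (
            len(part) == 3 and part.isdigit()
        )
        is_variant = part.isalnum() and (
            5 <= len(part) <= 8 or (len(part) == 4 and part[0].isdigit())
        )

        # Subtags must appear in the order script, region, variants.
        if is_script and script is None and region is None and not variants:
            script = part.title()
        elif is_region and region is None and not variants:
            region = part.upper()
        elif is_variant:
            variants.append(part.lower())
        else:
            raise ValueError(f"unexpected subtag {part!r} in {tag!r}")

    return LanguageTag(language, script, region, variants)


print(parse_tag("zh-Latn-pinyin"))
print(parse_tag("zh-Latn-wadegile"))
```

For zh-Latn-pinyin this yields language zh, script Latn, no region, and the single variant pinyin.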

Cornish Spelling

It's hard to believe that a language just being revived already has multiple competing spelling systems, but that's how it goes sometimes. The codes are:

  • kw for Cornish
  • kw-kkcor for Cornish, Common Cornish orthography
  • kw-uccor for Cornish, Unified Cornish orthography
  • kw-ucrcor for Cornish, Unified Cornish Revised orthography
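One practical consequence of tags like these: an application that has resources only for generic Cornish can still serve a request for a specific orthography by falling back to the shorter tag. Below is a minimal Python sketch of that truncation-style fallback, in the spirit of the lookup scheme in RFC 4647. The function name lookup and the set of available tags are hypothetical, and the available tags are assumed to be stored in lowercase.

```python
from typing import Optional, Set


def lookup(requested: str, available: Set[str]) -> Optional[str]:
    """Return the longest prefix of the requested tag that is available, if any."""
    parts = requested.lower().split("-")
    while parts:
        candidate = "-".join(parts)
        if candidate in available:
            return candidate
        parts.pop()  # drop the last subtag and try again
        # A single-character subtag left at the end is dropped as well.
        if parts and len(parts[-1]) == 1:
            parts.pop()
    return None


# Hypothetical set of tags the application has resources for.
available = {"kw", "kw-kkcor"}

print(lookup("kw-kkcor", available))   # kw-kkcor
print(lookup("kw-ucrcor", available))  # kw
print(lookup("be-1959acad", available))  # None
```

The design choice here is to prefer the most specific match and only then degrade, so a reader asking for one orthography never silently receives a competing one when generic Cornish is the only alternative.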

Valencian

Valencian (Spain) is considered to be a regional dialect of Catalan and is tagged ca-valencia.

Belarusian, 1959 spelling

The code be-1959acad is for the "Academic (governmental) variant of Belarusian as codified in 1959".

About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage (ejp10@psu.edu) for a profile.

Comments

The standard commenting utility has been disabled due to hungry spam. If you have a comment, please feel free to drop me a line at (ejp10@psu.edu).
