Some Recent Language Tagging News (incl. Pinyin/Wade-Giles)


Codes for language varieties are constantly being updated, but here is a list of some important changes that have happened in recent months.

The most up-to-date list is available at:

Chinese Romanizations

  • zh-Latn-pinyin for Pinyin Latin romanization (Mandarin)
  • zh-Latn-wadegile for Wade-Giles romanization (Mandarin)

Note that the assumption here is that zh means Mandarin Chinese. From the discussion, it appears that more precise codes for Mandarin could not be used because they had not been fully approved (sigh). If you are working with a "dialect", you may need to include an appropriate dialect/language extension.
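To make the structure of a code like zh-Latn-pinyin concrete, here is a minimal Python sketch that labels each piece by its position and shape: a language code first, then an optional four-letter script (Latn for Latin), an optional region, and any variants. The function name split_tag is hypothetical, and the shape rules are a simplification; it does not handle dialect/language extensions, private-use subtags, or check codes against the official registry.

```python
import re

def split_tag(tag):
    """Label the subtags of a simple language tag by position and shape.

    Simplified sketch: covers language, script, region, and variants only.
    """
    subtags = tag.split("-")
    parsed = {"language": subtags[0].lower(), "script": None,
              "region": None, "variants": []}
    for sub in subtags[1:]:
        no_variants_yet = not parsed["variants"]
        if (re.fullmatch(r"[A-Za-z]{4}", sub) and parsed["script"] is None
                and parsed["region"] is None and no_variants_yet):
            # Four letters: a script such as Latn (conventionally title case)
            parsed["script"] = sub.title()
        elif (re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", sub)
                and parsed["region"] is None and no_variants_yet):
            # Two letters or three digits: a region (conventionally upper case)
            parsed["region"] = sub.upper()
        elif re.fullmatch(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", sub):
            # Five to eight characters, or four starting with a digit: a variant
            parsed["variants"].append(sub.lower())
        else:
            raise ValueError(f"unrecognized subtag {sub!r} in {tag!r}")
    return parsed

print(split_tag("zh-Latn-pinyin"))
# {'language': 'zh', 'script': 'Latn', 'region': None, 'variants': ['pinyin']}
```

Because the codes are not case-sensitive, the sketch normalizes case before returning, so ZH-latn-PINYIN gives the same result.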

Cornish Spelling

It's hard to believe that a language just being revived already has multiple competing spelling systems, but that's how it goes sometimes. The codes are:

  • kw for Cornish
  • kw-kkcor for Cornish, Common Cornish orthography
  • kw-uccor for Cornish, Unified Cornish orthography
  • kw-ucrcor for Cornish, Unified Cornish Revised orthography
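Since every orthography code above begins with kw, a program that does not recognize a particular orthography can still fall back to plain Cornish. The sketch below illustrates that idea by dropping subtags from the end until it finds a code it knows. The names CORNISH_LABELS and describe are hypothetical, and the truncation loop is a simplified assumption about how a fallback might work, not a full matching algorithm.

```python
# Labels copied from the list above; the names here are hypothetical.
CORNISH_LABELS = {
    "kw": "Cornish",
    "kw-kkcor": "Cornish, Common Cornish orthography",
    "kw-uccor": "Cornish, Unified Cornish orthography",
    "kw-ucrcor": "Cornish, Unified Cornish Revised orthography",
}

def describe(tag, labels=CORNISH_LABELS):
    """Return the label for the longest known prefix of tag, or None."""
    parts = tag.lower().split("-")
    while parts:
        candidate = "-".join(parts)
        if candidate in labels:
            return labels[candidate]
        parts.pop()  # drop the last subtag and try the shorter code
    return None

print(describe("kw-ucrcor"))  # Cornish, Unified Cornish Revised orthography
print(describe("kw-GB"))      # Cornish
```

The second call shows the fallback: kw-GB is not in the table, so the lookup shortens it to kw.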


Valencian

Valencian (Spain) is considered to be a regional dialect of Catalan, with the code ca-valencia.

Belarusian, 1959 spelling

The code be-1959acad is for the "Academic" ("governmental") variant of Belarusian as codified in 1959.

About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage for a profile.


