Explaining and Inventing Your Own Unicode Jargon - Part 2

Two entries ago, I extrapolated what would happen to encoding jargon in the Star Trek universe, mostly an exercise to explain how internationalization (i18n) is structured. In this installment, I hope to demonstrate how things only get more complicated when local encodings meet each other.

Starting "Local" Standards

In the new frontier of "interplanetarization (i19n)", we'll already be starting with a buffet of alphanumeric terms, namely the encoding standard(s) of each planetary system. I'll repeat some below. Notice that the Orions still have two competing standards.

  • TUTF-32 - Terran Unicode (32 bit)
  • TLHLSCII - tlhIngan Hol (Klingon) Language Institute Standard Code for Information Interchange
  • RIS-105 - Romulan Imperial Standard #105
  • VSAUS-210A - Vulcan Science Academy Unified Standard #210A
  • ACS34 - Andorian Communication Standard #34
  • TelSCII - Tellarite Standard Code for Information Interchange
  • OTLC-10 - Orion Technology Limited Code #10
  • SuperSix - As agreed upon by six major Orion Trading Houses

Before They Create Fedcode

I would assume that the Federation will eventually develop a really large unified standard similar to Unicode. I will call this Fedcode. However, the development of Fedcode will take a while and may even present new challenges in how many bytes are needed for each character.
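To make the byte question concrete, here is a rough Python sketch. It assumes, purely for illustration, that Fedcode is a fixed-width encoding and that each planetary script collection needs about as many code points as today's Unicode code space; neither assumption comes from any real standard, and the function name is made up.

```python
# Assumption: every script collection needs a code space the size of
# today's Unicode (0x110000 code points). The real figures are unknown.
UNICODE_CODE_SPACE = 0x110000


def bytes_per_character(total_code_points: int) -> int:
    """Smallest whole number of bytes that can hold every code point
    in a fixed-width encoding with the given number of code points."""
    if total_code_points < 1:
        raise ValueError("need at least one code point")
    highest = total_code_points - 1          # largest value to be stored
    bits = max(highest.bit_length(), 1)      # at least one bit, even for 0
    return (bits + 7) // 8                   # round up to whole bytes


# How the width grows as more collections are folded in.
for collections in (1, 8, 16):
    total = collections * UNICODE_CODE_SPACE
    print(collections, "collection(s):", bytes_per_character(total), "bytes")
```

Under these assumptions, three bytes stretch surprisingly far, but a sixteenth collection pushes the width to four bytes.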

In the meantime, the local computing systems will need a way to exchange information quickly, so I extrapolate that a lot of ad hoc encodings will appear first, such as the following.

What the Terrans Might Invent

Similar to the Vulcans, I think the Terrans will try to incorporate the new scripts into Unicode. At version 9.2, Unicode had 17 planes, which was enough to accommodate the new Terran scripts, but finding new historical scripts will really add to the complexity.

Unicode 10 might have to add another layer (a "dimension"?). In this scheme, Dimension 0 will be the Unicode we now have, and then we would add further dimensions for the other standards:

  • Unicode 10, Dimension 0 (= today's Unicode)
  • Unicode 10, Dimension 1 (= VSAUS-210A)
  • Unicode 10, Dimension 2 (= TLHLSCII)
  • Unicode 10, Dimension 3 (= OTLC-10, not SuperSix)
  • ...
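The dimension idea can be sketched in a few lines of Python. Everything here is hypothetical: the function names are invented, and the sketch assumes each imported standard fits into a code space the same size as today's Unicode (17 planes of 65,536 code points), which nothing above guarantees.

```python
# Hypothetical layout: each "dimension" is a full copy of today's
# Unicode code space, stacked one after another.
PLANE_SIZE = 0x10000          # 65,536 code points per plane
PLANES_PER_DIMENSION = 17     # planes 0 through 16 in today's Unicode
DIMENSION_SIZE = PLANE_SIZE * PLANES_PER_DIMENSION  # 0x110000

# Dimension numbers taken from the list above.
DIMENSIONS = {
    0: "today's Unicode",
    1: "VSAUS-210A",
    2: "TLHLSCII",
    3: "OTLC-10",
}


def to_extended(dimension: int, code_point: int) -> int:
    """Combine a dimension number and a local code point into one integer."""
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    if not 0 <= code_point < DIMENSION_SIZE:
        raise ValueError(f"code point out of range: {code_point:#x}")
    return dimension * DIMENSION_SIZE + code_point


def from_extended(value: int) -> tuple[int, int]:
    """Split an extended value back into (dimension, local code point)."""
    dimension, code_point = divmod(value, DIMENSION_SIZE)
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    return dimension, code_point


# Dimension 0 leaves existing Unicode values untouched.
print(hex(to_extended(0, ord("A"))))
# The same local value in the Klingon dimension lands elsewhere.
print(hex(to_extended(2, 0x41)))
```

The appeal of this layout is that Dimension 0 text needs no conversion at all; the cost is that every value outside Dimension 0 no longer fits in the range today's Unicode software expects.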

What the Vulcans Might Invent

  • VSAUS-210A-1 (All Vulcan scripts)
  • VSAUS-210A-2 (Basic Vulcan plus Andorian scripts, based on ACS34)
  • VSAUS-210A-3 (Basic Vulcan plus Tellarite scripts, based on TelSCII)
  • VSAUS-210A-4 (Basic Vulcan plus Klingon scripts, based on TLHLSCII)
  • VSAUS-210A-5 (Basic Vulcan plus Orion scripts, based on SuperSix, not OTLC-10)
  • VSAUS-210A-6 (Basic Vulcan plus Terran scripts, based on Unicode 9.2)

Again, the suffixes 1 through 6 refer to blocks/planes/dimensions within VSAUS-210A; the Vulcan encoding simply lets users specify a location in the scheme to facilitate their processing.

What the Orions Might Invent

Let's skip the Klingons and the Andorians and jump to the worst-case scenario: the Orions, whose two encodings are developed by competing corporate technology interests. Each vendor/trading house will expand its encoding, but in different directions.

Thus we will have:

  • OTLC-10 (Orion/all Orion measurements) - 16 bit for rapid processing
  • OTLC-11 (Vulcan)
  • OTLC-12 (Terran Unicode Plane 0)

As well as

  • SuperSix (Orion) - 64 bit for "exact recording"
  • SuperSixV - Orion plus Vulcan
  • SuperSixT - Orion plus Unicode Plane 0
  • SuperSixPlus - Combines all scripts

By Fedcode

As you can see, by the time the Federation i19n experts meet for the first time to standardize Fedcode, there will be not only local planetary standards to work with but also competing "combined" standards such as Unicode 10.5, SuperSixPlus and VSAUS-210A.

Which will become the basis of Fedcode? How will they plan for expansion for new scripts encountered?

And most of all - how will future computers handle the transformation between Fedcode and KDS (Cardassian Processing Standard)?
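One common way to tame this kind of conversion problem is to route everything through a single shared pivot encoding instead of writing a converter for every pair of standards. The Python sketch below shows why that matters and how it might look. The mapping tables, code values and function names are all invented for illustration; none of them are taken from any real (or fictional) specification.

```python
# The eight local standards listed at the start of this entry.
STANDARDS = [
    "TUTF-32", "TLHLSCII", "RIS-105", "VSAUS-210A",
    "ACS34", "TelSCII", "OTLC-10", "SuperSix",
]


def direct_converters(n: int) -> int:
    """One-way converters needed if every standard talks to every other."""
    return n * (n - 1)


def pivot_converters(n: int) -> int:
    """One-way converters needed if each standard only talks to a pivot."""
    return 2 * n


# Invented toy tables: local code -> pivot code for two standards.
TO_PIVOT = {
    "TelSCII": {0x01: 1000, 0x02: 1001},
    "ACS34": {0x10: 1000, 0x11: 1001},
}


def transcode(codes: list[int], source: str, target: str) -> list[int]:
    """Convert local codes from source to target by way of the pivot."""
    # Invert the target table so pivot codes map back to local codes.
    from_pivot = {pivot: local for local, pivot in TO_PIVOT[target].items()}
    return [from_pivot[TO_PIVOT[source][code]] for code in codes]


n = len(STANDARDS)
print("direct:", direct_converters(n), "via pivot:", pivot_converters(n))
print(transcode([0x01, 0x02], "TelSCII", "ACS34"))
```

With only the eight local standards, the pairwise approach already needs far more converters than the pivot approach, and the gap widens with every "combined" standard added to the mix.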

About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage (ejp10@psu.edu) for a profile.

