Recently in Ancient Scripts Category

Windows 8 & Windows 8.1 Ancient Script and Asian Fonts


Scholars working with ancient scripts such as Glagolitic, Gothic and Old Hangul may be interested in the new fonts packaged with Windows 8, in particular the updated Segoe UI Symbol font.

Or you could wait for Windows 8.1, when support for Coptic and various scripts of South and Southeast Asia will be added.


Unicode 6.1 Additions


The Unicode standard was just updated to version 6.1, and that means new blocks and characters.

New Blocks

Blocks added include Miao (a script developed for Hmong/Miao languages), Meroitic Hieroglyphs & Meroitic Cursive (adaptations of Egyptian hieroglyphs from ancient Meroë in what is now northern Sudan) and multiple scripts from India (Sora Sompeng, Chakma, Sharada, Takri).

Two new blocks for the Arabic script were also added - Arabic Mathematical Alphabetic Symbols and Arabic Extended-A. Extensions for the Sundanese and Meetei Mayek scripts were also added.

New Characters

The Unicode Consortium has an index of which new characters have been added to different scripts.


Hardcore Medieval Fonts + Bonus Glyph DuJour


The title says "hardcore", so you may be wondering how I am defining hardcore. Medieval specialists will know that manuscripts from that period contain a bewildering variety of abbreviations and special symbols which varied widely from region to region. And if you had to painstakingly write out documents with quill pens on parchment, you would look for shortcuts too (much as texters with only a numeric keypad do).

One medieval manuscript abbreviation still in use is the ampersand (&) which began life as a quick way to write Latin et 'and'. But there were plenty of others such as the LATIN CAPITAL LETTER K WITH STROKE AND DIAGONAL STROKE (U+A744) as shown in the image at the end of the entry.

A variety of medieval characters have recently been added to the Latin Extended-D block in Unicode (and more could be added according to the Medieval Unicode Font Initiative/MUFI)... but font support is still catching up. Even venerated older medieval Unicode fonts are missing the newer additions to Unicode. If you do want to display these characters, you may want to try these fonts:

Professional typographers and publishers may also want to investigate Andron. In addition to the characters mentioned, these fonts will also include Greek, Cyrillic and phonetic characters.

If you want to find updates about Unicode fonts, I would recommend tracking MUFI, which is an academic consortium focusing on Unicode for medieval texts, including the development of appropriate fonts. Unfortunately, you may not be able to quickly find the original usage of LATIN CAPITAL LETTER K WITH STROKE AND DIAGONAL STROKE. A puzzle for another time...

[Image: glyph sample of K with lines through each leg]


Enter Plane 1 (Phoenician/Linear B...) on Mac Unicode Hex Keyboard


A useful utility on the Mac is the Unicode Hex keyboard, which allows you to hold Option and type any four-digit Unicode hex code to get that character.

For instance, if you need to enter the rarely seen archaic Roman numeral symbol for 5,000 (ↁ), you could look up its Unicode character number (U+2181), activate this keyboard, then type Option+2181 to generate the character (assuming the correct font is loaded).
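If you have Python handy, the lookup step can also be done in reverse, as a quick check that a hex code is the character you think it is. This is only a small sketch using Python's standard unicodedata module. The function name describe is made up for this example, and which characters have names depends on the Unicode version bundled with your Python.

```python
import unicodedata

def describe(code_point_hex):
    """Return (character, official Unicode name) for a hex code such as '2181'.

    'describe' is a hypothetical helper name used only for this illustration.
    """
    ch = chr(int(code_point_hex, 16))
    # The second argument is a fallback for code points this Python has no name for.
    return ch, unicodedata.name(ch, "<no name known to this Python>")

print(describe("2181"))
print(describe("A744"))
```

The second call looks up the K with stroke and diagonal stroke mentioned in the medieval fonts entry.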

But a lot of ancient scripts are in Plane 1, meaning they have Unicode values with five digits (i.e. U+10000 or higher). In the Unicode world, adding the fifth digit means that some processes go slightly awry, and the Unicode Hex keyboard is one of them. Suppose I want to input the Phoenician character Alf (Aleph) (𐤀 or an A on its side), which is U+10900. If I enter Option+10900 on the Unicode Hex keyboard, I will not get Alf, but ႐ (U+1090) instead.

Note: Characters U+0000 to U+FFFF are in Plane 0 or the BMP (Basic Multilingual Plane). A lot of systems are set up to deal with the BMP only and need special support for codes beyond U+FFFF. The four-digit restriction corresponds to 16 bits, which was a constraint in older systems. If you're not a programmer, let's just say it's a long story and leave it at that.
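For the programmers, here is a rough illustration of the plane idea in Python. The plane number is simply the code point divided by 0x10000, and anything outside Plane 0 needs two 16-bit units in UTF-16. The function names are invented for this sketch.

```python
def plane_of(code_point):
    """Return the Unicode plane (0 = BMP, 1 = first supplementary plane, ...)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    return code_point >> 16  # same as integer division by 0x10000

def utf16_code_units(code_point):
    """Count how many 16-bit units UTF-16 needs for this code point."""
    return len(chr(code_point).encode("utf-16-be")) // 2

for cp in (0x2181, 0x10900):
    print(f"U+{cp:04X}: plane {plane_of(cp)}, {utf16_code_units(cp)} UTF-16 unit(s)")
```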

It turns out that the Unicode Hex keyboard has a four-digit limit. To get around it, you can break U+10900 into two 16-bit (i.e. 4-digit) sequences, also known as a UTF-16 surrogate pair. For U+10900, the surrogate pair is D802+DD00. So in the Unicode Hex utility, you can now do this.

  1. Hold down the Option key.
  2. Type D802+DD00, where the + means type the Plus sign.
  3. Release the Option key.

I bet you're asking - how did she get from U+10900 to D802+DD00? There is an algorithm, but in this case I got it by opening the Character Palette, finding the character I wanted and mousing over it. When you do that, the Unicode code point appears along with its surrogate pair in parentheses.
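For the curious, the algorithm is the standard UTF-16 one: subtract 0x10000 from the code point, then add the top 10 bits of the result to D800 and the bottom 10 bits to DC00. Here is a minimal Python sketch of that arithmetic. The function names are my own for this example, not part of any Mac utility.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000 to U+10FFFF have surrogate pairs")
    offset = code_point - 0x10000       # leaves a 20-bit number
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low

def from_surrogate_pair(high, low):
    """Reverse the split: rebuild the code point from a surrogate pair."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

high, low = to_surrogate_pair(0x10900)
print(f"{high:04X} {low:04X}")  # prints "D802 DD00"
```

For Phoenician Alf, 0x10900 minus 0x10000 leaves 0x0900. Its top 10 bits are 2 and its bottom 10 bits are 0x100, which gives D802 and DD00.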

Of course, you could also directly Insert the character with the palette, but actually there are times when the Insert doesn't quite work (at some points in the careers of my laptops, I have corrupted my Character Palette so badly, it refused to play with me anymore).

Although this utility seems a little limited at the moment, if there's one thing I have learned, it is that no Unicode trick has ever gone to waste.


Some New LGC Fonts


I was checking the font repositories and found some new fonts that might be of interest to the linguistics/medieval/math crowd. But before that, I would like to define a new term: LGC = Latin/Greek/Cyrillic, which refers to any font that includes the Latin, Latin Extended-A, Cyrillic and Greek blocks plus a few math symbols. So many fonts include all three scripts that it's a handy acronym for me.

One caveat is that basic LGC fonts don't necessarily include ALL LGC characters. For instance, a font like Verdana may be missing IPA extensions, Cyrillic extensions and Greek extensions. The good news is that more fonts including the special characters are becoming available, and we're getting large freeware fonts to fill typographical needs like small caps and narrow characters.

  • Arev Sans - A sans serif font with excellent LGC coverage including Latin/Greek/Cyrillic extensions, a good inventory of math symbols and other symbols/punctuation.
  • Linux Libertine - A family of OTF fonts with separate fonts for bold, italics, small caps. Good LGC coverage. It's also good to have a small caps font for Greek and Cyrillic, but it seems to be missing some of the phonetic characters.
  • Marin Font - This font is notable for being a little narrower than others, which is a nice change, and has glyphs for the Cherokee block and the Canadian Aboriginal Syllabics. It also includes a separate Small Caps font.
  • Roman Cyrillic Std, BukyVede, KlimentStd from Kodeks German Medieval Slavicists Server - BukyVede in particular includes a lot of historical Cyrillic characters and includes the Glagolitic characters. Kliment and Roman Cyrillic are LGC fonts which include other variations of the Glagolitic block. Latin and Greek are also included.
  • Quivira - I discussed this a few entries ago, but to repeat: Big font. Lots of scripts including LGC, Coptic, Armenian, Hebrew, Georgian, Thai, Baybayin, Runic, Braille, some Indic...
  • Sophia Nubian - a new Coptic and Nubian script font from SIL with Keyman keyboard utility (Windows). A Mac Coptic Unicode Keyboard is also available.
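To make the caveat about basic LGC fonts concrete, here is a rough Python sketch of a coverage check. It assumes you already have the set of code points a font supports (a font inspection tool can give you that, and that step is not shown). The block list, the function names and the sample "font" are all hypothetical choices for this illustration.

```python
import unicodedata

# Unicode block ranges (start, end) for an illustrative selection of LGC blocks.
LGC_BLOCKS = {
    "Basic Latin": (0x0000, 0x007F),
    "Latin Extended-A": (0x0100, 0x017F),
    "IPA Extensions": (0x0250, 0x02AF),
    "Greek and Coptic": (0x0370, 0x03FF),
    "Cyrillic": (0x0400, 0x04FF),
    "Cyrillic Supplement": (0x0500, 0x052F),
    "Greek Extended": (0x1F00, 0x1FFF),
}

def is_printable_assigned(cp):
    """Skip unassigned code points (Cn) and control characters (Cc)."""
    return unicodedata.category(chr(cp)) not in ("Cn", "Cc")

def coverage_report(supported):
    """Map each block name to (characters present, characters wanted)."""
    report = {}
    for block, (start, end) in LGC_BLOCKS.items():
        wanted = [cp for cp in range(start, end + 1) if is_printable_assigned(cp)]
        have = sum(1 for cp in wanted if cp in supported)
        report[block] = (have, len(wanted))
    return report

# A made-up "basic" font: printable ASCII plus Latin Extended-A only.
basic_font = set(range(0x20, 0x7F)) | set(range(0x100, 0x180))
for block, (have, wanted) in coverage_report(basic_font).items():
    print(f"{block}: {have}/{wanted}")
```

The "wanted" counts depend on the Unicode version bundled with your Python, so treat the totals as approximate.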

I should mention that SIL is an excellent source of freeware fonts for undersupported scripts. Here's a list of the SIL fonts.

There are always more fonts out there, so I recommend a periodic check of the Gallery of Unicode Fonts and Alan Wood's Font list. You never know what you might find.


Got Coptic?


I was trying to learn more about how the Coptic alphabet interacted with Unicode, and although each and every script has its own story, I was surprised at how tricky Coptic is with respect to Unicode. Coptic is a left-to-right alphabet with minimal spacing issues - much like the Latin, Cyrillic and Greek alphabets. If you can get Gothic online, Coptic should be easy, right? Not necessarily...

Some Things You Should Know About Coptic and Unicode

Such as...

1. There is an old Coptic block and a new Coptic block

You may already know that the Coptic alphabet is an adaptation of the Greek alphabet as used in late Ptolemaic and Roman Egypt. I think most Unicode aficionados know that there is an old Coptic block (containing just the letters adapted from Demotic Egyptian script), and a new Coptic block (everything).

At one point the Unicode community was treating Coptic as a variant of the Greek alphabet with a few extra letters, but later it was decided to separate Greek and Coptic completely, so the new block was created in just the past few years.

2. The old Coptic block didn't go away

As far as I can tell, the Demotic characters were not assigned new numbers, but were left as part of the Greek block. A complete Coptic alphabet is pulling from both blocks.
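One way to see the split for yourself is to ask Python which characters named COPTIC ... LETTER live in each block. This is only a sketch using the standard unicodedata module. The block ranges are the published ones for Greek and Coptic (U+0370 to U+03FF) and Coptic (U+2C80 to U+2CFF), and the function name is made up for this example.

```python
import unicodedata

BLOCKS = {
    "Greek and Coptic": range(0x0370, 0x0400),
    "Coptic": range(0x2C80, 0x2D00),
}

def coptic_letters_by_block():
    """Group code points whose Unicode name marks them as Coptic letters."""
    found = {name: [] for name in BLOCKS}
    for block_name, block in BLOCKS.items():
        for cp in block:
            name = unicodedata.name(chr(cp), "")  # "" for unassigned code points
            if name.startswith("COPTIC") and "LETTER" in name:
                found[block_name].append(cp)
    return found

for block_name, cps in coptic_letters_by_block().items():
    print(f"{block_name}: {len(cps)} letters, U+{cps[0]:04X} to U+{cps[-1]:04X}")
```

The letters reported under Greek and Coptic are the Demotic-derived ones (Shei, Fei, Khei, Hori, Gangia, Shima, Dei), which is why a chart built from the new block alone comes up short.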

This was kind of a surprise since many Coptic charts just show the new block and miss the Demotic letters altogether.

3. New Coptic fonts and utilities are available

The new Coptic block is old enough for academic and other developers to have caught up. Here's my current list. By the way, I also recommend Quivira and MPH 2B Damase as general purpose linguistic fonts - they do cover a lot of blocks.

Coptic Fonts

The following freeware fonts are available for both Windows and Mac:

Coptic Computing and Keyboards

4. Browsers generally choke on Coptic (except Firefox 3 and Safari)

I uploaded my fonts, and checked my new chart on Safari (which is fine, but not always a Unicode superstar in my opinion). Everything worked there, but when I checked my chart in Firefox 2, all I got were the Unicode question marks of death (Whoa).

The same also happened in IE 7 and Opera. Not a pleasant surprise. For the record, I was able to get Opera and Firefox 2 (Windows) to display Coptic if I made a font with Coptic the generic default (hence my recommendation for Quivira). I was not able to get either Firefox 2 for Mac or IE 7 to display Coptic (and I did see some other forum messages indicating similar issues).

The good news is that the Coptic did encourage me to upgrade to Firefox 3, and there everything is fine - no font tweaks needed.

As I said earlier I am mystified by this because Coptic is not particularly unusual as far as Unicode blocks go. But it is working in some browsers now.

So that was my adventure with Coptic. Someday I hope I may get to use it in a real textual or linguistic application, but at least I know that I was able to update to Firefox 3 and not lose all of my other plugins!


What's New in Unicode 5.1?


Unicode version 5.1 was recently released, and includes some new code blocks as well as new specifications. As with all new versions of Unicode, there will be a time lag until the new items can be incorporated into fonts and utilities, but here is a partial list of new items.

If you're interested in the new characters, the best place to view them is at http://www.unicode.org/charts/

New Plane 0 Scripts

  • Cham (Cambodia/Vietnam)
  • Kayah Li (Thailand/Myanmar)
  • Lepcha (India)
  • Ol Chiki/Santali (India)
  • Rejang (Indonesia)
  • Saurashtra (India)
  • Sundanese (Indonesia)
  • Vai (Liberia)

Script Extensions

These blocks add characters to previously encoded scripts.

  • Cyrillic Extended-A
  • Cyrillic Extended-B
  • Arabic - characters for math, 4 Qur'anic and multiple characters for different languages
  • Indic - Malayalam, Tamil character sequences, Devanagari chandra a, Sanskrit sounds in Gurmukhi, Oriya, Telugu
  • Latin - characters for minority languages and capital German sharp S (rare)
  • Math Symbols
  • Medievalist Punctuation - for research
  • Myanmar Additions

New Plane 1 Ancient Scripts and Miscellaneous Symbols

  • Carian (Anatolia/Turkey)
  • Lycian (Anatolia/Turkey)
  • Lydian (Anatolia/Turkey)
  • Phaistos Disc (Crete)
  • Domino Tile Symbols
  • Mahjong Tile Symbols


Some Ancient Script Mega Fonts


If you've achieved total script geekdom, then you especially want fonts which support ancient scripts as well as the modern ones. Unicode has been expanding its ancient script coverage, and fonts have been catching up in the past year or two.

Some of my favorites include:

MPH 2B Damase - available free from Gallery of Unicode Fonts. It includes many scripts including the Aegean scripts, Phoenician, many cuneiform scripts, Glagolitic, and more.

Aegean, Akkadian, Unicode Symbols - These fonts and others are freeware fonts from George Douros. Just pick the ones you want. Note - he has a hieroglyphic font if you need it, but it's not Unicode compliant (Unicode is still working on this script).

Alphabetum Unicode - This font from Juan José Marcos comes highly recommended, but it does cost $15.

Code2001 - From James Kass. Technically it's still in beta, but that doesn't appear to concern anyone much.


About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage (ejp10@psu.edu) for a profile.

Comments

The standard commenting utility has been disabled due to hungry spam. If you have a comment, please feel free to drop me a line at (ejp10@psu.edu).
