ELIZABETH J PYATT: September 2007 Archives

EndNote and Unicode Input


EndNote has supported Unicode since version 8, but I'm just now getting around to testing it. As long as you activate the appropriate keyboard, Unicode input appears to be fairly straightforward, at least on the Mac (and I assume the PC).

Just a font note - I ended up changing my default font in the preferences from Helvetica to Lucida Grande. EndNote does change fonts with script, but Lucida Grande is legible and has a large set of Unicode characters. The equivalent font on the PC is Arial Unicode MS.

FYI - there may be some quirks for export. See the University of Sydney's EndNote documentation for details.

One nice feature they point out is the pair of "Translated Author" and "Translated Title" fields.


I worked on a set theory quiz for a math course in which the questions are pulled into Flash from a text file. The instructor wants to include the union (∪) and intersection (∩) symbols in his problems, so what to do?

The good news is that if you can create a UTF-8 text file and insert the symbols, it will import into Flash (at least in Flash 8). For math, your best bet is usually to use the Windows Character Map utility and insert the symbols into a Notepad text file, or to use the Macintosh Character Palette with a Text Edit or BBEdit text file. Unfortunately, the process is still a little clunky on both platforms, but it's better than in 2005.


You have to open both Notepad (Start » Accessories) and Character Map (Start » Accessories » System Tools).

For the Windows Character Map, it's a semi-clunky process. You have to switch the font to Arial Unicode MS (because it has all the math symbols), then scroll down the window until you see the math section. Then you have to select, copy and paste each symbol into Notepad.

In Notepad, when you save the file, you have to make sure the encoding menu under the file name is changed from "ANSI" to "UTF-8". Fortunately, Notepad will warn you if you try to save Unicode characters as ANSI.


In Text Edit for the Mac, you go to Edit » Special Characters to bring up the Character Palette. Click the Math option and hunt for the symbol. Highlight and click Insert to place it in Text Edit.

Once you insert the symbols, you have to make sure your encoding is set to UTF-8 during the save process. Go to the Format menu and select "Make Plain Text." Then, when you save the file you have to make sure the encoding menu under the file name is changed from "MacRoman" to "UTF-8".
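
If a scripting language happens to be handy, the same kind of UTF-8 text file can also be produced without hunting through a character picker. The following is only a minimal Python sketch, not the process used for the quiz: the function name `write_question_file`, the sample question and the file name in the usage note are all made up for illustration.

```python
from pathlib import Path

# Hypothetical quiz question; the wording is illustrative only.
# ∪ is U+222A (UNION) and ∩ is U+2229 (INTERSECTION).
QUESTION = "If A = {1, 2} and B = {2, 3}, find A ∪ B and A ∩ B."


def write_question_file(path: str, text: str = QUESTION) -> bytes:
    """Write text to path as UTF-8 and return the raw bytes saved on disk."""
    target = Path(path)
    # Naming the encoding explicitly avoids falling back to a platform
    # default such as a Windows "ANSI" code page.
    target.write_text(text, encoding="utf-8")
    return target.read_bytes()
```

Calling `write_question_file("quiz_question.txt")` would save the sample question, and each of the two math symbols takes three bytes in the resulting file.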

Reopening UTF-8 Files in Mac Text Edit

In Text Edit, if you reopen a UTF-8 file, it may be magically transformed to MacRoman (you'll see things like Á& instead of your intended character). Very annoying (grr!!). To prevent this, you must go into the Text Edit Preferences, then click the Open and Save panel. Make sure that the Plain Text Encoding options for opening and saving are both set to "UTF-8." Or you can spring for a license for BBEdit or Mellel, which are better about warning you.
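
To see why the garbage appears, it may help to reproduce the misreading outside Text Edit. This is a small Python sketch, assuming the problem is simply UTF-8 bytes being decoded as MacRoman; the function names are made up for illustration, and `mac_roman` is the name of Python's built-in MacRoman codec.

```python
def misread_as_macroman(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as if they were MacRoman."""
    return text.encode("utf-8").decode("mac_roman")


def repair_macroman_misread(garbled: str) -> str:
    """Undo the misreading: recover the original bytes, then decode as UTF-8."""
    return garbled.encode("mac_roman").decode("utf-8")


original = "A ∪ B"
garbled = misread_as_macroman(original)

# The single symbol turns into three unrelated MacRoman characters,
# because its UTF-8 form is three bytes long.
print(repr(garbled))
print(repair_macroman_misread(garbled))
```

The repair step only works if nothing has re-saved the garbled text in the meantime, so fixing the Text Edit preference is still the safer route.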

As for Flash - fonts are still a little tricky within Flash, but at least it's playing well with text files.

Superscript and Subscript

I also used Flash for a College Algebra quiz where I discovered that the XML format does not support HTML tags like <sup> and <sub>. Instead, you may need to use the Unicode characters for superscripts and subscripts.
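
One way to prepare such text is to swap ordinary digits for their Unicode superscript or subscript counterparts before the question goes into the file. This is a rough Python sketch that covers digits only; the function names are invented for illustration and are not part of Flash or any quiz tool.

```python
# Translation tables from ASCII digits to Unicode superscript/subscript digits.
SUPERSCRIPT_DIGITS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")
SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def to_superscript(digits: str) -> str:
    """Replace each ASCII digit with the matching superscript character."""
    return digits.translate(SUPERSCRIPT_DIGITS)


def to_subscript(digits: str) -> str:
    """Replace each ASCII digit with the matching subscript character."""
    return digits.translate(SUBSCRIPT_DIGITS)


# Build an exponent expression and a subscripted-variable expression.
print("x" + to_superscript("2") + " + y" + to_superscript("10"))
print("a" + to_subscript("1") + " + a" + to_subscript("2"))
```

Whether the symbols display properly still depends on the font used in Flash, since not every font includes the full set of superscript and subscript digits.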


Glyph Du Jour: Reversed Open E


This is a phonetic symbol I have seen before, but did not understand...until this semester. This is the vowel found in the standard RP British pronunciation of bird /bɜd/ (U.K.) or /br̩d/ in U.S. English.
[Image: the reversed open e (ɜ), which resembles a 3 or a backwards epsilon]
Many accents in English, including UK English, have generally lost the /r/ after a vowel while standard American has maintained this /r/. What's interesting is that even though the /r/ is theoretically gone, there are still subtle changes in pronunciation where the /r/ used to be...which is why English speakers can still "hear" an /r/ even though it may not really be there any more.
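
For anyone who wants to check exactly which characters are in the two transcriptions, Python's standard `unicodedata` module can list them. This is just a sketch: the helper name `describe` is made up, and it assumes the syllabic mark under the r is the combining character U+0329.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' line for each code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


# U.K. transcription: b + reversed open e + d
for line in describe("b\u025cd"):
    print(line)

# U.S. transcription: b + r + combining syllabic mark + d
for line in describe("br\u0329d"):
    print(line)
```

The listing makes it clear that the U.S. form has four code points even though it displays as three letters.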


The IPA Unicode Friendliness Test


When I'm doing an initial test to see if a product is Unicode friendly or not, I typically switch to my IPA keyboard and see if it will accept and display phonetic character input. Why this test?

The first reason is that I actually know my phonetic symbols and can type something pretty quickly. They're also a fairly straightforward Western-type alphabet, so there are minimal font display issues.

The second is that while developers may program specific support for East Asian, Cyrillic or Middle Eastern languages, they rarely build in IPA phonetic symbol support (unless the product is targeted towards linguists). So, if the product can handle phonetics, it's a very good sign that generalized Unicode support has been implemented.

Does it mean every script is equally supported? Probably not. The gotchas are usually RTL languages like Arabic and Hebrew and the dead scripts like Gothic and Linear B. But if you have IPA support, you probably also have basic support for Czech, Welsh, Chinese, Japanese, Korean, Russian and maybe Armenian and Georgian. That does cover a lot of territory, believe it or not.
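
The same idea can be tried on the storage side in a few lines of code. This is a hedged Python sketch of a round-trip check, not a full test of any product: the function name `survives_round_trip` and the sample word are illustrative, and it only shows whether an encoding can hold the IPA characters at all, not whether an application displays them.

```python
def survives_round_trip(text: str, encoding: str) -> bool:
    """Return True if text can be encoded and decoded back unchanged."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        # The encoding has no way to represent at least one character.
        return False


# "bird" in the U.K. transcription, with the reversed open e (U+025C).
SAMPLE = "b\u025cd"

for enc in ("utf-8", "utf-16", "latin-1", "mac_roman"):
    print(enc, survives_round_trip(SAMPLE, enc))
```

The Unicode encodings pass, while the older single-byte encodings fail because they have no slot for the phonetic vowel.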


Google Maps: Officially Unicode Friendly


I've been playing around with adding markers to Google Maps and I can report that so far it appears to be pretty Unicode friendly. At least I was able to input phonetic characters and I've seen multiple character sets in use for different maps.

I wish all apps were this easy...


About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage (ejp10@psu.edu) for a profile.


The standard commenting utility has been disabled due to hungry spam. If you have a comment, please feel free to drop me a line at (ejp10@psu.edu).
