Recently in Accents & Punctuation Category

Russian Ruble Symbol Coming to Unicode


A new Russian ruble sign has just been approved by the Central Bank of Russia. The sign is a traditional Cyrillic R (Р) with a crossed line below. You can also see other recent design candidates if interested.

The next discussion, of course, will be where in Unicode this will appear. Some have proposed U+0554, but that is the Armenian letter keh (Ք). Although the appearance is similar, there is already a discussion online of whether repurposing the Armenian letter for /k/ as a currency symbol is the best idea.

Based on previous patterns, I predict that a new code will be assigned, perhaps in the Currency block (U+20BB?) or possibly the Cyrillic block. If it's in the Currency block, it would be a new addition alongside the recent Indian Rupee Symbol (U+20B9/₹) and Turkish Lira sign (U+20BA/₺).
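To see what already occupies the code points mentioned above, here is a minimal Python sketch using the standard `unicodedata` module. The `candidates` labels and the `describe` helper are illustrative names, not anything from Unicode itself, and the names returned depend on the Unicode version bundled with your Python.

```python
import unicodedata

# Code points mentioned in the post; the labels are only for display.
candidates = {
    "Armenian keh": "\u0554",
    "Indian rupee": "\u20B9",
    "Turkish lira": "\u20BA",
}

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"

for label, ch in candidates.items():
    print(f"{label}: {describe(ch)}")
```

A code point that has not been assigned yet comes back as `<unnamed>`, which is a quick way to check whether a proposed slot is still free in your Python's copy of the Unicode database.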

From a sociolinguistic perspective, I am finding the creation of new currency symbols interesting, especially for currencies which have existed as long as the ruble and the rupee. To me this is a clear extension of the idea that a language doesn't socially exist as a "real language" unless it has its own spelling/writing system.

Apparently it's now equally important for governments to establish a unique currency sign to be counted as a "major" currency.

Postscript: 16 Dec 2013

The debate about whether the Russian Ruble sign should be in the Armenian block has reached the Armenpress news wire. Many are recommending no.


Embedded Fonts htaccess Update


htaccess Fix

Since Penn State just rolled out its WordPress service, I thought I would experiment with font embedding using SIL webfont kits I uploaded into one of my directories on another server. The good news is that it works, but I had to adjust my .htaccess file to allow Firefox to process the files.

The code that worked for me was:

<FilesMatch "\.(ttf|otf|eot|woff)$">
  <IfModule mod_headers.c>
    Header set Access-Control-Allow-Origin "*"
  </IfModule>
</FilesMatch>
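One way to confirm that the server is really sending the header is to request a font file and inspect the response. The following is a rough Python sketch using only the standard library; `cors_allows` and `fetch_headers` are hypothetical helper names, and any URL you pass in is your own.

```python
import urllib.request

def cors_allows(headers, origin):
    """True if an Access-Control-Allow-Origin value permits `origin`."""
    # Header names are case-insensitive, so normalize before looking up.
    lowered = {k.lower(): v.strip() for k, v in headers.items()}
    allowed = lowered.get("access-control-allow-origin")
    return allowed is not None and allowed in ("*", origin)

def fetch_headers(url):
    """Send a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())
```

For example, `cors_allows(fetch_headers(font_url), blog_origin)` should come back `True` once the .htaccess change is live, where `font_url` and `blog_origin` stand in for your own font file address and the site embedding it.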

Thanks Stack Overflow and the Sites Team for their help.

Correction - Not on Firefox for Mac

I thought it was working on Firefox for Mac, but something broke. Oy. It is OK on Firefox for PC.

Google Fonts not Just English

I was also happy to find that the Google Fonts options have expanded to Greek, Cyrillic and extended Latin. Just click the Script to filter for the appropriate fonts. These fonts are an option if you don't want to mess with your .htaccess file.


Icon Fonts + Unicode and Accessibility


Remember the old Symbol font in which you could type S and get the Greek sigma (Σ) symbol....without activating any Greek keyboard? Or the Wingdings font, which produced any number of cool symbols? Unicode gurus have been trying to get rid of these fonts for a while now, since different symbols could appear depending on which fonts a user had installed.

For instance, the following passage will either be a series of astrological symbols or the letters A-E depending on whether you have Wingdings installed or not (and whether your browser will let you get away with displaying Wingdings).

a b c d e

Web Fonts to the Rescue....Sort Of

This situation had forced Web developers to retire the use of Symbol and Wingdings for the Web....until the advent of the embedded Web font, which allows a Web site to send a font file to any user. Now everyone will have your font.

This has led to the rise of new "icon fonts" which can be embedded on a Website. Some of my favorites are

In theory, you can get rid of a lot of icon images and use more lightweight text, and since the font will download to everyone, you no longer have to worry about display issues or even encoding. You can even use modern CSS to perform all sorts of formatting tricks. Great news .... right?

Encoding Side Note

Generally speaking icon fonts are taking two approaches. One is the 1990s "Unicode...what Unicode?" approach. If you download and use the StateFace font, typing w will output the outline of West Virginia.

The second approach is to try to conform to Unicode. Glyphs that do have a code (e.g. a smiling face) are put in that slot and others are put into the Private Use Area. Obviously, this approach has many advantages over the original approach in that some text integrity can be maintained.
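To check which approach a given icon font's text is taking, you can look at the underlying characters. This is a minimal Python sketch, assuming you have copied the icon text into a string; `is_private_use` and `audit` are illustrative names.

```python
import unicodedata

def is_private_use(ch):
    """True if the character's general category is Co (private use)."""
    return unicodedata.category(ch) == "Co"

def audit(text):
    """List (code point, status) pairs for each non-space character."""
    report = []
    for ch in text:
        if ch.isspace():
            continue
        if is_private_use(ch):
            status = "private use"
        else:
            status = unicodedata.name(ch, "unnamed")
        report.append((f"U+{ord(ch):04X}", status))
    return report
```

A plain letter such as `w` shows up under its ordinary Latin name, which is the tell-tale sign of the first approach, while a Private Use Area character has no standard name to report at all.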

Accessibility and Other

As you might expect, there is trouble in paradise. A major potential gotcha is screen reader access because screen readers don't do fonts, only underlying text. VoiceOver recognizes many Unicode symbols, but if a font is using a symbol not in Unicode, then you're out of luck. Similarly, if a font is matching a symbol to a letter a la Wingdings, then VoiceOver will read the letter and not the symbol.

Even worse is JAWS, which has very limited symbol recognition by default. If you replace letters with symbols, JAWS will read letters. But even if you use a properly encoded font, JAWS may still not read your symbols, especially those in the Private Use Area (whereas any image with ALT text is recognised). In this case, you may need to provide a JAWS symbol text file...unless you are crafty in your use of icons.

But don't forget mobile devices. Some phones may be able to recognize embedded fonts, but some older phones may not. As recent reports are showing, more and more people are accessing the Web primarily through a phone or tablet device.

Best Practices

I think there are some tricks that can help you have your icons and still have accessible content. The key is to avoid using a non-standard icon alone in text.

  1. Try to keep icon use decorative by adding a text equivalent. Icons used as a decorative element really don't need an ALT tag anyway, so being skipped by a screen reader will not be too disruptive. You can also work around bad encoding issues this way.

  2. Use the font to generate an image. If an icon is being used alone, then an image with an ALT tag may still work. BUT an icon font can give you a head start by providing a nice set of vector-based images to work with. Sweet!

  3. OR...you can re-add a text equivalent, but hide it from sighted users. This is a sneaky way to add back in ALT text for screen readers.


MathML Test on MovableType


If you're on Firefox 4+, Safari 5+ or Internet Explorer 9 with MathType Player 3, the text below will be a MathML representation of Planck's Law.

If you want to replicate this you have to:

  1. Paste the XML in the HTML code (i.e. NOT the WYSIWYG editor)
  2. Make sure that the first line of the XML includes a link to the MathML namespace as follows:
    <math xmlns="http://www.w3.org/1998/Math/MathML">
  3. I also like to use CSS to bump the font size - those super/subscripts can get very tiny.

All I can say is - Wow. I wish all my CMS systems played this well with MathML.

$$E_{\lambda b} = \frac{2\pi h c^{2}}{\lambda^{5}\left(e^{hc/(\lambda k_{b} T)} - 1\right)}$$

P.S. If you send me comments on this blog, please note that this text may possibly depart from standard academic English. Linguists can do that, especially in a blog.


JAWS 13 and Phonetic Symbols


As a linguist, I work with lots of exotic symbols, but only a small percentage of them are recognized by the standard U.S. version of JAWS. If you work with phonetic symbols like /ə, ʃ, ʒ, ɰ/, you will need to tweak your pronunciation files.

I wrote about this in an earlier post on JAWS 6, but today I was able to document and implement, so I thought I would share the procedure.

The fix I am using will expand the symbol set within JAWS so that a character like /ə/ will be read as "schwa" (but not as its phonetic value of "uh"). Ideally, it would be nice to have a word pronunciation engine so that phonetic pronunciation values are emulated, but let's take this one problem at a time.

SBL Files

JAWS includes a set of symbol or .sbl files which match punctuation and symbol characters with a "word" (e.g., ? = "question mark"). The key is to add the character and reading to your working files.

Luckily, there is a phonetic symbol .sbl file from Robert Englebretson. There's also a math symbol .sbl file from Carroll Tech.

Add Characters to Symbol File

This procedure assumes that JAWS is using the Eloquence engine, in which case the key file to change is eloq.sbl. You will also need to have an Admin account to implement the changes.

Note: SBL files can be opened in any text editor such as Notepad.

  1. Open or download phonetic symbol .sbl file (New Window)
  2. Find the location of your eloq.sbl file. Mine was in the following path on my C hard drive:
    C:\Users\All Users\Freedom Scientific\Jaws\13.0\Settings\enu\eloq.sbl
  3. Make a (second) copy of this file and rename as eloqOld.sbl. This is your backup in case something goes wrong.
  4. Make a third copy and rename it as eloqNew.sbl. This is a temporary file to edit since you may not be able to directly edit eloq.sbl.
  5. Open eloqNew.sbl in a text editor such as Notepad. This file contains pronunciation values for multiple languages. Scroll to the language you normally use (e.g. "[American English]").
  6. Scroll to the end of the symbol list for that language.
  7. Copy and paste the list of symbols from one of the other .sbl files immediately after the final line in the list. Each symbol will be in a single line and have the format U+0001=character name
    Note: Don't worry if the format does not match the rest of the symbol list.

  8. Repeat the last step for each language you want to support. You can translate character names as needed for each language. Save and close file.
  9. Exit JAWS if it is open.
  10. Delete eloq.sbl. You may be asked for an admin password at this point.
  11. Rename eloqNew.sbl as eloq.sbl.
  12. Restart JAWS and test on a page such as IPA Characters based on Letter A with Numeric Codes
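The copy-and-paste in steps 5 through 8 can also be sketched in Python. This is only an illustration under stated assumptions: it assumes each language section starts with a bracketed header line such as "[American English]" and runs until the next such header, and `add_symbols` is a hypothetical helper name. It works on text you have already read from eloqNew.sbl and writes nothing itself, so the backup and rename steps above still apply.

```python
def add_symbols(sbl_text, section, new_lines):
    """Insert new_lines at the end of the named [section] of .sbl text."""
    lines = sbl_text.splitlines()
    header = f"[{section}]"
    start = None
    for i, line in enumerate(lines):
        if line.strip() == header:
            start = i
            break
    if start is None:
        raise ValueError(f"section {header} not found")

    # Assumption: a section ends at the next line of the form [Name],
    # or at the end of the file if there is no later header.
    end = len(lines)
    for i in range(start + 1, len(lines)):
        stripped = lines[i].strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            end = i
            break

    # Step back over trailing blank lines so new entries join the list.
    while end > start + 1 and not lines[end - 1].strip():
        end -= 1

    return "\n".join(lines[:end] + list(new_lines) + lines[end:]) + "\n"
```

Calling it once per language section mirrors step 8. Reading and writing the file with the right encoding is left out here because that depends on how your copy of the file is saved.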

Look Up Additional Codes

Each line in the SBL file has this format:

U+Codepoint=Character Name (no quotes)

For instance, if I wanted to expand the repertoire of currency symbols to include the new rupee symbol of India (₹), I would add the following to my .sbl file:

U+20B9=Rupee symbol of India

A list of Unicode charts with code points is available at http://www.unicode.org/charts
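If you already have the character itself, Python can work out the code point for you. A small sketch, where `sbl_line` is a made-up helper name and the spoken name is whatever you want JAWS to say:

```python
def sbl_line(ch, spoken_name):
    """Build one 'U+XXXX=name' entry for a single character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # ord() gives the code point; 04X pads it to at least four hex digits.
    return f"U+{ord(ch):04X}={spoken_name}"

print(sbl_line("\u20B9", "Rupee symbol of India"))
```

The printed line matches the rupee entry above, and the same call works for phonetic symbols such as schwa.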


Testing Some MP3 Sites with Halfaxa Titles


Unicode is such an esoteric subject, you sometimes wonder who's seeing the possibilities. One artist who does appreciate it is Canadian electronic musician Grimes, whose album Halfaxa contains song titles such as "ΔΔΔΔRasikΔΔΔΔ", "Sagrad Прекрасный", "† River †", along with the charmingly titled "World♡Princess" and the mathematically complex "≈Ω≈ω≈ω≈ω≈ω≈ω≈ω≈ω≈" (Αlmost Omega?).

That makes this album a great test case to check out how well your MP3 or streaming service does with Unicode. As you can see below, iTunes and Rhapsody do well, but for some reason Amazon is giving me the Unicode question mark of death (my guess is that it's because the page specifies Verdana, which doesn't have all the characters).

I haven't tested every music site, but you get the idea...

iTunes Halfaxa List

Halfaxa album list on iTunes with correct symbols

Rhapsody Halfaxa List

Halfaxa album list on Rhapsody with correct symbols

Amazon Halfaxa List (Verdana Type)

Halfaxa album list on Amazon with ?? for symbols


Math+HTML 5 in 3 Browsers


As you can see I haven't been posting here regularly. It's because I've been tied up with a11y (accessibility) including MathML.

However, I am happy to report that I was able to create an HTML5+MathML file that works in Internet Explorer AND Firefox/Safari (with some Unicode thrown in).

As a reward, I think I will write a Unicode post today.


Free ErlerDingbats Unicode Font for 2700 Block


If you've ever wanted Unicode support for snowflakes, decorative arrows, crosses and stars, then you may be interested in the free Erler Dingbats font from the Font Shop. The fonts even ship with keyboard layouts to make data entry easier.

The image below shows roughly the glyphs covered (generally in black and white). There are more characters covered in the for-fee font DD Dingbats 2.0, but even these provide some interesting possibilities in terms of documentation and even fancy bullet lists (especially if combined with font embedding).

Unicode Block U+2700-27BF


Got Double Hyphens from Word?


Unicode hasn't been part of my life enough recently, but it did emerge in a very unexpected way this week during a recent calendar upgrade.

One of the conversion tasks was for us to add group e-mail addresses so we could share calendars among each other efficiently. But when I tried to copy and paste, I got a "not found error." Here is one of these addresses (altered for security reasons):

umg-sc.foo.staff@fuyu.ucal.psu.edu

Can you spot the problem? (HINT: Try cutting and pasting into a text file.)

Given up? The problem is the hyphen. In the right font, you will see that it's not just a hyphen (U+002D or ASCII #45), but actually the more elegant and slightly longer en dash, which is U+2013 (not in ASCII). As many of you know, many databases are still sensitive to such differences, so a hyphen is just not the same as an en dash. This means searching is a FAIL.

How did the en-dash get in there if it's outside of ASCII? My guess is that it's a result of an auto-correct feature from Word which makes some formatting tweaks to enhance visual appeal. One is to change plain hyphens into a slightly longer en-dash (more favored by typographers).

Another common change is to convert plain straight quotes (" at U+0022 or ASCII #34) to "Smart Quotes" like (“ at U+201C) and (” at U+201D). Copying HTML code attributes from Word can be similarly dangerous since HTML recognizes plain quotes, but NOT fancy double quotes. Most of the time, the change does nothing, but when it comes to interacting with some systems, the reformatting makes a difference in a very annoying way.

How to catch it? In some cases, you can change the font, but many fonts make the hyphen and en-dash appear identical (Arggh!). Which leaves the old standby (test, test, test) plus some Unicode awareness (which is increasing among programmers).
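When the font won't show the difference, a few lines of Python will. This is a minimal sketch: the function names and the sample string are illustrative, and the replacement table covers only the en dash and smart quotes discussed above.

```python
import unicodedata

# Typographic characters that word processors commonly substitute,
# mapped to their plain ASCII counterparts.
ASCII_EQUIVALENTS = str.maketrans({
    "\u2013": "-",   # en dash
    "\u201C": '"',   # left double quotation mark
    "\u201D": '"',   # right double quotation mark
})

def find_non_ascii(text):
    """Return (position, code point, name) for each non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "unnamed"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]

def to_plain_punctuation(text):
    """Replace the typographic characters listed above with ASCII ones."""
    return text.translate(ASCII_EQUIVALENTS)

sample = "umg\u2013sc.foo.staff"  # made-up fragment containing an en dash
print(find_non_ascii(sample))
print(to_plain_punctuation(sample))
```

Reporting first and replacing second is deliberate, since silently rewriting text can hide where the odd character came from in the first place.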


i18n Enhancements Announced for Mac OSX 10.7 (Lion)


They're kind of scattered, but it looks like the next version of Mac OSX will be bringing lots of good enhancements for those working outside of English.

Asian Fonts and Text Input

Support for many scripts from South Asia has been lagging behind Windows, so I am personally pleased to see fonts for Bengali, Kannada, Malayalam, Oriya, Telugu and Sinhala being added (especially since I took 12 credits of Sinhala back in the day). New fonts for Tamil, Devanagari, Gujarati and Urdu are also scheduled to be added as well as for Lao, Khmer and Myanmar.

Those working with East Asian languages should be able to access improved utilities for Chinese (filtering by tones, ordering radical/stroke), Japanese Kotoeri and Vietnamese (old and new orthography). The Chinese handwriting recognition software is also scheduled to include more support for Simplified Chinese and Roman characters. Finally, Apple announced that Lion will support vertical text (typing and display).

Everyone will also be able to use a new color emoji font.

In Safari

Improvements for Safari included:

  • Math ML support in Safari
  • Improved CSS3 support including vertical text, East Asian emphasis, auto hyphenation

Non-English Accessibility

Accessibility options for those not using 100% English are now available, including VoiceOver speech in 23 languages and expanded Braille options.


About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage (ejp10@psu.edu) for a profile.

Comments

The standard commenting utility has been disabled due to hungry spam. If you have a comment, please feel free to drop me a line at (ejp10@psu.edu).
