Accents & Punctuation: August 2007 Archives

Vulgar Fractions in Unicode

I've gotten some messages recently for the TLT International Page asking why I did not have codes for fractions listed (e.g. ½ vs. 1/2), so I did some experimentation and self-reflection.

In the end, I think the fraction codes are interesting, but not generally needed. If you do need fractions to be formatted for typography purposes, CSS is actually the best solution for formatting most simple fractions.

THE TERM VULGAR FRACTION

Actually, the entities I am referring to are called vulgar fractions in Unicode/typography jargon. As far as I can tell, vulgar fractions are just ordinary fractions, and the term is meant to contrast with decimal numbers (e.g. 1/2 = 0.5). I assume the term vulgar refers to usage among the general public (from Latin vulgus, "the common people") vs. the scientific community, who presumably stick to decimals.

If you are not concerned about typography, a simple number + slash system is acceptable.

ENTITY CODES AND THEIR PROBLEMS

There are entity codes assigned to them in Unicode, but for Web purposes, I'm a little dubious about using them for the following reasons.
  1. They are inconsistently implemented. The codes for 1/2 (#189/U+00BD), 1/4 (#188/U+00BC) and 3/4 (#190/U+00BE) are in the Latin-1 block, while the codes for all the other vulgar fractions (thirds, fifths, sixths, eighths) are in the Number Forms block (the 8500s/U+2150s). That means not all fonts support all fractions equally. One font may have only 1/4, 1/2 and 3/4 but be missing the other fractions. Or the angle of the slash may be different.

  2. I noticed Dreamweaver in particular sort of has problems deciding how to display &frac12; (1/2) vs. ⅓ (1/3). It's not just me, by the way - this was also noticed by Lars Bruzelius on the CSS Discuss List.

  3. Key mathematical information could be lost. A precomposed code point fuses two numbers (numerator and denominator) into a single character. This is why MathML builds fractions from a separate numerator and denominator.
  4. Screen readers might not recognize entity codes. Screen readers are always a little behind the curve in terms of new Web standards. Although a modern screen reader might understand the code for 1/2 (½), it will likely not know what to do with &#8531; for "1/3". On the other hand, "number slash number" is more likely to make sense to a visually impaired user.
  5. Not all fractions are encoded. Many common fractions have codes, but not all of them do. If you want 1/7 or 4/9, you're out of luck and have to use the "fraction slash" (U+2044) instead.
  6. Legibility can be an issue. When using vulgar fraction codes, the numbers will be much smaller (another potential accessibility issue) and resizing them could be tricky. The CSS solution below allows for better control over your sizing.
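
To see where the precomposed fractions from point 1 actually sit, a short Python sketch using the standard unicodedata module can list them. The function name vulgar_fractions and the RANGES table are made up for illustration. The list you get depends on the Unicode version bundled with your Python (later Unicode versions have added a few fractions), so treat the output as a snapshot rather than a definitive inventory.

```python
import unicodedata

# Assumption for this sketch: we only scan the two blocks discussed above,
# Latin-1 Supplement (U+0080-U+00FF) and Number Forms (U+2150-U+218F).
RANGES = {
    "Latin-1 Supplement": range(0x0080, 0x0100),
    "Number Forms": range(0x2150, 0x2190),
}

def vulgar_fractions():
    """Return (block, char, decimal code, hex code, name, value) tuples."""
    found = []
    for block, code_points in RANGES.items():
        for cp in code_points:
            ch = chr(cp)
            name = unicodedata.name(ch, "")  # "" for unnamed code points
            if name.startswith("VULGAR FRACTION"):
                value = unicodedata.numeric(ch)  # e.g. 0.5 for one half
                found.append((block, ch, cp, f"U+{cp:04X}", name, value))
    return found

# Print ASCII-only columns: numeric reference, code point, value, name, block.
for block, ch, dec, hexcode, name, value in vulgar_fractions():
    print(f"&#{dec};  {hexcode}  {value:.4f}  {name}  ({block})")
```

The numeric value recovered by unicodedata.numeric is a single floating-point number, which illustrates point 3: the separate numerator and denominator are not stored in the character itself.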

A CSS SOLUTION

As I said before, the "number slash number" solution is usually fine for most documents, but you can use CSS to make prettier, smaller-scale fractions...but it is a wee cumbersome.
Note: This solution was originally developed by Lars Bruzelius.

First you have to shrink the numerator and the denominator to something like 75% - the slash stays at 100%. Then you have to raise the numerator up slightly (by .5 ex). You can also adjust the letter spacing depending on your font.

.den {font-size: 75%;}
.num {font-size: 75%; vertical-align: 0.5ex;}

In the HTML the code looks like this:

<span class="num">1</span>/<span class="den">7</span>

And here's what it looks like:

1/7
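
If you have many fractions to mark up, the repetitive spans could be generated rather than typed by hand. The following is only a sketch: the helper name fraction_html is hypothetical, and it assumes the .num and .den class names from the CSS above.

```python
from html import escape

def fraction_html(numerator, denominator, num_class="num", den_class="den"):
    """Build the span markup for one fraction (hypothetical helper).

    The default class names match the .num and .den rules shown above.
    """
    num = escape(str(numerator))   # escape in case the input is not a plain number
    den = escape(str(denominator))
    return (
        f'<span class="{num_class}">{num}</span>'
        f'/<span class="{den_class}">{den}</span>'
    )

print(fraction_html(1, 7))
# <span class="num">1</span>/<span class="den">7</span>
```

The slash stays outside both spans so that it keeps the full font size, as described above.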

So although I didn't like CSS for superscripts, I do think it is just the trick for vulgar fractions.

Site Explaining Western European Character Sets

I can't believe I missed this, but Unicode guru Alan Wood has a great chart explaining the differences between Windows-1252 (ANSI) vs. Latin 1 (ISO-8859-1) vs. Mac Roman.

http://www.alanwood.net/demos/charsetdiffs.html

For people new to Unicode, this is the chart that explains why non-English characters don't always come out the same between Mac and Windows. Characters like é were assigned different code numbers on Mac vs. Windows (0x8E in Mac Roman, 0xE9 in Windows-1252).

These days Mac and Windows can usually translate between each other's encodings (technically both now use Unicode)...but glitches still occur from time to time.
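
The kind of mismatch the chart documents can be reproduced with Python's built-in codecs. This is a minimal sketch, not a substitute for the chart: the helper name byte_values and the three sample characters are just illustrative choices.

```python
# Codec names as Python spells them for the three encodings in the chart.
ENCODINGS = ["cp1252", "latin-1", "mac_roman"]

def byte_values(ch):
    """Map each encoding name to the hex byte used for ch (None if absent)."""
    result = {}
    for enc in ENCODINGS:
        try:
            result[enc] = ch.encode(enc).hex()
        except UnicodeEncodeError:
            result[enc] = None  # character does not exist in this encoding
    return result

# e-acute, u-umlaut and the euro sign, written as escapes to keep this ASCII.
for ch in ["\u00e9", "\u00fc", "\u20ac"]:
    print(f"U+{ord(ch):04X}", byte_values(ch))
```

For é, Windows-1252 and Latin-1 agree on one byte while Mac Roman uses a different one, so a file saved on one platform and read with the other's default encoding shows the wrong character. The euro sign has no slot in Latin-1 at all, which is one place where Windows-1252 and Latin-1 diverge.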

About The Blog

I am a Penn State technology specialist with a degree in linguistics and have maintained the Penn State Computing with Accents page since 2000.

See Elizabeth Pyatt's Homepage (ejp10@psu.edu) for a profile.

Comments

The standard commenting utility has been disabled due to hungry spam. If you have a comment, please feel free to drop me a line at (ejp10@psu.edu).
