Testing Web browsers as Platforms for Hebrew Text Publishing

Given that one important aspiration of the Open Siddur Project is the development of a web application for anyone to edit, maintain, and share the content of a personal prayerbook that they can craft online, I’m very concerned with how well web browsers today display the Hebrew language with all of its diacritical marks (vowels, cantillation) and punctuation. Indeed, the Open Siddur Project has an international scope, so ostensibly, we wish to support liturgy and liturgy-related text (the creative content of Jewish spiritual practice) in every language Jews speak or have ever spoken. Combine a digital font or fonts that support the full range of human written languages with a platform that correctly displays such fonts, and you have one basis for an excellent collaborative publishing platform.

So for the last year, I’ve been working on a series of tests to determine how well some popular and some less well-known web browsers perform in supporting the technology for displaying Hebrew text. In particular, I’m interested to see which browsers are failing to use a web standard called CSS @font-face to properly display Unicode Hebrew fonts that support the full range of Hebrew diacritics and which contain excellent font logic for diacritical positioning. I’m also keen on seeing which browsers might even be failing at recognizing bidirectional (BIDI) and right-to-left (RTL) text, given that Hebrew is read RTL and it’s not uncommon to find עִבְרִית and other left-to-right (LTR) languages written together with one another.
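To make the two concerns concrete, here is a minimal sketch in Python (my choice for illustration; it is not part of the browser tests themselves). It uses only the standard `unicodedata` module to show how a pointed Hebrew word breaks down into base letters and combining marks, and how a string can be flagged as mixing RTL and LTR characters. The helper names `count_letters_and_marks` and `has_mixed_direction` are hypothetical.

```python
import unicodedata

# The word "Hebrew" in pointed Hebrew, spelled with escapes so the example
# does not depend on how the source file is encoded.
IVRIT = "\u05e2\u05b4\u05d1\u05b0\u05e8\u05b4\u05d9\u05ea"


def count_letters_and_marks(text):
    """Return (base_letters, combining_marks) for a string.

    Vowel points and cantillation marks have a non-zero canonical
    combining class; base letters have class 0.
    """
    marks = sum(1 for ch in text if unicodedata.combining(ch) != 0)
    letters = sum(1 for ch in text if ch.isalpha() and unicodedata.combining(ch) == 0)
    return letters, marks


def has_mixed_direction(text):
    """True if the string contains both strong RTL and strong LTR characters."""
    classes = {unicodedata.bidirectional(ch) for ch in text}
    return "R" in classes and "L" in classes


print(count_letters_and_marks(IVRIT))
print(has_mixed_direction("Hebrew is " + IVRIT))
```

A font and a browser both have to get the combining marks right for the word to look correct; the bidirectional classes are what a browser's layout engine consults when Hebrew and English share a line.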

With these tests I also hoped to find some simple way by which an individual browsing the web could troubleshoot whether the problem is in their browser, their browser’s settings, or in the web page itself, when they find a web page that is poorly displaying Hebrew. I learned a great deal in the process, and so I also made a page for web designers and coders to learn the correct way to craft a web page that will correctly display Hebrew.

Cross-posted to the Open Siddur Project.

Varady’s Fabulous Flying Keyboard

Varady's Fabulous Flying Keyboard (Level 1)

Behold my Flying Keyboard!

Ever want a keyboard configuration you could switch to for odd characters‽ You know, so you could add an Ḥ in Ḥanukah without copying and pasting from this page (or your favorite “Character Viewer” program).

Well, I made such a keyboard configuration that you can download and install on your very own computer. (It only works on Windows, alas.) Download, unzip, and install.

The keyboard layout includes glyphs mapped onto the universal and international standard Unicode character encoding schema. You’ll have to use the layout along with a font (e.g. DejaVu Sans) that supports all of these glyphs. Such fonts are installed with the popular, cross-platform, free/libre and open source LibreOffice application.
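If you are curious what is actually behind a character like Ḥ, here is a small sketch using Python's standard `unicodedata` module (Python is my choice for illustration; the keyboard layout itself involves no code). It looks up the Unicode name of Ḥ and of the interrobang, and shows that Ḥ can also be written as a plain H followed by a combining dot below.

```python
import unicodedata

H_DOT_BELOW = "\u1e24"   # the precomposed character: Ḥ
INTERROBANG = "\u203d"   # ‽

# Official Unicode names of the two characters.
print(unicodedata.name(H_DOT_BELOW))
print(unicodedata.name(INTERROBANG))

# Canonical decomposition (NFD) splits the precomposed letter into a base
# letter plus a combining mark; NFC puts them back together.
decomposed = unicodedata.normalize("NFD", H_DOT_BELOW)
print([f"U+{ord(ch):04X}" for ch in decomposed])
recomposed = unicodedata.normalize("NFC", decomposed)
```

A font that lacks the precomposed glyph may still draw the decomposed pair, which is one reason the choice of font matters as much as the keyboard layout.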

Varady's Fabulous Flying Keyboard (Level 2)

I was tired of using the Windows Character Viewer to access a number of useful character glyphs including the Ḥ. So I made my own keyboard layout using the proprietary but free-without-fee program called “Microsoft Keyboard Layout Creator v1.4.” (Windows only, although it’s also possible to do something similar for Macs and Linux.) If you want to hack the keyboard layout I made, I’ve included the layout in a directory named “source” in the zip which also includes the images above.

Mac keyboard layouts are directly modifiable using a third-party, free-without-fee tool called Ukelele. As for Linux, like much of the rest of the configuration on *nix-type systems, keyboard layouts for the X Window System are defined in easily editable text files. See this page for more info.

UPDATE: For Mac users, Steg adds this useful information:

Go to your System Settings and find the Language/Keyboard settings and add the input method “U.S. Extended”. Then start using it. To type a Ḥ type option-x and then H.

GNU General Public License + Font Exception

 

UPDATE: I managed to convince the army of volunteer editors to approve an article I wrote on the GPL+FE (General Public License with font exception clause). This comes after my initial disastrous foray into Wikipedia article posting. For those counting, this is my third approved article on Wikipedia.

<hr />

Lately, for the Open Siddur Project, I’ve been putting together a font package for more easily distributing extant free/libre licensed Unicode Hebrew fonts. These fonts tend to be licensed with SIL’s Open Font License (e.g., EzraSIL and Cardo), or the GNU General Public License (GPL, e.g., Maxim Iorsh’s Culmus Project fonts). Because fonts are used differently from other software code, some conflicts arose which necessitated an exception to the GPL specifically for fonts. Unfortunately, the GPL font exception statement is somewhat buried in the Free Software Foundation’s GPL FAQ. Because the important information on the GPL+FE is not collected in any single post on the Internet, I’ve reformatted it and shared it below.

From the Free Software Foundation’s GNU General Public License FAQ, “How does the GPL apply to fonts?”:

Font licensing is a complex issue which needs serious consideration. The following license exception is experimental but approved for general use. We welcome suggestions on this subject—please see this explanatory essay and write to licensing@gnu.org.

To use this exception, add this text to the license notice of each file in the package (to the extent possible), at the end of the text that says the file is distributed under the GNU GPL:

As a special exception, if you create a document which uses this font, and embed this font or unaltered portions of this font into the document, this font does not by itself cause the resulting document to be covered by the GNU General Public License. This exception does not however invalidate any other reasons why the document might be covered by the GNU General Public License. If you modify this font, you may extend this exception to your version of the font, but you are not obligated to do so. If you do not wish to do so, delete this exception statement from your version.
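For a font package with many files, the instruction above (append the exception to the end of each file's GPL notice) can be scripted. The following is only a sketch under stated assumptions: the function name `add_font_exception` and the sample notice are hypothetical, and how you locate the notice inside each font file is left to you. The exception wording is copied from the FSF text quoted above.

```python
# The exception wording, as quoted from the FSF's GPL FAQ above.
FONT_EXCEPTION = (
    "As a special exception, if you create a document which uses this font, "
    "and embed this font or unaltered portions of this font into the document, "
    "this font does not by itself cause the resulting document to be covered by "
    "the GNU General Public License. This exception does not however invalidate "
    "any other reasons why the document might be covered by the GNU General "
    "Public License. If you modify this font, you may extend this exception to "
    "your version of the font, but you are not obligated to do so. If you do not "
    "wish to do so, delete this exception statement from your version."
)


def add_font_exception(notice):
    """Append the font exception to a GPL license notice.

    The notice is returned unchanged if it already carries the exception,
    so running this twice over the same file is harmless.
    """
    if FONT_EXCEPTION in notice:
        return notice
    return notice.rstrip() + "\n\n" + FONT_EXCEPTION + "\n"


# Hypothetical notice text, for illustration only.
sample_notice = "This font is distributed under the GNU General Public License."
print(add_font_exception(sample_notice))
```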

The drafter of the GPL+FE statement above explained the need for the GPL+FE in the following post, “Font Licensing” (FSF 2005):

Font Licensing

by novalis Contributions — last modified May 17, 2010 16:43

By David “Novalis” Turner

There has been some recent confusion about font licensing. Since I wrote the font exception, let me tell you a bit about where we are, and how we got there, and what this all means to you.

First, in the US, the copyright status of fonts is somewhat confused. A font face — that is, the look of a font, is not copyrightable (see Eltra Corp. v. Ringer, 579 F.2d 294 (4th Cir. 1978)). But font “programs” (truetype fonts, for example) are. Another ruling has extended the definition of “programs” to include certain outline data. Why this outline data is not equivalent to a font face, nobody knows. Helpfully, the copyright office has also issued contradictory statements on this. I don’t know how font copyright works in other countries.

What this means is that no font is going to affect the distributability of a printed document in the US. Further, merely referencing the font (as in the CSS font-face: caslon;) does not create a derivative work of that font. So why did we worry about font licensing at all?

The situation we were considering was one where a font was embedded in a document (rather than merely referenced). Embedding allows a document to be viewed as the author intended it even on machines that don’t have that font installed. So, the document (a copyrighted work) would be derived from the font program (another work). The text of the document, of course, would be unrestricted when distributed without the font.

This isn’t an artifact of the GPL; it’s just the way fonts work. Proprietary fonts often explicitly forbid embedding. So, if you want to send your document off to a printing service, the printing service needs to buy another copy of the font.

I was unhappy with even this amount of influence for fonts, because (a) it’s rarely what font authors intend and (b) it’s possible that some applications do embedding behind the user’s back. The situation seemed to me to be similar to the case of the runtime libraries which GCC automatically includes in its output (and which are licensed to permit inclusion in proprietary software). So, I wrote the font exception you see on our web site.

The reason the exception is so limited is that we’re worried about someone extracting a font from a document, and redistributing it. Extraction is, in my view, the major issue that a font license must confront. Because I haven’t been able to come up with a license which correctly handles embedding and extraction in all cases, I’ve restricted this exception to unaltered fonts. This means that someone can’t use embedding as a way to distribute a modified version of a font under restrictive terms. If you have suggestions for how to write a license which better handles extraction, please let us know. We haven’t had time to give this as much thought as we’ve given some of the other issues involved in free licensing. We’re especially interested in hearing from font creators at licensing@gnu.org.

Text Cloud of the Omphalos

Behold, my Omphalos as digested arithmetically (with some aesthetic treatments) by Jonathan Feinberg’s text cloud application over at wordle.net. Makes for a rather elegant visual poem, no? The wordle engine accepts site URLs, RSS feeds, or giant gobs of text. The latter is what I fed it after copying the source of my Atom feed and removing all the links, HTML, and other XML cruft using NoteTab. Hat tip to Jamais Cascio over at Open the Future for sharing the coolness.

The application provides some control over the appearance of the cloud. You can configure how many words appear (I chose 200). There are also settings for the orientation of the words (vertical/horizontal), palette, and font choice.

Some comments. It doesn’t appear as if the wordle engine is context-sensitive to words that appear in close proximity to each other; place names like Bond Hill and Baton Rouge are thus not recognized as such. It would also be nice if common words such as “like” and “also” could be filtered out or relegated to the background as glue for more significant words like “hierophant” and “cosmogonic”.
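I don't know how the wordle engine counts words internally, but the basic idea, including the stop-word filter I'm wishing for, can be sketched in a few lines of Python. Everything named here (`strip_markup`, `top_words`, the tiny `STOP_WORDS` set) is hypothetical and only for illustration; a real stop-word list would be much longer, and a regex is a crude way to strip markup.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list for illustration.
STOP_WORDS = {"like", "also", "the", "a", "an", "and", "of", "to", "in"}


def strip_markup(text):
    """Crudely replace anything that looks like an HTML/XML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def top_words(text, limit=200, stop_words=STOP_WORDS):
    """Return the `limit` most frequent words, ignoring case and stop words."""
    # [^\W\d_]+ matches runs of letters, including non-ASCII ones.
    words = re.findall(r"[^\W\d_]+", strip_markup(text).lower())
    counts = Counter(word for word in words if word not in stop_words)
    return counts.most_common(limit)


sample = "<p>Cosmogonic myths, like cosmogonic maps, also matter.</p>"
print(top_words(sample, limit=3))
```

The counts would then drive the font size of each word in the cloud. Recognizing two-word place names would need something more, such as counting adjacent word pairs.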

Still, looking into the word cloud as a mirror of my writing over the last three years or so is interesting. All those music-related terms are surely the result of importing all the posts I made over at mog.com in 2006 and 2007. Should I be as surprised as I am that this blog is so “Jewish”? Probably not.

Joe Lamantia has written more about text clouds here. (A tag cloud with all the tags and categories of articles posted at the Omphalos appears on the right sidebar.)

The Forbidden iPod: HFS+ on Windows

Last year around this time I was thinking about mp3 players. My trusty old Archos Jukebox 20 Studio just wasn’t cutting it anymore, even with its ROM flashed with open source Rockbox firmware. Yes, the Archos was a solid brick of an mp3 player, had a simple yellow LCD display, USB 1.1, and a very short battery life which required me to carry around its AC adapter wherever I went, but that’s not the reason I gave it up. I wanted “Album Shuffle”: the means for shuffling your music by random album rather than by random song. This is an important feature if you want to listen to any album that isn’t an 80s pop album with only one or two good songs on it, for example, Vivaldi’s Four Seasons or Pink Floyd’s Wish You Were Here. The order of tracks, representing movements or songs in a larger themed composition, matters. (I’ve written more about Album Shuffle here.)
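For the curious, album shuffle is simple to describe in code: randomize the order of the albums, but keep each album's tracks together and in their original sequence. This is only a sketch of the idea in Python, not how any particular player implements it; the `album_shuffle` function and the (album, track number, title) tuples are hypothetical.

```python
import random


def album_shuffle(tracks, rng=None):
    """Shuffle by album: random album order, original track order within each.

    `tracks` is a list of (album, track_number, title) tuples.
    """
    rng = rng or random.Random()
    albums = {}
    for album, number, title in tracks:
        albums.setdefault(album, []).append((number, title))
    order = list(albums)
    rng.shuffle(order)          # only the albums are shuffled
    return [
        (album, number, title)
        for album in order
        for number, title in sorted(albums[album])   # tracks stay in sequence
    ]


library = [
    ("Wish You Were Here", 2, "Welcome to the Machine"),
    ("The Four Seasons", 1, "Spring"),
    ("Wish You Were Here", 1, "Shine On You Crazy Diamond (Parts I-V)"),
    ("The Four Seasons", 2, "Summer"),
    ("The Four Seasons", 3, "Autumn"),
    ("The Four Seasons", 4, "Winter"),
]

for album, number, title in album_shuffle(library):
    print(album, number, title)
```

Ordinary song shuffle would instead call `rng.shuffle` on the whole track list, which is exactly what scatters the movements of a larger work.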

Then I noticed that iPods had album shuffle. The new players from Cowon and Archos did not, nor did any others that fancied themselves iPod competitors. But I still wasn’t convinced to buy an iPod. My trusty if heavy and slow Archos had a (then enormous) 100 GB hard drive that I had installed myself, and the largest iPod then available was 60 GB. Ahh, but just before my birthday Apple announced the release of a new 160 GB iPod. I was won over. Soon I gifted myself a new iPod Classic 160 GB.

iPod Management

When it arrived, the iPod’s hard drive came formatted with Apple’s native file system, HFS Plus (HFS+). As the Windows operating systems cannot natively read HFS+ drives and my Thinkpad runs Windows XP, iTunes reformatted the iPod with FAT32, a file system engineered by Microsoft. At the time I didn’t think too much about HFS+ vs. FAT32; I was just happy that the iPod was working. And so, I put all concerns about file fragmentation and the need to periodically defrag FAT32 volumes to the side, and got to work filling the iPod up with good music and videos.

Over the last year I’ve learned how to corrupt my ipod’s database (and how to fix it painlessly) by avoiding iTunes as much as possible. iTunes had the advantage of supporting Album shuffle, but I preferred to use Winamp with the Album List plug-in for listening to albums on my computer. I had some success using Floola (which does not support Album shuffle) and Floola is my choice ipod manager on my Linux boxen. But on my Thinkpad running Windows XP, I was more interested in whether there were any plug-ins for Winamp that could suffice as a fully featured alternative to iTunes.

Looking at Winamp I discovered that it supported iPods through a plug-in bundled with the Winamp installer called pmp_ipod. Trying it out I was underwhelmed by its poor support of album cover art on the ipod, a feature I had really come to love. Then I discovered ml_ipod — an open source winamp plugin written by independent developers that could do (almost) everything pmp_ipod could do but better. The only thing I would need iTunes for would be for occasional firmware updates. ml_ipod support was fairly well documented on an online wiki and any further questions could be pursued on an active support forum hosted at Winamp. I’ve been using ml_ipod since January and have donated money to the further development of the plug-in.

File Fragmentation in FAT32 vs. HFS+

A few weeks ago I began to wonder again what my iPod’s FAT32 volume file fragmentation looked like. Unsurprisingly, after tens of thousands of file transfers, the iPod’s music, video, database and artwork files were critically fragmented according to Diskeeper, a Windows defrag tool. A fragmented file system meant that my iPod needed to work harder and run slower than it should. The answer to a fragmented iPod file system isn’t defragging it though. Ever wonder whether you should defrag your iPod? Don’t waste your time. Defragmenting an iPod over USB takes a LONG time. It is much, much faster to simply do a full restore from your computer’s existing archive of music. (Before doing so, make sure you have an archive of all your iPod’s music.)

Even after I initialized and reloaded my FAT32 iPod, I found that the iTunes database of music files as well as the artwork database of cover art were still fragmented, just less so. I began to explore what benefits there might be to managing the iPod with its original HFS+ rather than FAT32. I was impressed to find that HFS+ drives do not suffer from the same fragmentation problems as FAT32 drives. As this comparison of file systems shows, the main reason for the lack of fragmentation in HFS+ is that, unlike FAT32, HFS+ supports extents. Wikipedia explains:

An extent is a contiguous area of storage in a computer file system, reserved for a file. When starting to write to a file, a whole extent is allocated. When writing to the file again, possibly after doing other write operations, the data continues where the previous write left off. This reduces or eliminates file fragmentation.
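To see why extents help, here is a toy model in Python. It is emphatically not how FAT32 or HFS+ actually allocate space; it just imitates two files being written at the same time, one block per turn, on an empty disk. With an extent size of 1 every write grabs the next free block; with a larger extent each file reserves a contiguous run in advance. The function names and the block-list model are my own assumptions.

```python
def write_interleaved(files, chunks, extent_size=1, disk_size=64):
    """Alternate one-block writes among `files` and return the disk layout.

    Each file reserves `extent_size` contiguous blocks whenever it runs out
    of reserved space. The disk is a list whose entries name the owning file.
    """
    disk = [None] * disk_size
    reserved = {name: [] for name in files}   # blocks reserved but not yet written
    next_free = 0
    for _ in range(chunks):
        for name in files:
            if not reserved[name]:
                reserved[name] = list(range(next_free, next_free + extent_size))
                next_free += extent_size
            disk[reserved[name].pop(0)] = name
    return disk


def count_fragments(disk, name):
    """Count the contiguous runs of blocks belonging to `name`."""
    runs, previous = 0, None
    for owner in disk:
        if owner == name and previous != name:
            runs += 1
        previous = owner
    return runs


for size in (1, 4, 8):
    layout = write_interleaved(["A", "B"], chunks=8, extent_size=size)
    print(size, count_fragments(layout, "A"))
```

With two files of eight blocks each, the block-at-a-time run leaves file A in eight pieces, an extent of four leaves it in two, and an extent of eight keeps it whole.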

Additionally, because HFS+ was specifically engineered to minimize disk access and quickly access individual files, its utility for the iPod seems obvious. This advantage of HFS+ over FAT32 was summarized well by the user “Millenium” over on the MacNN web forum in a 2006 thread on HFS+ vs. FAT32:

You may hear that HFS+ is slower than FAT32. That’s true in some cases, but not in others. In particular, HFS+ does not do very well in tasks where you need to access many small files at once…

For looking up individual files, however, HFS+ is actually one of the fastest filesystems out there, and has been for a long time. This all comes from the way that HFS+ stores its data: when you’re working with relatively few files it’s better, but when you’re working with many files at once it isn’t as good. It’s a design tradeoff, and whether it will be better or worse for you in this regard really depends on how you use your computer.

The original Macintosh File System (MFS, from which HFS and then HFS+ directly descend) was created in an era when most people used floppies to store all of their data. The same is true of FAT16, which is where FAT32 comes from. Apple’s engineers decided that since floppies were so slow, people and applications would try to minimize disk access in general, and so they optimized their filesystem to work best under those conditions. It worked extraordinarily well for the time, and even today there aren’t many better filesystems for people who work under those conditions.

In other words, one of the best file systems available for the iPod is HFS+ (especially compared with FAT32). Unfortunately, FAT32 is not a comparable alternative to HFS+. FAT32′s presence as an alternative file system for the ipod simply reflects the lack of support in Windows OSes for the more advanced HFS+ file system.

Perils of FAT32 to HFS+ Conversion

As a result of learning this, I became increasingly interested in converting my FAT32 ipod to HFS+. Besides fragmentation and reliability, I also wondered if a change in ipod file systems might affect the file transfer speed over USB 2.0. File transfer speeds over USB 2.0 with my FAT32 formatted ipod averaged around 6000 kB/s. Would HFS+ perform worse or better?

General information on converting the iPod from FAT32 to HFS+ was plainly lacking, and the specific recommendations I did find advised iPod users to accept FAT32. I was on my own. To access HFS+ formatted drive volumes on Windows I’d need to install special software like MacDrive by Mediafour. So to begin, I downloaded the MacDrive software and formatted my iPod to HFS+. So far so good. I wanted to make certain that my firmware was installed correctly, so I proceeded to initialize my iPod with iTunes, and then re-transfer my mp3s and mp4s to the newly formatted iPod with Winamp + ml_ipod. This seemed to work fine (although I didn’t see any discernible change in file transfer speeds). But afterward, I was surprised to find that my iPod was still formatted with FAT32! I soon learned that as part of its restore sequence, iTunes for PC will automatically reformat HFS+ formatted iPods with a FAT32 file system. It also copies iPod for PC firmware that seems tailored specifically for FAT32.

In my next attempt, I reformatted the iPod to HFS+ with MacDrive, ignored iTunes altogether, and did a full restore onto the iPod with ml_ipod in Winamp. ml_ipod recognized the drive and transferred the files. This time the file transfer speed was much higher: 9500 kB/s vs. 6000 kB/s. I was impressed, but once the transfer completed, I found the iPod would not recognize any of the files that had been transferred. The iTunesDB database was not corrupt and the actual data files were all present, so what could be the problem? Was it a problem with the iPod’s firmware not being able to read HFS+?

I found the answer on a wiki page written for Gentoo Linux users on how to update iPod firmware. Simply formatting the iPod’s drive to HFS+ would not work because HFS+ formatted iPods have three partitions: the first partition contains the partition table, the second partition the iPod for Mac firmware, and the third partition the media files and databases. (FAT32 formatted iPods have two partitions: a hidden one for the iPod for PC firmware, and the other for the media.) The ability to create these HFS+ partitions on the iPod isn’t available on Windows, even with MacDrive. MacDrive can format a disk to HFS+ but does not provide the ability to create three separate partitions on the disk. And to make the iPod work, I would also need the correct iPod firmware installed in its respective partition. Could iTunes solve the problem? iTunes for PC will neither create the three HFS+ partitions nor copy anything but iPod for PC firmware to a FAT32 partition. The only solution I could imagine for copying the correct firmware and creating the correct partitions was to connect my iPod to a computer running OS X and restore it using iTunes for Macintosh.

So iPod USB cable in hand, I visited my friend Isaac S. and his Macbook, and soon afterward I had a functioning ipod with the correct HFS+ partitions and firmware. (Thanks Isaac!) Back home, I found that with MacDrive installed on my Thinkpad, ml_ipod and winamp had no difficulty recognizing the HFS+ volume. Transfer speeds hovered mid 8000 kB/s. Success!

The conversion did come with caveats. After the full transfer was completed I noticed that there was less free space available on the iPod. The iPod with HFS+ used approximately 5% more storage for the same files than when it was formatted with FAT32. (116 GB with FAT32 vs. 122 GB with HFS+, out of 148 GB total.) I don’t know why, but perhaps it has something to do with the extents allocated for each file in HFS+ (described above). (See update below on this weird problem.)
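Here is the arithmetic behind those numbers, as a short Python sketch. The 116 GB and 122 GB figures and the 6000 kB/s FAT32 rate are the ones I measured; the 8500 kB/s figure is my assumption for "mid 8000", and the conversion of 1 GB to 1,000,000 kB is also an assumption about how the tools report sizes.

```python
FAT32_USED_GB = 116
HFS_USED_GB = 122

# Extra space HFS+ needed for the same files, relative to FAT32.
overhead = (HFS_USED_GB - FAT32_USED_GB) / FAT32_USED_GB
print(f"HFS+ overhead: {overhead:.1%}")


def transfer_hours(size_gb, rate_kb_per_s):
    """Hours needed to copy `size_gb` at a steady rate (assumes 1 GB = 1,000,000 kB)."""
    return size_gb * 1_000_000 / rate_kb_per_s / 3600


FAT32_RATE = 6000   # kB/s, measured on the FAT32 formatted iPod
HFS_RATE = 8500     # kB/s, assumed value for "mid 8000" on HFS+

print(f"FAT32 full reload: {transfer_hours(FAT32_USED_GB, FAT32_RATE):.2f} h")
print(f"HFS+ full reload:  {transfer_hours(FAT32_USED_GB, HFS_RATE):.2f} h")
print(f"Speed ratio: {HFS_RATE / FAT32_RATE:.2f}x")
```

Under those assumptions, a full reload of the same library drops from a bit over five hours to a bit under four, which matters when a full restore is the recommended cure for fragmentation.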

Because ml_ipod was designed to restore FAT32 formatted iPods, I don’t think I’ll be able to use ml_ipod’s “restore or initialize ipod” feature anymore, nor will I be able to rely on iTunes for PC for the occasional firmware update. Rather than buy a whole new Apple computer for this task, I’m looking at VMware Workstation, a virtualization environment within which I could run OS X on Windows. Another option is to use another piece of software by Mediafour called XPlay.

Conclusion

I hope this story helps anyone else out there wondering whether to convert their FAT32 iPods back to HFS+ (and how exactly to go about doing that). I think it’s a worthwhile project because of the advantages that HFS+ provides over FAT32: better reliability, the lack of file fragmentation, and moderately faster file transfers. The disadvantages are the need to purchase HFS+ software for Windows like MacDrive, and no longer being able to depend on iTunes for firmware updates or on ml_ipod for the occasional full restore and iPod initialization. (You can probably get around the latter problems by installing Mac OS X in a VMware virtual machine, but then you’d need to buy a copy of VMware Workstation and OS X as well. Or you can buy a Mac mini, MacBook, or other Apple computer.) If this doesn’t faze you, then you should also expect that, due to differences between the two file systems, HFS+ will use more storage space on your iPod than FAT32. On my iPod, HFS+ used 5% more drive space with the same files loaded onto it.

If you want to run an HFS+ formatted ipod on a PC running Windows, follow these steps:

  1. If your ipod is formatted FAT32, restore it using iTunes for Mac on a friend’s Macintosh computer. (iTunes for PC will only format your ipod to FAT32.)
  2. Install HFS+ reading/writing software for Windows like MacDrive by Mediafour.
  3. Optional but recommended: Install ml_ipod for Winamp and transfer your files to your HFS+ formatted ipod.

In the comments please let me know if you’ve found other ways to partition ipods correctly for HFS+ without using iTunes for Mac. Besides file transfer speed changes and degrees of fragmentation, I’m also interested in documenting any other reported benefits of using HFS+.

UPDATE: A week later and I’ve reloaded my iPod once more under slightly different conditions. The important difference is that this time, the strange 5% storage space loss from my earlier adventure didn’t manifest. Instead of restoring the iPod using ml_ipod, I used XPlay (ver. 3.0.2), another piece of software by Mediafour. I’m not exactly certain what made the difference… but my iPod certainly seems happier having been reformatted with MacDrive and restored with XPlay. XPlay has a trial period of 30 days or 20 launches, and I’ll be curious to know whether the software makes any difference to managing an HFS+ formatted iPod beyond its restore feature. I’ll provide another update to this post when I do.