The following tests help identify errors in the way graphical web browsers render Unicode Hebrew text encoded as UTF-8. The tests aim to be comprehensive, beginning with whether the browser displays UTF-8 text at all and, if so, which font it uses by default for Hebrew. The tests then investigate how well Unicode Hebrew is supported, with proper font positioning, using CSS @font-face. Care has been taken to use only fonts that support correct diacritic positioning; however, we're also interested in how errors are displayed with fonts that do not support the correct positioning of all diacritics. Finally, we look at @font-face support within SVG files (only SVG 1.1 for now, but we're very curious about SVG 2).
I wrote these tests to aid developers in testing various web browsers as platforms for their web applications. To contact me to correct errors or otherwise help to improve this page, please write to me (Aharon Varady)
here.
When I first made these tests in late 2010, the Open Siddur Project was interested in which browsers were failing basic Unicode and bidirectional text support for right-to-left (RTL) languages like Hebrew, and which browsers were not yet fully supporting Unicode (UTF-8) Hebrew fonts with CSS @font-face. Since then, we've seen most browsers come to support @font-face, although color-coding diacritics remains a challenge and support for @font-face in SVG 1.1 is hit-and-miss.
To see an old chart listing how well each browser supports Unicode Hebrew with CSS @font-face, see here. The chart also includes support of Hebrew in text-based browsers. (mlterm and Apple Terminal provide the best CLI-browser support.)
For a how-to guide on adding Hebrew with CSS @font-face to your webpage, use the source code of this page as a reference. There's also an older guide that needs updating here.
If you are looking for the Open Siddur Unicode and open source licensed font pack, see
here.
If you are looking for instructions on how to set up your keyboard and fonts to type in Hebrew, see
here.
Baseline Informational Tests
These tests provide you, the user, with some basic information about the default configuration of your browser and possibly your operating system.
Default Character Encoding and BIDI (bi-directional) Text
This test should reveal whether your browser is configured to use UTF-8 as its default character encoding, or whether it is set to ISO-8859-1 or some other encoding. Sometimes the browser will right-align the text when it recognizes that the text is in the Hebrew code range, although, in my experience, this is uncommon.
The test uses iframes to display plaintext Hebrew encoded in UTF-8.
In the following iframe, there is no Content-Type header informing the browser to recognize the referenced TXT file's text data as encoded in the standard Unicode (UTF-8) format. So, if the following looks like mojibake (gibberish), then your browser either does not have UTF-8 set as the default character encoding or something else is amiss.
Compare the above with the following text in which the Content-Type is declared as plaintext UTF-8 in an .htaccess file, thus telling the browser to display the text as UTF-8.
If you'd like to test this yourself on your own server, use the following code in an .htaccess file in the directory serving the TXT files.
<Files "pangram-utf8.txt">
ForceType 'text/plain; charset=UTF-8'
</Files>
Default Browser Font
This test should reveal which font the web browser uses by default when it encounters UTF-8 Hebrew declared as such in the page source. It indicates the default font (if any) set for viewing Hebrew text in your browser without invoking CSS @font-face.
If you see any empty rectangles or other generic space marks in the text below, you may want to change the default browser font to one that is known to support the positioning of the full range of Hebrew Diacritics (see
here for such fonts).
CSS @font-face tests
If the browser renders the following Unicode Hebrew fonts using CSS @font-face, the browser passes this test.
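As a rough illustration of what these tests exercise, a minimal @font-face setup might look like the following. The font file paths and the `.ezra` class name are assumptions made for this sketch; the stylesheet actually used by this page may differ.

```css
/* Hypothetical paths: point these at wherever the font files are hosted. */
@font-face {
  font-family: "Ezra SIL SR";
  src: url("fonts/EzraSILSR.woff") format("woff"),
       url("fonts/EzraSILSR.ttf") format("truetype");
  font-weight: normal;
  font-style: normal;
}

/* Hypothetical class applied to a block of Hebrew test text. */
.ezra {
  /* Fall back to a generic serif if the web font fails to load. */
  font-family: "Ezra SIL SR", serif;
  /* Right-to-left base direction; a dir="rtl" attribute in the markup
     is generally the preferred way to declare this. */
  direction: rtl;
  font-size: 2em;
}
```

If the web font does not load, the text falls back to the generic serif font, which makes a failure easier to spot by comparing against the default browser font test above.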
Diacritical Mark Positioning Test
This test uses CSS @font-face with Ezra SIL SR.
Ezra SIL SR correctly supports cantillation marks (t'amim) and vowels (niqqudot). In this test, look carefully to see whether the niqqudot are correctly positioned under the letters and whether the dagesh is correctly positioned inside the letters (rather than offset). If the diacritics are correctly positioned, the browser PASSES this test (otherwise it FAILS).
CSS @font-face (Ezra SIL SR) without any diacritical marks
ותוצא הארץ דשא עשב מזריע זרע למינהו ועץ עשה פרי אשר זרעו בו למינהו וירא אלהים כי טוב
CSS @font-face (Ezra SIL SR) with diacritical marks (vowels only)
וַתּוֹצֵא הָאָרֶץ דֶּשֶׁא עֵשֶׂב מַזְרִיעַ זֶרַע לְמִינֵהוּ וְעֵץ עֹשֶׂה פְּרִי אֲשֶׁר זַרְעוֹ בוֹ לְמִינֵהוּ וַיַּרְא אֱלֹהִים כִּי טוֹב.
CSS @font-face (Ezra SIL SR) with diacritical marks (vowels and cantillation)
וַתּוֹצֵ֨א הָאָ֜רֶץ דֶּ֠שֶׁא עֵ֣שֶׂב מַזְרִ֤יעַ זֶ֙רַע֙ לְמִינֵ֔הוּ וְעֵ֧ץ עֹֽשֶׂה־פְּרִ֛י אֲשֶׁ֥ר זַרְעוֹ־ב֖וֹ לְמִינֵ֑הוּ וַיַּ֥רְא אֱלֹהִ֖ים כִּי־טֽוֹב׃
Default Rectangle Glyph Test
This test uses CSS @font-face with Miriam CLM.
Miriam CLM was chosen to test how browsers display a missing glyph. Miriam CLM does not contain the full set of Hebrew diacritical marks (it only contains the niqqudot) and its font logic should not support proper positioning of t'amim. In the example below, the text contains t'amim. The web browser may either indicate the missing t'amim with the (dreaded) Default Rectangle Glyph between the Hebrew letters, or simply show no indication of the missing diacritics. Alternatively, the browser might borrow the t'amim and positioning logic from another font -- possibly the default browser font set for Hebrew, indicated above.
CSS @font-face (Miriam CLM) with diacritical marks (vowels and cantillation)
וַתּוֹצֵ֨א הָאָ֜רֶץ דֶּ֠שֶׁא עֵ֣שֶׂב מַזְרִ֤יעַ זֶ֙רַע֙ לְמִינֵ֔הוּ וְעֵ֧ץ עֹֽשֶׂה־פְּרִ֛י אֲשֶׁ֥ר זַרְעוֹ־ב֖וֹ לְמִינֵ֑הוּ וַיַּ֥רְא אֱלֹהִ֖ים כִּי־טֽוֹב׃
Unicode Range Test
This test uses CSS @font-face with the Unicode range U+0590–05FF (Hebrew) assigned to the HaGilda Alef font and the Unicode ranges U+0000–007F and U+1E00–1EFF (Basic Latin and Latin Extended Additional) assigned to the Linux Biolinum font. (shalom ḥaverim means "hello friends".)
שָׁלוֺם חֲבֵרִים | shalom ḥaverim
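A sketch of how two fonts can share one family name and be selected per character with `unicode-range`. The family name "MixedScript", the `.mixed` class, and the file paths are assumptions for illustration only.

```css
/* Hebrew block: served by HaGilda Alef (hypothetical file path). */
@font-face {
  font-family: "MixedScript";
  src: url("fonts/HaGildaAlef.woff") format("woff");
  unicode-range: U+0590-05FF;
}

/* Basic Latin and Latin Extended Additional (which includes U+1E25,
   the h with dot below used in the transliteration): served by
   Linux Biolinum (hypothetical file path). */
@font-face {
  font-family: "MixedScript";
  src: url("fonts/LinuxBiolinum.woff") format("woff");
  unicode-range: U+0000-007F, U+1E00-1EFF;
}

/* Hypothetical class on the mixed Hebrew and Latin test line. */
.mixed {
  font-family: "MixedScript", sans-serif;
}
```

A browser that supports `unicode-range` should pick the font for each character from the matching rule, so the Hebrew and the transliteration appear in visibly different typefaces.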
Diacritic Color and Positioning Test
In the following test, Hebrew letters, diacritics, and punctuation are separated by <span> tags and colored blue, red, and green respectively as defined by classes. Are the diacritics visible? Are they a different color than the Hebrew letters? Are they positioned correctly?
שָׁלוֺם עוֺלָם׃
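The coloring described above could be expressed with three simple rules. The class names here are hypothetical; only the colors come from the description of the test.

```css
/* Assumed markup, with each character group wrapped in its own span, e.g.
   <span class="letter">&#x05E9;</span><span class="mark">&#x05B8;</span>
   ...<span class="punct">&#x05C3;</span> */
.letter { color: blue; }   /* consonants, U+05D0 to U+05EA */
.mark   { color: red; }    /* vowel points and other combining marks */
.punct  { color: green; }  /* e.g. sof pasuq, U+05C3 */
```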
Here's the same test as above except that the characters allowed in the classes are defined by a unicode-range.
שָׁלוֺם עוֺלָם׃
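One way such a setup might look is sketched below: the same font file is declared three times under different family names, each restricted to one group of Hebrew code points. The family names, class names, and file path are assumptions, and the stylesheet actually used for this test may be organized differently.

```css
/* Letters: alef through tav. */
@font-face {
  font-family: "HebrewLetters";
  src: url("fonts/HebrewFont.woff") format("woff"); /* hypothetical path */
  unicode-range: U+05D0-05EA;
}

/* Combining marks: cantillation accents, vowel points, rafe,
   shin and sin dots, upper and lower dots, qamats qatan. */
@font-face {
  font-family: "HebrewMarks";
  src: url("fonts/HebrewFont.woff") format("woff");
  unicode-range: U+0591-05BD, U+05BF, U+05C1-05C2, U+05C4-05C5, U+05C7;
}

/* Punctuation: maqaf, paseq, sof pasuq, nun hafukha, geresh, gershayim. */
@font-face {
  font-family: "HebrewPunct";
  src: url("fonts/HebrewFont.woff") format("woff");
  unicode-range: U+05BE, U+05C0, U+05C3, U+05C6, U+05F3-05F4;
}

.letter { font-family: "HebrewLetters", serif; color: blue; }
.mark   { font-family: "HebrewMarks", serif;   color: red; }
.punct  { font-family: "HebrewPunct", serif;   color: green; }
```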
Here's John Dyer's layering solution. Note that copy/pasting this text might result in more than one copy of the text.
שָׁלוֺם עוֺלָם׃
שָׁלוֺם עוֺלָם׃
שלום עולם׃
שלום עולם
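The general idea of layering can be sketched as follows: several copies of the same text are stacked on top of one another, each in its own color, so that the topmost copy containing a given character determines the color that shows. This is only an illustration under assumed class names and an assumed stacking order; it is not taken from John Dyer's code.

```css
/* Hypothetical wrapper holding the stacked copies of the text. */
.layers {
  position: relative;
  direction: rtl;
}

.layers > span {
  white-space: nowrap;
}

/* The first copy stays in normal flow and gives the wrapper its size;
   every later copy is placed exactly on top of it. */
.layers > span + span {
  position: absolute;
  top: 0;
  right: 0;
}

/* Later elements paint over earlier ones. */
.layers > .all     { color: red; }    /* bottom: letters, marks, punctuation */
.layers > .punct   { color: green; }  /* middle: letters and punctuation only */
.layers > .letters { color: blue; }   /* top: letters only */
```

With this stacking, the letters show in blue from the top copy, the punctuation shows in green from the middle copy, and the diacritics show in red from the bottom copy. The copies only line up if the diacritics do not change the width of the text, which depends on the font.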
Here's the same solution, but applying unicode-range in @font-face to define and separate letters, vowels, and punctuation.
שָׁלוֺם עוֺלָם׃
שָׁלוֺם עוֺלָם׃
שלום עולם׃
שלום עולם
Directly associating a color with a unicode-range in @font-face will have to wait for the next iteration of CSS (if the proposal to enable this is accepted). Unfortunately, for now, I can't just put the separate classes for letters, vowels, and punctuation together in one span. The last style defined will always take precedence. (Let me know if you know a trick to make this work though.)
SVG Test
In this test, the SVG 1.1 file on the left invokes CSS 2 to render the Hebrew word סידוּר with the Miriam CLM font.
On the right is the same SVG file with the text converted to paths. Both SVG files should match each other. If the font in the SVG image does not match the reference SVG on the right, the browser fails.