Word of the Day

Wednesday, March 28, 2007

Bidirectional: Hebrew, Arabic and English in the Same HTML Document

I have been looking for an easy way to create a multilingual HTML document that mixes languages with different writing directions. Here are my latest findings in the brave new world of multilingual computing.

1. The dir attribute of HTML tags
HTML tags accept a dir (text direction) attribute, whose value is either "rtl" (right to left) or "ltr" (left to right). The example below is a paragraph written in Hebrew, an rtl language.

אני רוצה ללמוד עברית.


The markup that yields this result is as follows:
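A minimal sketch of such markup, assuming a plain p element in a UTF-8 encoded file (both are assumptions made for illustration, not necessarily the exact listing used for the example above):

```html
<!-- dir="rtl" sets the base direction of this paragraph to right-to-left -->
<p dir="rtl">אני רוצה ללמוד עברית.</p>
```

With dir="rtl" the paragraph is also right-aligned by default, and the final period is placed at the left end of the line.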


2. Mixing text directions in the same line
Placing rtl and ltr characters in the same block can confuse the browser. A word whose text direction differs from that of the enclosing block (such as a table or a paragraph) should be explicitly given the proper direction.

I went to ירושליים‎, יפו‎ and חייפה‎.

Each Hebrew character string has an invisible Unicode control character right after it. This character, U+200E, called the LEFT-TO-RIGHT MARK (LRM), behaves like a zero-width ltr letter, so the punctuation that follows each Hebrew word stays in ltr order within the ltr block:
I went to ירושליים‎, יפו‎ and חייפה‎.

The order of the characters in memory is shown below from left to right (the transliterated place names stand for the Hebrew script, and LRM stands for U+200E):
I went to JerušalajmLRM, JafoLRM and ḤejfaLRM.
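Because the mark itself is invisible, it can be easier to write it as a named character reference. The sketch below assumes a plain p element and uses &lrm;, the standard HTML reference for U+200E, in place of the literal character:

```html
<!-- &lrm; is the named character reference for U+200E (LEFT-TO-RIGHT MARK) -->
<p dir="ltr">I went to ירושליים&lrm;, יפו&lrm; and חייפה&lrm;.</p>
```

The browser treats &lrm; exactly like the literal control character, but the reference stays visible in any text editor.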

English words in a Hebrew block require another control character, U+200F, the RIGHT-TO-LEFT MARK (RLM). It is used to keep ltr strings in place within rtl blocks.

הייתי ב-Washington‏, Boston‏ ו-Chicago‏.


The markup is:
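A minimal sketch, assuming a p element with dir="rtl" and the named character reference &rlm; (the standard HTML reference for U+200F) standing in for the invisible mark:

```html
<!-- &rlm; is the named character reference for U+200F (RIGHT-TO-LEFT MARK) -->
<p dir="rtl">הייתי ב-Washington&rlm;, Boston&rlm; ו-Chicago&rlm;.</p>
```

Writing the literal U+200F character instead of &rlm; should render the same way; the reference is only easier to see while editing.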

The memory order is shown conceptually below. Despite the natural right-to-left direction of Hebrew, it is written from left to right, following the same convention as above. RLM stands for U+200F.
Hajiti be-WashingtonRLM, BostonRLM we-ChicagoRLM.


N.B.
For comfortable editing of bidirectional HTML documents, you might want a text editor with strong Unicode support. I find SC UniPad a good option, as it lets users view the often invisible RLM and LRM characters.

You can save the example code as an HTML file and test it by opening the file in Firefox.
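For such a test the file needs to declare its character encoding, or the Hebrew letters may be decoded incorrectly. A minimal page skeleton might look like the sketch below; the title and the choice of UTF-8 are assumptions made for illustration:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Declare the encoding so the Hebrew letters are decoded correctly -->
  <meta charset="utf-8">
  <title>Bidirectional text test</title>
</head>
<body>
  <!-- Paste the example paragraphs here, for instance: -->
  <p dir="rtl">אני רוצה ללמוד עברית.</p>
</body>
</html>
```

Save the file in the same encoding that the meta element declares (UTF-8 here) before opening it in the browser.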
