*** MOVED ***

NOTE: I have merged the contents of this blog with my web-site. I will not be updating this blog any more.

2007-05-26

Indic Scripts and Linux

If you have the fonts for Indic scripts installed (for example, the Lohit fonts), Firefox on Linux can display the Devanagari text on sites like BBC Hindi and Google News in Hindi. (Devanagari is the primary writing system for languages such as Hindi and Sanskrit.) However, if you are using the builds released by mozilla.com, you will notice that the matras (diacritics) are not combined properly with the consonants to form the correct ligatures. For example, the word "हिन्दी" ("Hindi") itself is not rendered properly. Konqueror does not suffer from such problems.
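To see why this word needs more than simple left-to-right glyph placement, here is a small Python sketch that lists the Unicode code points in "हिन्दी" in the order they are stored. The helper names describe and count_marks are made up for this illustration. Note that the vowel sign I (U+093F) is stored after the letter HA but is drawn to its left, and that the virama (U+094D) joins NA and DA into a conjunct; reordering and joining of this kind is what a complex text layout engine takes care of.

```python
import unicodedata

# The word "Hindi" in Devanagari, written with escapes so that the
# stored (logical) order of the code points is explicit.
word = "\u0939\u093f\u0928\u094d\u0926\u0940"


def describe(text):
    """Return (code point, Unicode name) pairs in stored order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def count_marks(text):
    """Count combining marks (general categories Mn, Mc, Me) in the text."""
    return sum(1 for ch in text if unicodedata.category(ch).startswith("M"))


if __name__ == "__main__":
    for code_point, name in describe(word):
        print(code_point, name)
    print("combining marks:", count_marks(word))
```

Half of the six code points are combining marks, so a renderer that simply draws one glyph per code point from left to right cannot produce the correct shape.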

It turns out that the officially released builds of Firefox do not support complex text layout because they are not built with Pango (Firefox 3 will support it by default). You have to either compile Firefox yourself from the source with Pango support enabled using --enable-pango, or use a build that has Pango enabled, for example, the builds provided by the Fedora Project. (Setting the environment variable MOZ_ENABLE_PANGO to "1" had no effect for me with Firefox 2.0.0.3.)
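Firefox lists the options it was built with on its "about:buildconfig" page. As a rough sketch, the following Python snippet checks whether a configure-arguments string copied from that page contains --enable-pango. The function name pango_enabled and the sample strings are made up for this illustration and are not taken from any real build.

```python
def pango_enabled(configure_args: str) -> bool:
    """Return True if --enable-pango appears among the configure arguments.

    configure_args is assumed to be the text copied by hand from the
    "about:buildconfig" page of the Firefox build being checked.
    """
    # Split on whitespace and drop any quotes wrapped around an argument.
    tokens = [token.strip("'\"") for token in configure_args.split()]
    return "--enable-pango" in tokens


if __name__ == "__main__":
    # Hypothetical argument strings, for illustration only.
    with_pango = "--enable-application=browser --enable-pango"
    without_pango = "--enable-application=browser"
    print(pango_enabled(with_pango))
    print(pango_enabled(without_pango))
```

This only looks for the flag in the text; a build can still be misconfigured in other ways, so treat a positive result as a hint and check how a page in Hindi actually renders.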

On Fedora Core 6 (FC6), it is very simple to get this working:
  1. Install the fonts for the Indic scripts you are interested in. For example, "sudo yum install fonts-hindi", "sudo yum install fonts-malayalam", "sudo yum install fonts-kannada", etc.
  2. Install a Firefox build for Fedora using "sudo yum install firefox". Note that FC6 installs Firefox 1.5 by default - if you prefer Firefox 2.0 instead, you can install it using "sudo yum --enablerepo=development install firefox".


By the way, I recently came across Omniglot, a site about the writing systems of almost all known human languages, existing or extinct, naturally-evolved or artificially-created. I found it extremely fascinating and insightful. For example, I did not know that Devanagari is considered to be not an "alphabet" but an "abugida". Check out the International Phonetic Alphabet (IPA), which can represent the sounds of almost all spoken languages. How about Loglan (and its freer derivative, Lojban), which claims to be a "logical" language? (I first came across the IPA on Wikipedia, where it is used to provide the pronunciation for some terms. xkcd is where I first read about Lojban.)

2 comments:

  1. Hi! Is there any OTHER browser one can use to render Devanagari (other than Konqueror, which is very limited) in Linux distributions that don't support Complex Scripts? I have this problem with PCLinuxOS 2007, a distribution that I love, but in which Firefox (and also Opera) doesn't show Hindi pages correctly. Konqueror does show them OK, but it doesn't seem to show the Hindi transliteration button in Blogger, for instance. I don't have this problem with Ubuntu - once one enables Hindi support in 'Preferences', Firefox shows Hindi pages perfectly. But I would like to have this feature activated in PCLinuxOS too...

  2. Gaurav: Is there a build of Firefox available for PCLinuxOS separate from the default builds provided by Mozilla that has Pango enabled? (Check "about:buildconfig" to know the options used to build your version of Firefox.)

    For example, on Fedora 7, I have switched to using the Firefox build provided by the Fedora team rather than that provided by Mozilla.

    Recent versions of Konqueror (e.g. that with KDE 3.5.7) seem quite capable. If the Hindi transliteration button doesn't show up in Blogger with Konqueror, you might want to take up this issue with the Blogger support folks.
