August 15, 2005

Estonian and Icelandic

I have a large spreadsheet containing data on what percentage of the Internet population speaks which language. If I tell it which localisations we have in preparation for Firefox 1.5, it tells me we will be able to provide software in the native language of 95% of the Internet population. This compares with 92% for Firefox 1.0.

If I then tell it which localisations have registered projects for Firefox, it tells me that when all of them produce a language pack, we'll have covered 99.93% of the Internet population. Estonian and Icelandic are the only two languages in my list that we don't have a localisation team for.

(The 99.93% figure isn't actually strictly accurate, because my data excludes a large number of languages whose populations are too small to register individually, but collectively may make up a significant proportion of the net population. But hey, if they speak Catalan, Basque, Irish, Gujarati, Armenian, Macedonian, Mongolian, Albanian, Afrikaans, Asturian, Belarusian, Lithuanian, Frisian, Kinyarwanda, Khmer, Singhalese or Welsh, we've got them covered anyway.)
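
For anyone curious how a figure like this falls out of the data, here is a minimal sketch of the arithmetic in Python. It is not the spreadsheet itself: the language shares below are made-up placeholders, and the names `LANGUAGE_SHARE`, `coverage` and `uncovered` are hypothetical, chosen only for illustration. The idea is simply to add up the shares of the languages that have a localisation and divide by the total of all listed shares.

```python
# Hypothetical shares of the Internet population by language, in percent.
# These are placeholder numbers, NOT the data behind the figures in the post.
LANGUAGE_SHARE = {
    "en": 35.0,
    "zh": 14.0,
    "ja": 9.0,
    "es": 8.0,
    "de": 7.0,
    "et": 0.05,
    "is": 0.02,
}


def coverage(shares, localised):
    """Return the percentage of the listed population whose language is in `localised`."""
    total = sum(shares.values())
    covered = sum(share for lang, share in shares.items() if lang in localised)
    return 100.0 * covered / total


def uncovered(shares, localised):
    """Return the listed languages that have no localisation, sorted by code."""
    return sorted(lang for lang in shares if lang not in localised)


if __name__ == "__main__":
    in_preparation = {"en", "zh", "ja", "es", "de"}
    print(f"Coverage: {coverage(LANGUAGE_SHARE, in_preparation):.2f}%")
    print("Missing:", uncovered(LANGUAGE_SHARE, in_preparation))
```

Note that the result is only a percentage of the languages that appear in the table, which is exactly the caveat above: languages too small to be listed individually are invisible to this kind of calculation.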

Kudos go to the Mozilla Localisation Project staff for a truly Herculean effort of coordination and management.

Posted by gerv at August 15, 2005 09:21 AM
Comments

Basque? Impressive.

Posted by: Axord at August 15, 2005 09:58 AM

This is a bit off topic, but I'm curious as to why Mozilla uses the 2 letter codes instead of the 3 letter ones. I'm not a linguist, but my father is and according to him, the 3 letter codes have been the standard in linguistics for quite a while already.

Posted by: Alan Trick at August 15, 2005 10:21 AM

There are 3 million Basques, of whom at least 10% can write the language.

Posted by: Jerome Lacoste at August 15, 2005 10:36 AM

Estonian should be in progress?!
I mailed the person who translated Seamonkey, and he said he's working on it. (OK, it was months ago.)

This should be the site address - http://mozilla.gf.ttu.ee/ (Currently down)

Posted by: Jers at August 15, 2005 11:11 AM

Eh. Between 3000 and 5000 languages out there. Who's going to translate it into Burushaski? And Klingon?-)

Posted by: Daniel Glazman at August 15, 2005 12:15 PM

Alan: the two letter codes are standard on the Net for things like Accept headers and so on. In fact, we've recently changed from all locales having both a language and a country code (en-US, da-DK, zh-CN) to some just having a country code - a backwards step for understandability and neatness, in my view.

Posted by: Gerv at August 15, 2005 01:19 PM

Daniel, how many of those languages have fonts? Or, how many of those are in Unicode? I recall that Klingon, at least, is not in the Unicode foo that moz supports. Whichever that is, what we call UTF-16.

Posted by: Axel Hecht at August 15, 2005 01:27 PM

> we've recently changed from all locales having both a language and a country
> code (en-US, da-DK, zh-CN) to some just having a country code
> a backwards step for understandability and neatness, in my view.

That's probably an anglo-saxon view, based on having en-GB vs. en-US, but even this needs to be localized ;-) In fact, for languages somewhat standardized across several countries like German or French, the new scheme is far better. Because it is not just a country code as you say, it is the reverse: just a *language* code.

For example, in the French l10n team we have contributors from France, Québec (Canada), Belgium, Switzerland, and so on. Saying that we are working on "Generic French" instead of "French as it is spoken in France" certainly helps to ease certain susceptibilities. Not only inside our team, but also for our international users.

Posted by: Benoit at August 15, 2005 09:52 PM

"to some just having a country code"

Actually, just the opposite - many locales have only a language code, when the language is spoken in just one country or when there's no significant difference between speakers of that language in different countries (e.g. Polish [formerly 'pl-PL', now just 'pl'] is spoken in Poland, some of the neighbouring countries, parts of Russia, Kazakhstan and the USA, but there's no need for a separate pl-RU, pl-KZ or pl-US l10n project).

Posted by: marcoos at August 15, 2005 10:46 PM

If Mac users wish to translate Camino into those languages, they are welcome to join the caminol10n.mozdev.org team.

Posted by: Ludovic Hirlimann at August 16, 2005 03:29 PM

Gerv, where do you have that spreadsheet from? I'd be very interested in this data - would you perhaps post a link?

Posted by: jens.b at August 23, 2005 08:21 PM

jens.b: The spreadsheet is based on the data from here.

Posted by: Gerv at August 26, 2005 12:28 PM