Vernacular localization: Tricky issue

For a foreigner living in Germany there are several cultural and vernacular differences that you notice straight away, and several more subtle ones that take a while for you to realize, even though they have been staring you in the face the whole time.

Something that is important to get correct from the start is time-keeping. 7:30 is called “halb acht” in German, which translates as “half eight”. While it first appears that the Germans are calling it half to eight instead of half past eight, the reality is they’re saying half of eight. This allows constructs like “drei viertel acht” or “three quarters of eight” for 7:45, and “fünf vor halb acht” or “five to half of eight” for 7:25. This seems perfectly normal in Germany. Eventually. There are several other time-based peculiarities in German, such as the word for “tomorrow” being the same as the word for “morning” (“morgen”), so that if you want to say “tomorrow morning”, you have to say “early tomorrow” (“morgen früh”), whereas if you want to say “this morning”, you have to say “today morning” (“heute morgen”).
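The "half of eight" logic is easier to see written down as a rule: the phrase always names the hour that is coming, not the one that has started. Here is a minimal Python sketch covering only the three phrases mentioned above; the function name `german_time` and the numeric fallback are my own assumptions for illustration.

```python
# German names for the hours on a 12-hour clock face.
HOURS = ["eins", "zwei", "drei", "vier", "fünf", "sechs",
         "sieben", "acht", "neun", "zehn", "elf", "zwölf"]

# Only the constructs described in the text; anything else falls through.
PHRASES = {25: "fünf vor halb", 30: "halb", 45: "drei viertel"}


def german_time(hour, minute):
    """Return the colloquial phrase, or plain H:MM when no phrase is known."""
    # The phrase names the hour that is coming: 7:30 is "half of eight".
    # Hour number (hour % 12) + 1 sits at zero-based index hour % 12.
    coming = HOURS[hour % 12]
    if minute in PHRASES:
        return f"{PHRASES[minute]} {coming}"
    return f"{hour}:{minute:02d}"


print(german_time(7, 30))  # halb acht
print(german_time(7, 45))  # drei viertel acht
print(german_time(7, 25))  # fünf vor halb acht
```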

Date handling in Germany is standardized in dd.mm.yyyy format instead of dd-mm-yyyy or dd/mm/yyyy. At least it’s not mm/dd/yyyy, which is the bane of my life. Anytime I see 4/6/2010 I have to look for more dates on the same page or document until I find one where the first or second element is greater than 12, or I will have no idea whether it’s an April or June date.
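The 4/6/2010 problem can be made concrete with a small check. The sketch below is in Python, and the helper names `interpretations` and `is_ambiguous` are hypothetical. It tries both readings of a slash-separated date and reports whether they disagree.

```python
from datetime import date


def interpretations(text):
    """Return every valid reading of 'a/b/yyyy', keyed by the assumed format."""
    first, second, year = (int(part) for part in text.split("/"))
    readings = {}
    for label, day, month in (("dd/mm", first, second), ("mm/dd", second, first)):
        try:
            readings[label] = date(year, month, day)
        except ValueError:
            # This reading is impossible, e.g. a month greater than 12.
            pass
    return readings


def is_ambiguous(text):
    """True when the two readings are both valid and name different days."""
    return len(set(interpretations(text).values())) > 1


print(is_ambiguous("4/6/2010"))   # True: 4 June or 6 April
print(is_ambiguous("14/3/2010"))  # False: only dd/mm is possible
```

This is exactly the manual search described above: an element greater than 12 rules out one of the two readings.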

When the number 100.001 is written in English-speaking locales it (usually?) means “one hundred point zero zero one”, whereas in Germany it means “one hundred thousand and one”. 100,001 in Germany means “one hundred point zero zero one”. For a reason I can’t grasp, the German and English conventions evolved to use exactly opposite characters for thousands separation and decimal separation.
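Because the two conventions are exact mirror images, one can be produced from the other by swapping the two characters. A rough Python sketch, assuming a hypothetical `format_number` helper and the short style names "en" and "de" (a real application would ask its locale library instead):

```python
def format_number(value, decimals, style):
    """Format value with English or German separators."""
    # Python's format spec produces the English convention: 1,234.50
    text = f"{value:,.{decimals}f}"
    if style == "de":
        # Swap both characters in one pass so neither overwrites the other.
        text = text.translate(str.maketrans(",.", ".,"))
    return text


# The same digits and punctuation mean two different numbers:
print(format_number(100001, 0, "de"))   # 100.001
print(format_number(100.001, 3, "en"))  # 100.001
```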

Something I didn’t notice until this week, though, is that currency formatting differs between Ireland and Germany. In Germany the Euro symbol is put after the amount (except in a badly localized Flash ad that’s there now), but in Ireland the symbol is in front of the amount.
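As a small illustration of the symbol placement, here is a Python sketch with a hypothetical `format_euro` helper. The plain space before the symbol in the German form is an assumption made for readability; real locale data may use a non-breaking space.

```python
def format_euro(amount, locale_name):
    """Place the Euro symbol the Irish or the German way."""
    english = f"{amount:,.2f}"  # e.g. 1,234.50
    if locale_name == "en_IE":
        return f"€{english}"
    if locale_name == "de_DE":
        # German swaps the separators and puts the symbol after the amount.
        german = english.translate(str.maketrans(",.", ".,"))
        return f"{german} €"
    raise ValueError(f"unsupported locale: {locale_name}")


print(format_euro(1234.5, "en_IE"))  # €1,234.50
print(format_euro(1234.5, "de_DE"))  # 1.234,50 €
```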

Literal or near-literal translation brings more entertainment. Most German people I know have a Chef that they see most days of the week, but who does not cook for them. How’s that for a false friend?

Although most simple sentences similar to “I am hungry” can be translated to “Ich bin hungrig”, it is not correct to translate “I am cold” as “Ich bin kalt”, because that means “I am a cold-hearted person”. Instead it must be “Mir ist kalt”. In the same vein, “How are you” – “I’m good” doesn’t work in German. “Ich bin gut” means “I’m a morally good person”, which is a non sequitur in this context. The answer has to be “Mir ist gut” or “Mir geht’s gut”. I also found out that when a waiter asks if I’d like another beer, and I don’t, “No, I’m Ok” doesn’t translate to German very well. Sometimes these things are discovered when the reply is laughter and a “Yeah, we think you’re ok too, Steve…”.

My old favorite German term is “krass”, which is pretty much an emphasizer. I usually hear it used negatively, but it can be used positively too. The weather can be krass. A movie can be “krass lustig”. The best (“krasste”?) moment for krass was when a friend of mine described a particular situation as being “krass krass”. My new favorite German phrase is “Der Hammer!”. Der Hammer is the best thing there is. In en_US that would be ‘Da Bomb!’ I guess.

On top of all the peculiarities and differences in expression, native speakers of English living in or visiting Germany are called “native speakers”. To me “native” means “indigenous to the current geography”, so I still find that a bit weird.

It’s not that English doesn’t have its fair share of peculiarities. Why does inflammable mean flammable, for a common example? Even within English there are wild deviations from the norm. That’s a great article by the way. If you want to know what “long monophthongs are often diphthongized, and while some diphthongs are triphthongized” means, take a break from the MeeGo conference and go to Moore Street for “Bananids five for a pow-und”. A notable omission from the Lexicon on the page though is ‘fierce’, which translates to German as approximately ‘krass’. With all that, it’s not uncommon for natives of the Queen’s English to have problems understanding Hiberno-English, and not just because of the accent. Check this video (with NSFW audio) to hear more from the Wexford Lexicon without the accent and see if you can make any sense of it.

The point is that localization issues are things we encounter every day, within and without computer systems, but they have to be handled in computer systems too.

Mann, eh?

Abstract localization: Tricky issue

For Qt developers the first thing localization means is a choice. Do the Qt built-in localization facilities meet the application’s needs? Would gettext be different? Can KLocale be used instead? The answer to all three is, of course, Yes. Qt localization is extensive enough for most Qt developers who don’t target very uncommon locales. KLocale is a more powerful localization system, and is part of the KDE platform, which can be used without many extra dependencies beyond QtCore and gettext, which it extends.

Of course, the KLocale and Qt localization systems are incompatible: they use different conventions in strings for plurals and different forms of catalogs, so a Qt developer has to pick one up front. For developers of libraries like Grantlee, that brings a need for abstraction. Grantlee now features an AbstractLocalizer which can be implemented to support any localization system. I’ve already implemented it for QLocale and for KLocale. That means that Grantlee templates can now be localized easily in both Qt applications and KDE applications.
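To show the shape of that abstraction without pulling in Qt, here is a rough Python analogue. It is not Grantlee's actual C++ interface: the class and method names below are assumptions made for illustration. The idea is only that rendering code talks to one abstract interface, and each localization system gets its own implementation behind it.

```python
import gettext
from abc import ABC, abstractmethod


class AbstractLocalizer(ABC):
    """Hypothetical pluggable localizer; not Grantlee's real API."""

    @abstractmethod
    def translate(self, text): ...

    @abstractmethod
    def translate_plural(self, singular, plural, n): ...


class GettextLocalizer(AbstractLocalizer):
    """Backend that delegates to a gettext translations object."""

    def __init__(self, translations=None):
        # NullTranslations returns every message unchanged.
        if translations is None:
            translations = gettext.NullTranslations()
        self._translations = translations

    def translate(self, text):
        return self._translations.gettext(text)

    def translate_plural(self, singular, plural, n):
        return self._translations.ngettext(singular, plural, n)


class DictLocalizer(AbstractLocalizer):
    """Backend using an in-memory dict as a stand-in for another catalog format."""

    def __init__(self, catalog):
        self._catalog = catalog

    def translate(self, text):
        return self._catalog.get(text, text)

    def translate_plural(self, singular, plural, n):
        source = singular if n == 1 else plural
        return self._catalog.get(source, source)


def render_heading(localizer, count):
    """Rendering code sees only the abstract interface."""
    title = localizer.translate("Address book")
    people = localizer.translate_plural("%1 person", "%1 people", count)
    return f"{title}: {people.replace('%1', str(count))}"


german = DictLocalizer({"Address book": "Adressbuch", "%1 people": "%1 Personen"})
print(render_heading(GettextLocalizer(), 2))  # Address book: 2 people
print(render_heading(german, 3))              # Adressbuch: 3 Personen
```

The `render_heading` function never needs to know which backend it was handed, which is what lets a library stay neutral between localization systems.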

We have a joke in the Grantlee community that there are so many examples distributed in Grantlee for text editing, code generation, etc. that KDE is almost obsoleted (har-har). To test out the l10n feature I’ve added an address book application:

There are several important things to note here:

I suck at creating good-looking HTML

Strings are translated

Strings with plural forms are translated

Numbers are localized properly. The German version uses comma for decimal separation.

Money is localized properly. The German version puts the Euro symbol on the right.

Dates are localized properly. The en_US version is wrong, and the German version uses dots for separators.

The localization supports all of the common localization needs for application development:

Using _() to localize strings: {{ _("Table of contents") }}

Using _() to localize anything: {{ _(some_date_variable) }}, where some_date_variable is a QDate, outputs 14/3/2010 in en_GB, 3/14/2010 in en_US, and 14.3.2010 in de_DE. Numbers like {{ _(100001) }} are also localized properly.

Using i18n for argument substitution: {% i18n "Hello %1, Welcome to %2" username website_name %}

Using i18nc for disambiguated localization: {% i18nc "Name of a person" "Name" %}, {% i18nc "Name of a category" "Name" %}

Using i18np for plurals: {% i18np "%1 person" "%1 people" numPeople %}

Using i18ncp for disambiguated plurals: {% i18ncp "The amount of people logged in" "%1 person" "%1 people" numPeople %}
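For anyone who hasn't met these tag names before, the following Python sketch mimics what the tags do with their arguments. These are hypothetical stand-ins, not the Grantlee implementation: the catalog is a plain dict keyed by (context, text), its German entries are illustrative only, and the plural rule is hard-coded to the English "n == 1" case, whereas a real catalog supplies the rule for each language.

```python
import re

# Illustrative catalog: the same source text translated per context.
CATALOG = {
    ("Name of a person", "Name"): "Name",
    ("Name of a category", "Name"): "Bezeichnung",
}


def substitute(template, *args):
    """Replace %1, %2, ... with the positional arguments."""
    # Matching the whole number keeps %1 from clobbering part of %10.
    return re.sub(r"%(\d+)", lambda m: str(args[int(m.group(1)) - 1]), template)


def i18n(text, *args):
    return substitute(CATALOG.get((None, text), text), *args)


def i18nc(context, text, *args):
    # The context string picks the right translation; it is never displayed.
    return substitute(CATALOG.get((context, text), text), *args)


def i18np(singular, plural, n):
    # English-only plural rule, assumed for this sketch.
    return i18n(singular if n == 1 else plural, n)


print(i18n("Hello %1, Welcome to %2", "Steve", "example.org"))
print(i18nc("Name of a category", "Name"))   # Bezeichnung
print(i18np("%1 person", "%1 people", 5))    # 5 people
```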