Content localization, and with it configuring a product's interface language so that each user sees the right language, is extremely important for every digital platform. That's why we decided to translate and share this expert article by Nicolai Goshin from Hellicht Medien.

We hope that some of its strategic points will prove valuable for your localization projects!

Background and preliminary considerations

Digital projects targeting audiences in different countries or language areas cannot do without a localization strategy. So we must answer the following question: which users should be given which content in which languages? At first sight, the question seems simple. Later in this article, we will point out why this topic is in fact complex and, of course, how to deal with this complexity.

Let's assume a scenario in which content (for example, an online magazine) is available in three languages: German, English, and Arabic. The goal is ideally to provide content to each user in their native language. If this is not possible, the content should be provided to the user in the language that they best understand apart from their mother tongue.

Please note, this is an expert article. In what follows, we will take a strategic and technical deep dive into the subject. So if you would like to stop reading now, there will be no hard feelings. Otherwise, prepare to buckle up: we are about to begin!

Before we dig deeper into the topic of localization, there are two mechanisms that we need to understand at the outset. The first is the browser's language setting, and the second is the user's IP address.

Browser language setting

Every time a website is requested, the web browser automatically sends its language setting to the server in the HTTP Accept-Language header. The user can configure this language in the browser settings; the default is the language of the operating system. It is important to know that the majority of users are not aware that they are able to change the language. Each language code typically consists of two parts: the language itself and the region. A browser set up for Germany typically sends de-de (German, Germany), one for Austria sends de-at (German, Austria), and one for the US sends en-us.

In addition, the user can specify a list of languages in order of preference, for example: en-us, en, de. In this case, the user's first choice is US English, region-independent English is their second choice, and region-independent German is their least preferred option.
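The preference list described above can be read on the server side. Below is a minimal sketch in Python, assuming the list arrives as a raw Accept-Language header string in which optional q values express the weighting; the function name parse_accept_language is ours and only illustrative.

```python
from typing import List


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags of an Accept-Language header, most preferred first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag without an explicit q value has weight 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not wanted"
        if quality > 0:
            weighted.append((-quality, position, tag.strip().lower()))
    # Sort by descending weight; equal weights keep their original order.
    return [tag for _, _, tag in sorted(weighted)]


print(parse_accept_language("en-US,en;q=0.9,de;q=0.8"))  # ['en-us', 'en', 'de']
```

Tags are lowercased here so that they compare easily with the codes used in this article (de-de, en-us).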

User's IP address

The IP address ("IP" for short) is the user's "address on the Internet". It is an assigned number that can be used to identify the user on the web, and it carries information about their location. For example, the IP address lets you determine the country from which a visitor accesses the website. This is possible because specific IP ranges are assigned to individual countries. For example, the IP addresses in the range between 2.16.240.0 and 2.16.255.255 are assigned to Germany. If a user has the IP address 2.16.245.100, we know that this person is accessing the Internet from Germany.
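As a rough illustration of such a range lookup, here is a small Python sketch using the standard ipaddress module. The single hard-coded range is the one from the example above, written as the equivalent CIDR block 2.16.240.0/20. The table COUNTRY_RANGES and the function country_for_ip are hypothetical names; a real project would load its ranges from a maintained geolocation database rather than hard-code them.

```python
import ipaddress
from typing import Optional

# Illustrative data only: the range 2.16.240.0 to 2.16.255.255 from the text,
# expressed as a CIDR block and mapped to Germany as in the article's example.
COUNTRY_RANGES = {
    "DE": [ipaddress.ip_network("2.16.240.0/20")],
}


def country_for_ip(address: str) -> Optional[str]:
    """Return the country code whose range contains the address, if any."""
    ip = ipaddress.ip_address(address)  # raises ValueError for invalid input
    for country, networks in COUNTRY_RANGES.items():
        if any(ip in network for network in networks):
            return country
    return None


print(country_for_ip("2.16.245.100"))  # DE
print(country_for_ip("2.16.239.255"))  # None (just below the range)
```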

It should be noted that there are other methods that can be used to determine the user's location. However, we will omit them at this point, since they ultimately provide the same information as the IP address does.

Thus, we now know that there are two sources from which information about the user's language or location (country) can be retrieved. Next, we'll look at how we can use this information for localization, meaning adapting content to different languages.

Linguistic localization

The simplest and most common form of localization is a linguistic one, which is based on the browser's language setting. This method assumes that the user has set the desired language in their browser preferences.

In Germany, most browsers send de-de, de, and en, in that order. This combination implies that German-language content for Germany (de-de) is preferred. If no such content is available on a particular website, German content from any other region will be used next, even if it is not specific to Germany (de). If no German-language content is available at all, the final alternative, English (en), will be used.

In the scenario that we described in the introduction (an online magazine with German, English, and Arabic versions), all users whose primary language code begins with de should receive the German-language content. In other words, these are all users whose primary language is de-de, de-at, de-ch, de, and so on.

For users who understand English or Arabic, the situation is a bit more complicated. While the German-speaking countries (grouped in the so-called DACH region) all border each other geographically, the same is not true of English- or Arabic-speaking countries. For example, English is spoken in the US, England, and Australia. In addition, in most parts of the world English is the language that people understand best after their mother tongue. This is why it is often specified as the secondary language in browser settings.
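The fallback chain can be sketched in a few lines of Python. This is a minimal illustration, assuming the site offers only the three region-independent languages from our scenario; AVAILABLE and pick_language are names chosen for this example.

```python
from typing import List, Optional

AVAILABLE = ["de", "en", "ar"]  # the languages offered in our scenario


def pick_language(preferred: List[str], available: List[str] = AVAILABLE) -> Optional[str]:
    """Return the first available language that matches the user's preferences."""
    for tag in preferred:
        tag = tag.lower()
        if tag in available:
            return tag
        base = tag.split("-")[0]  # drop the region: "de-at" becomes "de"
        if base in available:
            return base
    return None  # nothing matched; the caller decides on a default


print(pick_language(["de-de", "de", "en"]))  # de
print(pick_language(["fr-fr", "fr", "en"]))  # en
```

Because the region is dropped when no exact match exists, de-de, de-at, and de-ch all resolve to the German version.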

So, if we configured the website localization based solely on the browser language in our described scenario, users from the US and Australia would receive our English-language content. Users from Egypt and the United Arab Emirates would see Arabic content. So far, so good.

Disadvantages of localization based on the browser language setting

This type of language determination becomes problematic if the language set in the browser does not correspond to the user's native language. This can be the case, for example, when a German-speaking user works in Germany at an international company where the operating system and, by default, the browser are set to English (en). This user would then see English-language content, even though their native language is German.

A similar problem arises in countries where the official language or language of business is generally English, but where the population speaks a different language. This is the case, for example, in the United Arab Emirates.

IP-based geographic localization

The disadvantages of linguistic localization are in part offset by IP-based localization. Under the latter method, the language is determined based on the country from which the user is accessing the Internet.

At first glance, IP-based localization seems like a watertight solution, because it resolves the case described above, where the language configured in the browser differs from the user's native language. With this method, a user in Germany always receives German-language content, even if their browser is set, for example, to English as the primary language.
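In code, this approach boils down to a lookup from country to language. The following Python sketch is only illustrative: the mapping COUNTRY_LANGUAGE, the function language_for_country, and the English default for unlisted countries are assumptions made for this example, and assigning a single language to a multilingual country such as Switzerland is a deliberate simplification.

```python
# Illustrative mapping from ISO country codes to the site's languages.
COUNTRY_LANGUAGE = {
    "DE": "de", "AT": "de", "CH": "de",
    "US": "en", "AU": "en",
    "EG": "ar", "AE": "ar",
}


def language_for_country(country_code: str, default: str = "en") -> str:
    """Pick the content language from the country alone, ignoring the browser."""
    return COUNTRY_LANGUAGE.get(country_code.upper(), default)


print(language_for_country("DE"))  # de, whatever the browser language is
```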

Disadvantages of IP-based localization

So, is IP-based localization a panacea? Whoever thinks so is wrong. The underlying assumption is that all users who are in a single country are native speakers of its language. And that is, of course, far from the reality. Someone who is in Germany, but speaks English only, for example, would see all web content in German, even though the site is also available in that person's native language.

Finally, IP-based localization ignores the browser language setting and relies on location exclusively. We run into this flaw, for example, when we surf the Internet while on vacation and fail to see any content in our mother tongue. Instead, the web pages are rendered only in the language of the country we are visiting.

Combined localization

To arrive at a better solution, the two approaches outlined above can be merged so that we handle these borderline cases better. We mean those cases where we should not rely solely on the IP address or solely on the browser language. As previously described, this applies to users who are not native speakers of the language of the country they are in, and to users with misconfigured browser language preferences.

And this is how we handle such cases:

We use IP localization as the primary criterion, i.e., we start from the user's geographical location, such as Germany. Then we check whether the language of that location also appears in the browser language settings. If there is a match, we display the content in that language. If the two data sources don't match, we still go with the IP localization. The underlying assumption here is that a user in a given country is likely to have mastered the national language to some degree.

Finally, we check whether the content is also available in other languages from the browser settings. If so, we show a popup (similar to a cookie notification) informing the user that the web page is also available in alternative languages they have listed in their browser settings. Website visitors can then switch to another language or close the popup with a single click. Cookies record whether the user has switched the language or dismissed the popup, so that in the next session the content is displayed in the language of choice.

For example, a user accessing the Internet from Egypt but using a browser with German set as the primary language would see such a popup. The content would initially be displayed in Arabic. However, the user would additionally see the following message in German: "This website is also available in German. Would you like to switch to the German version?"
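The combined rule can be sketched roughly as follows in Python. This is one possible reading under stated assumptions, not a definitive implementation: the names Decision, localize, and cookie_choice are hypothetical, the country-to-language table is illustrative, the popup is shown only when the two sources disagree, and a country whose language the site does not offer falls back to the browser preferences.

```python
from typing import List, NamedTuple, Optional

AVAILABLE = ["de", "en", "ar"]  # languages of our scenario
COUNTRY_LANGUAGE = {"DE": "de", "FR": "fr", "EG": "ar", "AE": "ar"}  # illustrative


class Decision(NamedTuple):
    display: Optional[str]  # language to render, or None if nothing matches
    popup: List[str]        # alternatives to offer, in the user's order


def localize(country: str, browser_tags: List[str],
             cookie_choice: Optional[str] = None) -> Decision:
    # A language stored in a cookie during an earlier session wins.
    if cookie_choice in AVAILABLE:
        return Decision(cookie_choice, [])
    # Reduce browser tags to base codes ("de-at" -> "de"), keeping their order.
    wanted = list(dict.fromkeys(tag.lower().split("-")[0] for tag in browser_tags))
    offered = [lang for lang in wanted if lang in AVAILABLE]
    national = COUNTRY_LANGUAGE.get(country)
    if national in AVAILABLE:
        if national in wanted:
            return Decision(national, [])  # both sources agree: no popup
        return Decision(national, offered)  # IP wins, popup offers the rest
    # The site has no content in the national language: use the browser list.
    return Decision(offered[0] if offered else None, [])


print(localize("EG", ["de"]))  # Decision(display='ar', popup=['de'])
```

In the Egypt example, the content is shown in Arabic and the popup offers German. A site could show only the first entry of the popup list, or all of them.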

We can now apply the same logic to the various alternative languages (languages that are displayed if the desired language is not available) by defining particular rules. Below is the result of such a combination.

IP location          | Browser languages | Displayed content
---------------------|-------------------|---------------------------------------
Germany              | DE-DE, DE, EN     | DE
Germany              | EN, FR, DE        | DE
France               | FR-FR, FR, EN     | EN
United Arab Emirates | EN, AR-AMR        | AR
United Arab Emirates | EN, DE            | AR (with popup pointing to EN content)



Just to remind you, the website in our scenario is available in German (DE), English (EN), and Arabic (AR).

Access via Google

Another advantage of this differentiated method is that it allows you to better control website access via search engines such as Google. Search engines take into account the browser language and not necessarily the location of the user. A user who arrives at the site via a search engine is therefore always directed to the version that corresponds to the browser language, even if the location (based on IP) suggests a better match. The user can still switch to other relevant language versions through the popup described above.

Conclusion

The mix “content-language-user” must be kept in mind not only for the sake of usability and user experience, but also for marketing and strategy. Therefore, the above assignment makes no claim to absolute correctness: the decisive factor is the project-specific objective. Nevertheless, when both location and language (i.e., the IP address and the browser language setting) are considered, the results are much better, since edge cases can also be handled correctly.

About the translator

This article has been translated by Alconost.

Alconost is a global provider of website localization services as well as the localization of applications, games, and videos into 70+ languages. Professional translation services for shorter texts are available with Nitro — an online translation platform.

We offer translations by native-speaking linguists, linguistic testing, cloud-based workflow, continuous localization, project management 24/7, and work with any format of string resources.

We also make advertising and educational videos and images, teasers, explainers, and trailers for Google Play and the App Store.