When visiting a website, one usually expects to see content presented in a language that's readable and understandable.
But as the content manager or marketer in charge of achieving this, it's not that easy to ensure your target audiences see the right content in the right language from the very first visit, especially when serving content to multiple countries and in various languages.

Let's look at some key factors in deciding what your visitors should see when visiting your website from around the world, and work through a hands-on use case to understand how delivering the right content to the right visitors can technically be achieved. As a plus, this will also help increase your website's search engine visibility.

Recap: important requirements to define when conceptually developing a global, multilingual website

In the first blog post of my series about building global multilingual websites using Sitecore, I pointed out that you should conceptually answer the following questions, which relate directly to the topic of this post: delivering the right language content to the right visitors.

Which language/country is shown on a first visit?

How can a visitor change between pages in different language cultures? Or do we not want this?

Will we use GeoIP localization for delivering the right content language to every visitor right from the first visit?

From a business perspective (for example, serving your potential future clients with the right data, products and sales contacts), you should make a decision on whether…

  • …A) it's more important to serve visitors content from the most appropriate market your business is operating in,
  • or B) it's more important and advantageous to serve your visitors this content in their preferred language.

I mention this decision because it will have a fundamental impact on how you build your website in terms of serving multilingual content and establishing market-focused online representations of your business.

So in a nutshell and to close this recap, keep in mind:

You have to decide WHICH content you want your visitors to see (first) when visiting your website.

Challenges when trying to identify your visitor's origin, language and country settings

Whichever way you decide to go (primarily delivering market-focused or visitor-language-focused content), you will need some information about your visitors in order to deliver them "the right" language content on a first visit. So what challenges are we facing in doing so?

1) Using GeoIP localisation: complete and accurate data must be available

Mapping a visitor's IP address can give you valuable information on his/her origin: on country level, but even on area or city level. But to make this work accurately, and thus avoid confusing your visitors, make sure to have a solid, complete and always up-to-date IP2Geo resolving database in place!

A special challenge in getting a visitor's IP address arises when using a Content Delivery Network (CDN) with your website: make sure the original visitor IP address is passed on and picked up for mapping locations (and not the CDN system's IP address ;-) ).
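To illustrate the CDN point, here is a minimal Python sketch of picking the visitor's IP from a forwarding header. It assumes the CDN appends addresses to an "X-Forwarded-For" header (a common convention, but check what your CDN actually sends), that header names are passed with exactly this casing, and that you know your CDN's address ranges. The function name `resolve_client_ip` and the `trusted_networks` parameter are made up for this example.

```python
import ipaddress
from typing import Iterable, Mapping


def resolve_client_ip(headers: Mapping[str, str], remote_addr: str,
                      trusted_networks: Iterable[str]) -> str:
    """Return the visitor's IP, skipping hops that belong to our own CDN/proxies."""
    networks = [ipaddress.ip_network(n) for n in trusted_networks]

    def is_trusted(ip) -> bool:
        return any(ip in net for net in networks)

    # If the direct peer is not one of our proxies, the header may be spoofed:
    # ignore it and use the connection's own address.
    try:
        if not is_trusted(ipaddress.ip_address(remote_addr)):
            return remote_addr
    except ValueError:
        return remote_addr

    hops = [h.strip() for h in headers.get("X-Forwarded-For", "").split(",") if h.strip()]
    # Walk right to left: the rightmost entries were added by our own proxies,
    # the first untrusted one is the address the CDN actually saw.
    for hop in reversed(hops):
        try:
            ip = ipaddress.ip_address(hop)
        except ValueError:
            break  # malformed entry: stop trusting the rest of the header
        if not is_trusted(ip):
            return str(ip)
    return remote_addr
```

The address returned here is what you would hand to the IP2Geo lookup; without the trusted-network check, any visitor could fake a location by sending the header themselves.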

2) A visitor's browser region & language data is not always complete

A common way to find out what language a visitor prefers is checking and parsing his/her "browser language". But be aware: this information might be neither “trustworthy” nor “complete”. Web browsers like Firefox let users easily modify this information; additionally, the amount of such client information provided by web browsers has been continuously reduced over the last couple of years.

For example, you might only get a language code, not complete information about the preferred language AND the country the visitor is coming from. Therefore it makes sense to combine this information with the visitor's IP2Geo resolving (see point #1).

3) The country and language identification procedure should be configurable & flexible

I mentioned this requirement before, and since it matters especially when striving for the best international targeting of a global website, I mention it again: any automation for delivering content on your website should not be "locked in" aka "hard coded". Make sure to build a solution that is flexible enough to adjust on the fly. Once you get your hands on the first analytics results showing which initial culture your visitors land on, you may see some room for improvements or adjustments. And you want these to happen fast.

4) Possibility to reproduce the country and language resolving procedure of a visit is extremely helpful

As a follow-up to point #3: not only should the solution in place be easy to configure and maintain, you should also be able to reproduce and test various scenarios, by manually passing a certain IP address and browser language information when accessing your website, in order to understand the data you see in Analytics. And again: this helps to continuously improve the mechanisms in place.
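One possible way to make a visit reproducible is to let testers override the resolver's inputs through the URL. The Python sketch below is only an illustration of that idea: the query-string parameters `test_ip` and `test_lang`, the function name `resolver_inputs` and the `allow_overrides` switch are all hypothetical and not part of Sitecore.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlsplit


def resolver_inputs(url: str, remote_ip: str, accept_language: str,
                    allow_overrides: bool = False) -> Tuple[str, str]:
    """Return the (IP, browser language) pair the language resolving should use."""
    if not allow_overrides:
        # On the live system, always use what the visitor really sent.
        return remote_ip, accept_language

    query = parse_qs(urlsplit(url).query)
    # Hypothetical parameters: fall back to the real values when they are absent.
    ip = query.get("test_ip", [remote_ip])[0]
    lang = query.get("test_lang", [accept_language])[0]
    return ip, lang


# Simulate a visit with a chosen IP address and a Finnish browser.
print(resolver_inputs(
    "https://www.mysite.fi/?test_ip=203.0.113.7&test_lang=fi-FI",
    remote_ip="198.51.100.9", accept_language="en", allow_overrides=True))
```

Keeping the override behind a switch that is off by default means such test parameters cannot be used to skew analytics on the production site.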

Real world example: how to actually evaluate and deliver the right language content to every visitor?

When putting it all together, and to evaluate and validate it with your business stakeholders, it makes sense to have something "visual" in place, for example a simple procedural list describing each step of visitor identification on your website, based on the defined mechanisms.

💡You can grab a more detailed copy of the following procedure as Excel spreadsheet on my GitHub: namics/Sitecore - Visitor language resolving - Checklist.zip

The following is based on an example use case, where we want to identify a visitor’s country & language for delivering content from the appropriate target market.

| Checks and steps | Example use case |
| --- | --- |
| 1. Sitecore default mechanisms: querystring sc_lang=en & URL path /en-US/ | https://www.mysite.fi/ => no querystring & no URL path |
| 2. Existing language cookie | https://www.mysite.fi/ => no existing language cookie |
| 3. GeoIP AreaCode + browser language combination | GeoIP AreaCode: "FI" (Finland), browser language: "en" (English) => "en-FI" => not applicable |
| 4. Predictive 2-char AreaCode matching against existing languages in Sitecore | 2-char AreaCode matching => "fi-FI" (Finnish-Finland) => SUCCESS |
| 5. Predictive 2- or 5-char language & country matching of browser language | not applicable anymore, step 4 matched already |
| 6. Fallbacks: culture from Sitecore config or "DefaultLanguage" setting | not applicable anymore, step 4 matched already |
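The six checks above can be sketched as a simple fallback chain. This is a simplified Python illustration under my own assumptions, not the Sitecore implementation: the function name `resolve_culture`, its parameters and the exact matching rules (for example, matching cultures case-insensitively) are made up for this example.

```python
from typing import Iterable, Optional, Sequence, Tuple


def resolve_culture(available: Sequence[str],
                    query_lang: Optional[str] = None,
                    path_lang: Optional[str] = None,
                    cookie_lang: Optional[str] = None,
                    geo_country: Optional[str] = None,
                    browser_langs: Iterable[str] = (),
                    default: str = "en") -> Tuple[str, int]:
    """Return (culture, number of the step that matched)."""
    by_key = {c.lower(): c for c in available}

    def known(candidate: Optional[str]) -> Optional[str]:
        return by_key.get(candidate.lower()) if candidate else None

    browser = [b for b in browser_langs if b]
    country = (geo_country or "").lower()

    # Step 1: explicit querystring or URL path.
    for candidate in (query_lang, path_lang):
        if known(candidate):
            return known(candidate), 1
    # Step 2: language cookie from an earlier visit.
    if known(cookie_lang):
        return known(cookie_lang), 2
    if country:
        # Step 3: browser language combined with the GeoIP country, e.g. "en" + "FI".
        for b in browser:
            hit = known(f"{b.split('-')[0]}-{country}")
            if hit:
                return hit, 3
        # Step 4: any existing culture for the GeoIP country.
        for c in available:
            if c.lower().endswith("-" + country):
                return c, 4
    # Step 5: browser language alone, full tag first, then the 2-char language.
    for b in browser:
        hit = known(b) or known(b.split("-")[0])
        if hit:
            return hit, 5
    # Step 6: configured fallback.
    return default, 6


# The example use case: visitor from Finland with an English browser.
print(resolve_culture(["en", "en-US", "fi-FI", "de-DE"],
                      geo_country="FI", browser_langs=["en"]))
```

Returning the step number alongside the culture makes it easy to log which check decided the outcome, which helps when comparing the behaviour against your analytics data.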

 

How language identification works from a technical perspective

So when addressing the mentioned challenges on a technical level, how do we actually get and process a visitor's information to serve the right content language? Here are a few pointers to help with that:

Browser language

The browser language is passed by web browsers as the "Accept-Language" HTTP header on every webpage request.
There are plenty of example implementations for working with this information, e.g. on GitHub.
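As one such example, here is a minimal Python sketch that turns an "Accept-Language" header value into a list of language tags ordered by the visitor's preference. It is deliberately simplified (it only understands the "q" weight and ignores the "*" wildcard), and the function name `parse_accept_language` is made up for this example.

```python
from typing import List


def parse_accept_language(header: str) -> List[str]:
    """Return language tags from an Accept-Language header, most preferred first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        tag, _, params = part.strip().partition(";")
        tag = tag.strip()
        params = params.strip()
        quality = 1.0  # a tag without an explicit weight counts as q=1
        if params.lower().startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # unreadable weight: treat the tag as unwanted
        if tag and tag != "*" and quality > 0:
            # Sort by weight (descending), then by position in the header.
            entries.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(entries)]


print(parse_accept_language("fi-FI,fi;q=0.9,en;q=0.8"))
# prints ['fi-FI', 'fi', 'en']
```

Note that a browser may send only "en" without any country part, which is exactly why combining this list with the GeoIP country is useful.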

GeoIP lookup

When working with the Sitecore Experience Platform, there are basically two recommended ways to map a visitor's IP address to location information:

A) the Sitecore IP Geolocation Service
  • (+) Advantages: officially supported by Sitecore and, by now, available for free as part of the Experience Platform.
  • (-) Disadvantages: cloud-based service, little information on how accurate and up to date the IP2Geo data is.
B) GeoLite2 databases by MaxMind (free version or subscription-based)
  • (+) Advantages: can be used free of charge (however: updated less often, low granularity, scope depends on the database used). With a paid subscription, the database is updated weekly and you get a full-blown database with accurate data. You can also use their online demo to look up how certain IP addresses resolve using the MaxMind GeoIP2 service: https://www.maxmind.com/en/geoip-demo
  • (-) Disadvantages: needs to be implemented and maintained by a software engineer. Updated databases need to be deployed to the live system regularly. You are relying on a third party that is independent of, and not tailored to, the Sitecore Experience Platform.

Sitecore's LanguageResolver

The rule of thumb here is: try to leave the original LanguageResolver as it is, and extend it properly with your custom code/procedures.

💡You can find example code for an enhanced Sitecore LanguageResolver implementation here on GitHub: merkle-sitecore-gists/BrowserLanguageResolver.cs

 

What's next?

In the next post of this series you will find a few helpful suggestions for properly configuring and reporting data in Sitecore Experience Analytics and the Campaign Manager when managing global multilingual websites with the Sitecore CMS.

Background story

Last October, my colleague Fabian Geiger and I had the pleasure of holding a breakout session at the Sitecore Symposium 2018 in Orlando. This blog post is a summary of and follow-up to the knowledge we shared on building globally focused websites based on the Sitecore Experience Platform, with multiple languages and country cultures. Consider this a non-exhaustive compilation of learnings and recommendations from our project work at Namics.