Globalizing a Django project. How I launched boolino in the UK and other countries

In this post I describe how I made it possible for a Spain-based website to be replicated in other countries, each with its own domain and database, while keeping code changes to a minimum and sticking to the DRY (don’t repeat yourself) principle.

Specifications:

  • All of the features, models, views, etc. are the same. That means any feature tied to a specific language or country (for example, natural-language search) has to be reimplemented for each new language.
  • Each country has its own database.
  • Each language has its own URL for SEO.
  • They all share the same templates, but as I said above, they have to include certain changes for each country.
  • Each country will have its own analytics and other IDs for 3rd party apps.
  • Boolino España is multi-language (Spanish, Catalan and English). New countries only have one language.
  • Boolino España has to be moved to .es (it was .com) and boolino.com will be used for the website for the USA.

boolino.com (.es)

boolino España has one additional difficulty: because we treat it as a multi-language website, the user can choose the language for its menus and other static text. All of this is resolved magnificently by Django’s i18n tools, but we have an additional hurdle to overcome. Our URLs contain the explicit name of the language, for example, http://www.boolino.es/ca/blogboolino/categoria/joc-i-lectura/ and http://www.boolino.es/es/blogboolino/categoria/juego-y-lectura/.

Against this backdrop, let’s now take a look at the most interesting parts of the i18n process.

1.- Settings, deployment and running the website

This step is simple and it’s straight out of the best practices manual. The specific settings for each country live in separate files and, when running the website, you only have to point to the right file with the --settings option (or the DJANGO_SETTINGS_MODULE environment variable).

I run the deployment process with Fabric. Each country has its own operating system user, so it’s easy to keep using the same file paths everywhere. You only have to enter the username and host in the fabfile and the process is the same for all of them.

2.- URLs

URLs are one of the most complex issues. The same urls.py has to handle URLs that may or may not carry a language prefix, and the text of each URL also has to be translated for each language.

For the first problem, the same urls.py has to support these URLs:

http://www.boolino.es/es/libros-cuentos/un-dia-conmigo/

http://www.boolino.co.uk/kids-childrens-books/asterix-and-the-class-act/

In other words, with and without the language prefix. To resolve this I have used the django-solid-i18n-urls library and changed the settings for each country. So in the UK the prefix isn’t necessary, while in Spain it is.
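To make the two URL shapes concrete, here is a minimal sketch in plain Python of telling a prefixed path from an unprefixed one. It is not how django-solid-i18n-urls is implemented; the function name split_language_prefix and the SPAIN_LANGUAGES tuple are assumptions made for illustration.

```python
import re

# Assumed language codes for the Spanish site (Spanish, Catalan, English).
SPAIN_LANGUAGES = ('es', 'ca', 'en')

# A two-letter first segment, followed by the rest of the path.
_PREFIX_RE = re.compile(r'^/([a-z]{2})(/.*)$')


def split_language_prefix(path, languages=SPAIN_LANGUAGES):
    """Return (language, rest_of_path); language is None without a known prefix."""
    match = _PREFIX_RE.match(path)
    if match and match.group(1) in languages:
        return match.group(1), match.group(2)
    return None, path


# Spanish site: the language is part of the URL.
print(split_language_prefix('/es/libros-cuentos/un-dia-conmigo/'))
# UK site: no prefix, so the whole path goes to the URL patterns.
print(split_language_prefix('/kids-childrens-books/asterix-and-the-class-act/'))
```

Whether the prefix is expected at all is then just a per-country setting, which is what lets one urls.py serve both sites.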

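The URL also has to be translated. Django makes this much easier by letting you mark the URL pattern itself as translatable. Use ugettext_lazy (imported as _) rather than plain ugettext, so that the translation happens per request and not once at import time. I have URLs like this: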

url(_(r'^libros-cuentos/'), include('boolino.urls')),

And they are automatically translated into the right language (using .po files).

3.- Structure of templates

We’ve said that the templates will be the same for each website, but there will be small differences between them. How can you inherit from a template without having to change its name in the view? In two simple steps.

Declare the template directories in the settings:

TEMPLATE_DIRS = (
    os.path.join(PROJECT_HOME, 'templates/uk'),
    os.path.join(PROJECT_HOME, 'templates/es'),
    os.path.join(PROJECT_HOME, 'templates'),
)

I really only have two template directories: all of the complete templates are in templates/es and the few that have been changed are in templates/uk. templates/ itself contains nothing apart from those two subdirectories; it is listed only because it is their parent.

When I want to override a template in the UK I simply give it the same name as the Spanish one (for example, templates/uk/libros/show.html) and its first line says:

{% extends "es/libros/show.html" %}

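Django then searches the declared directories in order and finds es/libros/show.html through the third entry of TEMPLATE_DIRS (the parent directory). Extending plain libros/show.html would not work in this setup, because that name resolves to the UK template itself. This is the only way I found to inherit from the template in /es without having to overwrite it completely.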
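The lookup order can be modelled in a few lines of plain Python. This is a simplified sketch of a loader that tries each directory in turn, not Django’s actual loader code; find_template and make_file are hypothetical helpers, and the files are created in a throwaway temporary directory.

```python
import os
import tempfile


def find_template(name, template_dirs):
    """Return the first existing file called `name`, trying each directory in order."""
    for directory in template_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


def make_file(root, relative_path):
    """Create an empty file under root, creating parent directories as needed."""
    full_path = os.path.join(root, relative_path)
    parent = os.path.dirname(full_path)
    if not os.path.isdir(parent):
        os.makedirs(parent)
    open(full_path, 'w').close()
    return full_path


# Build a throwaway copy of the layout described above.
project_home = tempfile.mkdtemp()
es_template = make_file(project_home, 'templates/es/libros/show.html')
uk_template = make_file(project_home, 'templates/uk/libros/show.html')

TEMPLATE_DIRS = (
    os.path.join(project_home, 'templates/uk'),
    os.path.join(project_home, 'templates/es'),
    os.path.join(project_home, 'templates'),
)

# The view asks for 'libros/show.html': the UK override is found first.
from_view = find_template('libros/show.html', TEMPLATE_DIRS)
# The override extends 'es/libros/show.html': only the parent directory matches.
from_extends = find_template('es/libros/show.html', TEMPLATE_DIRS)
print(from_view)
print(from_extends)
```

The view never changes the name it asks for; only the order of the directories decides which file wins.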

4.- ETL

I’ve built the file loading process using ETL (extract, transform and load) logic. I’ll have a different data source in each country, so the Extract part will be new for each of them. The Transform part standardizes the data and, finally, the Load part is common to the entire project, because loading is the same once the data has been standardized.
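As a rough sketch of that split, assuming a UK source that arrives as CSV: the function names, the column names and the sample row (with a placeholder ISBN) are all made up for illustration and are not boolino’s real feed format. I also assume here that the transform is written per source, while only the load step is shared.

```python
import csv
import io


def extract_uk(raw_csv):
    """Extract step for a hypothetical UK feed: CSV with 'Title' and 'ISBN' columns."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform_uk(rows):
    """Map the UK-specific column names onto the shared record format."""
    return [
        {'title': row['Title'].strip(), 'isbn': row['ISBN'].replace('-', '')}
        for row in rows
    ]


def load(records, store):
    """Shared load step: identical for every country once records are standardized."""
    for record in records:
        store[record['isbn']] = record
    return store


# Made-up sample row with a placeholder ISBN.
uk_feed = "Title,ISBN\nAsterix and the Class Act,000-0-0000-0000-0\n"
catalogue = load(transform_uk(extract_uk(uk_feed)), {})
print(catalogue)
```

Adding a country then means writing a new extract (and, under this assumption, a new transform), while load stays untouched.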

5.- Translating all of the text

Boolino is already a pretty big project in terms of templates and code. I used django-lint to identify all of the strings that hadn’t been translated and it worked really well.

There is also some text in JavaScript that has to be translated. There are several ways of doing this; I decided to create a Django view that renders a template which is really just a series of JavaScript variables:

{% load i18n %}
var ver_todos = "{% trans 'Ver todos' %}";
var ver_todas = "{% trans 'Ver todas' %}";
var ocultar = "{% trans 'Ocultar' %}";
var publicidad = "{% trans 'Publicidad' %}";

In the HTML I simply reference the view as if it were just another .js file:

<script src="{% url 'i18n.js' %}" type="text/javascript"></script>
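One thing to watch with this approach is that a translation containing a double quote would end the JavaScript string early. Inside a template, Django’s escapejs filter exists for that purpose. The sketch below shows the same idea in plain Python instead of the template engine, using json.dumps to produce safely quoted string literals; render_js_variables and the sample translations are hypothetical.

```python
import json


def render_js_variables(translations):
    """Render one `var name = "text";` line per entry, with JSON-escaped strings."""
    lines = [
        'var %s = %s;' % (name, json.dumps(text))
        for name, text in sorted(translations.items())
    ]
    return '\n'.join(lines)


# Hypothetical English translations of two of the strings above.
uk_strings = {'ver_todos': 'See all', 'ocultar': 'Hide'}
print(render_js_variables(uk_strings))
```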

6.- Migration from .com to .es

Until very recently, boolino España was on .com. Because the project is now international, Boolino España has moved to .es and we’ve reserved .com for the USA. The change itself is simple (two lines in the nginx configuration to force a 301 for all of the .com traffic), but it is a critical moment: done wrong, you lose the previous SEO, the incoming links and the PageRank of the old .com domain. It’s advisable to inform Google of the change of domain via Webmaster Tools. Hopefully it’ll work…

7.- Re-launch of .com (USA) and keeping the old incoming links

We’ll soon be launching the website in the USA, the trickiest moment for the SEO of .es. Up until then, Google will simply see a 301 on all of the old incoming links and (in theory) it won’t affect the SEO. However, with the new .com I have to remove this nginx rule and Google could make a real mess of it, because all of the URLs from .com will be different to those from .es.

To try to resolve this, I’ve created a middleware that catches the old Spanish URLs and redirects the traffic to .es when needed. If .com receives a request with a URL in the Spanish format, none of the .com URL patterns will match it and, instead of returning a 404, the site will return a 301 redirect to the .es domain, preserving the full path. In theory this will work; let’s see if Google likes it. Here’s the middleware code:

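from django import http
from django.conf import settings


class OldDotComRedirectMiddleware(object):
    """Redirect unresolved .com URLs to the same path on the .es site."""

    def process_response(self, request, response):
        if response.status_code != 404:
            return response  # No need to check for a redirect for non-404 responses.

        try:
            # Raises AttributeError when no URL pattern matched the request.
            request.resolver_match.url_name
        except AttributeError:
            # Unresolved URL: assume it is an old Spanish one and send it to .es.
            return http.HttpResponsePermanentRedirect(
                '%s%s' % (settings.BOOLINO_URL_ES, request.get_full_path()))

        # The URL resolved and the view itself returned the 404: keep it.
        return response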
