i18n and the Django web framework

Recently I’ve been using the AustLit database, which makes more sense, although the AustLit data entry system is significantly harder to use.

When I was asked for my opinion on the data entry aspect of this project, my manager casually tossed out a comment about entering the data into a spreadsheet. My brain said ouch almost immediately, for three reasons: six researchers sharing a single document is a recipe for disaster; the xls/xlsx binary formats don’t fit nicely into version control systems, where we want text diffs; and teaching Arts faculty researchers (who are very smart in their field) how to use version control is not an effective solution, because it’s too complicated and they just wouldn’t use it.

So I looked into Django and whipped up a quick site (I will link to it when I’ve got it on a beefier server), which now has over 2000 entries, 1200+ of which are translations in five languages: Spanish, Italian, Japanese, Simplified Chinese and Traditional Chinese.

Over my years of tech support, sysadmin and web dev I’ve often run into i18n frustrations. It started many years ago with MySQL defaulting to the latin1 charset and the latin1_swedish_ci collation (thankfully this has now changed to utf8/utf8_unicode_ci) and goes right through to how you divide the URL namespace to account for the same page in different languages. These decisions can be hard.
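To make the URL namespace question a bit more concrete, here is a minimal sketch of one common answer: put a language code at the front of the path. It uses only the Python standard library; the function name, the language codes and the choice of English as the fallback are my own assumptions for illustration, not what my site actually does. Django ships its own version of this idea as i18n_patterns in django.conf.urls.i18n.

```python
# Hypothetical language codes for the five translation languages plus English.
SUPPORTED_LANGUAGES = ("en", "es", "it", "ja", "zh-hans", "zh-hant")
DEFAULT_LANGUAGE = "en"  # assumption: unprefixed URLs are served in English


def split_language_prefix(path):
    """Return (language, remaining_path) for a URL path such as '/ja/works/42/'."""
    # Take the first path segment and check whether it is a known language code.
    head, _, rest = path.lstrip("/").partition("/")
    if head.lower() in SUPPORTED_LANGUAGES:
        return head.lower(), "/" + rest
    # No recognised prefix: fall back to the default language, path unchanged.
    return DEFAULT_LANGUAGE, path


print(split_language_prefix("/ja/works/42/"))
print(split_language_prefix("/works/42/"))
```

The trade-off is that every page gets one URL per language, which is easy to link to and cache, but you have to decide what an unprefixed URL means.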

I recently came across this article from multitasked.net about Django i18n, which is an excellent, broad showcase of some of the problems you are likely to come across when developing multilingual sites. It’s a very interesting perspective to write about translation from, and it doesn’t happen often enough. Unfortunately, I think Martin falls into the trap that so many devs do: forgetting to talk to his users, in this case translators, especially with regard to his design decisions about translated string length. That is understandable, as it can be hard to engage with everyone you need to when you have time, budgetary and managerial pressures.

Also, I think his advice, “if you can avoid i18n in your project”, is misplaced. As a developer it is much easier to add i18n as you build, rather than retrospectively. In fact, retrospective i18n is one of the harder and more frustrating (and dull!) tasks that devs have to deal with if they haven’t thought about it during the design stage. Personally, I would recommend that, as MySQL learnt, the default should be “include i18n during your design phase”. If you don’t think you will ever need it, leave it out, but if you are unsure, build it in. Your future self will thank you.
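As a rough illustration of what “building it in” costs on day one, here is a small sketch using Python’s standard gettext module. The function name and the message text are made up for this example. NullTranslations just hands every string back unchanged, so the code behaves exactly like an untranslated site until a real message catalogue is dropped in later. Django’s django.utils.translation module offers gettext and ngettext functions that are used in the same style.

```python
import gettext

# NullTranslations returns each message unchanged, so nothing is translated yet.
# Swapping in a real catalogue later should not require touching the call sites.
translations = gettext.NullTranslations()


def entry_summary(title, translation_count):
    """Build a user-facing sentence with the string marked for translation."""
    # ngettext picks the singular or plural form based on the count, which a
    # hand-rolled "add an s" approach cannot do for most languages.
    template = translations.ngettext(
        "%(title)s has %(count)d translation",
        "%(title)s has %(count)d translations",
        translation_count,
    )
    # Named placeholders let translators reorder the pieces of the sentence.
    return template % {"title": title, "count": translation_count}


print(entry_summary("Example Title", 1))
print(entry_summary("Example Title", 5))
```

Wrapping strings like this while you write them is a few extra characters per string. Hunting down every hard-coded string in a finished codebase is the dull, retrospective job I’m warning about.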