i18n and Bash scripting

One of the first things a programmer learns, and one of a sysadmin’s main tools and working environments, is the Bourne Again Shell, or bash.

Bash is a command-line programming environment, similar in spirit to the DOS prompt on Windows. Bash comes with Mac OS X (Utilities -> Terminal) as well but, like DOS, is rarely used by most people. Within bash it is possible to do very complicated tasks very quickly by stringing together a collection of smaller tools for finding, sorting, slicing and searching to automate the task at hand. FLOSS Manuals has an excellent introduction to the command line (which is usually, but not always, bash) and author Neal Stephenson wrote an excellent piece called In the Beginning was the Command Line (it’s getting long in the tooth, but is well worth the effort).

Louis Iacona at the Linux Journal gives a great summary of how to internationalise bash scripts. As he notes, most other languages have well-developed documentation regarding i18n/L10n – but not bash. He offers no reason why, although I would suggest it’s due to bash’s age and the fact that most people use it for small repetitive tasks rather than complex software engineering.

The article begins with a rudimentary intro to i18n/L10n but the meat is further down, where he steps through the processes and commands that are used to i18n a bash script.

While it’s not really groundbreaking, after 23 years bash deserves an i18n howto. A great contribution.

More Django translation tips

Another great set of slides I found recently walks through some good solutions and tools for Django developers who need multilingual support.

Starting with right-to-left (“rtl”) layouts, the slides present excellent CSS-based solutions that I presume are broadly applicable to any web design, and they finish with an excellent description of some plugins/apps that can be used for multilingual namespaces.

The middle part doesn’t teach much if you have done some Django and i18n before; if you haven’t, it’s a good, brief overview.

i18n and the Django web framework

Recently I’ve been using the AustLit database which makes more sense, although the AustLit data entry system is significantly harder to use.

When I was asked for my opinion on the data entry aspect of this project, my manager casually tossed out a comment about entering the data into a spreadsheet. My brain said ouch almost immediately, for three reasons: six researchers sharing a single document is a recipe for disaster; the xls/xlsx binary formats don’t fit nicely into version control systems, where we want text diffs; and teaching Arts faculty researchers (who are very smart in their field) how to use version control is not an effective solution – it’s too complicated and they just wouldn’t use it.

So I looked into Django and whipped up a quick site (I will link to it when I’ve got it on a beefier server) which now has over 2000 entries, 1200+ of which are translations in five languages: Spanish, Italian, Japanese, and Simplified and Traditional Chinese.

Over my years of tech support, sysadmin and web dev I’ve often run into i18n frustrations. It started many years ago with MySQL defaulting to the latin1 charset and the latin1_swedish_ci collation (thankfully the default has since moved to a Unicode charset, utf8mb4 in current releases) and goes right through to how to divide the URL namespace to account for the same page in a different language. These decisions can be hard.
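To make the URL namespace question concrete, here is a minimal sketch of one common answer: put a language code at the front of the path. It is plain Python rather than Django’s own mechanism, and the language codes, the default language and both function names are my assumptions for illustration only.

```python
# Hypothetical sketch: one way to divide a URL namespace by language.
# The codes below are assumed for illustration; they mirror the five
# translation languages mentioned above plus an assumed default.
SUPPORTED_LANGUAGES = ("en", "es", "it", "ja", "zh-hans", "zh-hant")
DEFAULT_LANGUAGE = "en"  # assumption: un-prefixed URLs are English


def split_language_prefix(path):
    """Return (language, remaining_path) for a path such as '/es/works/42/'."""
    first, _, rest = path.lstrip("/").partition("/")
    if first.lower() in SUPPORTED_LANGUAGES:
        return first.lower(), "/" + rest
    # No recognised prefix: treat the whole path as default-language content.
    return DEFAULT_LANGUAGE, path


def localized_path(path, language):
    """Build the URL of the same page in another language."""
    _, bare_path = split_language_prefix(path)
    return "/" + language + bare_path


if __name__ == "__main__":
    print(split_language_prefix("/es/works/42/"))
    print(localized_path("/es/works/42/", "ja"))
```

The alternatives (a subdomain per language, a query parameter, or a cookie) trade off differently for caching and search engines, which is part of why the decision is hard.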

I recently came across this article from multitasked.net about Django i18n, which is an excellent, broad showcase of some of the problems you are likely to come across when developing multilingual sites. A developer’s view is a very interesting perspective to write about translation from, and it doesn’t happen often enough. Unfortunately, I think Martin falls into the trap that so many devs do – forgetting to talk to his users, in this case translators – especially with regard to his design decisions about translated string length. That is understandable, as it can be hard to engage with everyone you need to when you have time, budgetary and managerial pressures.

Also, I think his advice “if you can avoid i18n in your project” is misplaced. As a developer it is much easier to add i18n as you build, rather than retrospectively. In fact, retrospective i18n is one of the harder and more frustrating (and dull!) tasks that devs have to deal with if they haven’t thought about it during the design stage. Personally, I would recommend that, as MySQL learnt, the default should be “include i18n during your design phase”. If you don’t think you will ever need it, leave it out, but if you are unsure, build it in. Your future self will thank you.
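As a rough illustration of how cheap “building it in” can be, here is a small sketch using Python’s standard gettext module (Django has its own translation layer, which this does not use). The helper name, the "messages" domain, the "locale" directory and the sample string are all assumptions of mine. The point is only that strings get wrapped in _() from day one, and the code still runs before any translation catalogue exists.

```python
import gettext


def load_translator(language, domain="messages", localedir="locale"):
    """Return a gettext-style function for the given language.

    The domain and localedir defaults are assumed names. With
    fallback=True, a missing catalogue yields a NullTranslations
    object, so untranslated strings are returned unchanged.
    """
    translation = gettext.translation(
        domain, localedir=localedir, languages=[language], fallback=True
    )
    return translation.gettext


if __name__ == "__main__":
    # Wrap user-facing strings from the start; translations can arrive later.
    _ = load_translator("es")
    print(_("Welcome to the catalogue"))
```

Once a compiled catalogue is dropped into the locale directory, the same call starts returning translated text without touching the code that uses _().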

A New Blog

Just what the world needs. Over the last 7 or 8 years I’ve been told a number of times that I should start a blog, but I’ve never really gotten one off the ground. There are a number of reasons for that, including basic laziness, but primarily I couldn’t differentiate the kind of stuff I would post from what thousands of others were posting on their blogs.

I think now that I can. Generally, this will be a clearing house for all of my interests. More specifically, I will be focusing largely on translation, internationalisation (i18n), and localisation (L10n).

I was a part of the Watercooler network on Ning, but the recent decision by the admin to instigate a paywall didn’t sit easily with me. I liked Watercooler, and I have no ill will towards the community or the admins – I can totally understand their decision. I just couldn’t write for or be part of a gated community. If I were to write, which I had been doing more and more because of Watercooler, I would want everyone to have access to what I was writing, so I decided to start my own blog.

I hope you find it interesting enough to keep coming back. I hope I find it interesting enough to keep it going.