Esperanto and Google Translate share the goal of helping people understand each other; this connection has even been made in this blog post. Therefore, we are very excited that we can now offer translation for this language as well.
The Google Translate team was actually surprised about the high quality of machine translation for Esperanto. As we know from many experiments, more training data (which in our case means more existing translations) tends to yield better translations. For Esperanto, the number of existing translations is comparatively small. German or Spanish, for example, have more than 100 times the data; other languages on which we focus our research efforts have similar amounts of data as Esperanto but don’t achieve comparable quality yet. Esperanto was constructed such that it is easy to learn for humans, and this seems to help automatic translation as well.
If there’s one thing I really love as a monolingual interested in language and translation, it’s a translator’s workflow. It’s exactly because I’m monolingual that I have no real idea how translators go about their jobs. I understand almost every other part of the system – the technology and how it all works together, how to troubleshoot an installation, the difference between UTF-8 and UTF-16 – but I don’t know the flow.
The blogspace at Witness has a great article on how they use a mixture of technologies and software to subtitle a video in their workflow. Recommended reading if you want to look at subtitling.
I particularly like this translation app review – no explanation required, I presume. (via reddit)
I discovered that the premier free office software, LibreOffice, was updated to version 3.5 recently. For those working with language, amongst the new features and fixes are some localisation improvements that justify an upgrade. If you are paying for a competing office suite, I recommend you try LibreOffice before spending the money at your next upgrade opportunity.
- Added Arabic, Aragonese, Belarusian, Bengali, Breton, Bulgarian, Scottish Gaelic, Greek, Gujarati, Hindi, Latvian, Brazilian Portuguese, European Portuguese, Sinhala, and Telugu spelling dictionaries. (Andras Timar)
- Use of possessive genitive case and/or partitive month names if provided by a locale’s locale data (e.g., Russian, Polish, Finnish, Lithuanian, and others).
If a day of month (D or DD) is present in a number formatter’s date format code, the month name for MMM or MMMM is displayed in possessive genitive case or partitive case.
Else if no day of month is present, the month name is displayed as noun / nominative case.
See blog for more details. (Eike Rathke)
- Corrections to Polish [pl-PL], Portuguese [pt-PT and pt-BR], Slovenian [sl-SI], and Latin [la-VA] locale data, esp. date formats. (Eike Rathke, Martin Srebotnjak, Mateusz Zasuwik, Olivier Hallot, Roman Eisele, Sérgio Marques)
- Initial support for two new UI languages, Luxembourgish (lb) and Tatar (tt)
LibreOffice 3.5 supports 107 UI languages.
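The month-name rule in the list above (genitive when the format code contains a day of month, nominative otherwise) can be sketched in a few lines of Python. This is only an illustration of the rule as the release notes describe it, not LibreOffice's actual implementation: the `format_date` function, the tiny set of supported tokens (`D`, `DD`, `MMMM`, `YYYY`) and the hard-coded Russian month tables are all my own assumptions.

```python
import re

# Russian month names, hard-coded for illustration only (not read from locale data).
NOMINATIVE = ["январь", "февраль", "март", "апрель", "май", "июнь",
              "июль", "август", "сентябрь", "октябрь", "ноябрь", "декабрь"]
GENITIVE = ["января", "февраля", "марта", "апреля", "мая", "июня",
            "июля", "августа", "сентября", "октября", "ноября", "декабря"]

# Hypothetical, deliberately small subset of format-code tokens.
TOKEN = re.compile(r"YYYY|MMMM|DD|D")


def format_date(code, year, month, day):
    """Render a date, choosing the month-name case from the format code."""
    # If a day of month (D or DD) is present, use the genitive month name;
    # otherwise use the plain nominative form.
    has_day = any(tok in ("D", "DD") for tok in TOKEN.findall(code))
    names = GENITIVE if has_day else NOMINATIVE

    def render(match):
        tok = match.group(0)
        if tok == "YYYY":
            return f"{year:04d}"
        if tok == "MMMM":
            return names[month - 1]
        if tok == "DD":
            return f"{day:02d}"
        return str(day)  # single "D"

    return TOKEN.sub(render, code)


print(format_date("D MMMM YYYY", 2012, 2, 5))  # 5 февраля 2012
print(format_date("MMMM YYYY", 2012, 2, 5))    # февраль 2012
```

The same date gets a different month form depending only on whether the format code mentions the day, which is the behaviour the release notes describe.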
Ok. So I’ve not announced anything here as yet, so I will now. In three weeks I am moving to the Pacific Island nation of Kiribati (pron: Kiri-bass), specifically the town of Bairiki on the island of Tarawa, for a year.
My whole family will be coming too – I am building a database for the Government, Amber is working in Marketing and Communications for the Kiribati Institute of Technology and our children are going to the local primary school. Both Amber and I are working with the AVID project.
This has been in the works since about October of last year, which hopefully explains my relative silence over the last two months, at least. Preparations are well underway, although the house is yet to be packed. Feel free to volunteer to help in this regard.
Further, given the nature of my work as a volunteer, but also the availability of the internet (or lack thereof), Pineapple Donut may well go on a small 12-month hiatus. I will potentially post a few things, but I certainly won’t be doing quite as much as I have previously.
I absolutely plan on coming back to this project at the end of the assignment and look forward to reporting on tech/translation soon. I will be starting a blog about our experiences while we are away – I’ll be reporting on that very soon.
Boingboing points out a video of what they call How to say “I Love You” in 100 languages. This isn’t entirely true – I saw a couple in English and I think at least two in Japanese – I presume other languages are over-represented as well. Regardless, Valentine’s Day was a couple of days ago and it’s, well, lovely. While I’m here, I thought I’d add my own little message – Amber Carvan, I love you.
I used to work for EngageMedia and I’m pleased to see that they have just launched a subtitling project in conjunction with our friends over at Universal Subtitles. The project is headed up by Singaporean democratic activist Seelan Palay.
The 600 million people spread across Southeast Asia share a common set of challenges: climate change, human rights, freedom of expression, corruption and much more. With hundreds of regional languages, communication and collaboration can be difficult. Translation and subtitling could always help, but now it’s a whole lot easier.
Let’s hope the number of subtitles increases and these important videos are seen by more people as a result.
In 2006 a Russian student emailed her postal address to a friend in France so that she could send her a Harry Potter book. Unfortunately the French friend’s email program was not set up to display Cyrillic characters; instead, it produced diacritics from the Western character set. Apparently not realizing the error, the French girl copied them down and mailed the package. Postal employees realized what had happened, deciphered the address, and delivered the book successfully.
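For the curious, this kind of garbling is easy to reproduce. Below is a minimal Python sketch of the mechanism: text written in one encoding is read as if it were another. The story as told here does not name the encodings involved, so KOI8-R (for the Russian sender) and Latin-1 (for the French mail program) are my assumptions, as are the sample address and the `garble` and `recover` helper names.

```python
def garble(text, actual="koi8_r", assumed="latin-1"):
    """Encode text in its real encoding, then misread the bytes as another."""
    return text.encode(actual).decode(assumed)


def recover(garbled, actual="koi8_r", assumed="latin-1"):
    """Undo the misreading: recover the raw bytes, then decode them correctly."""
    return garbled.encode(assumed).decode(actual)


# Made-up sample text, not the address from the story.
address = "Москва, Россия"

garbled = garble(address)
print(garbled)           # accented Western characters instead of Cyrillic
print(recover(garbled))  # the original Cyrillic text again
```

Because Latin-1 assigns a character to every byte value, nothing is lost in the misreading, which is why the postal workers could reverse it by hand, one character at a time.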