YouTube introduces auto captioning – 06/03/2010

Here I was boldly predicting that subtitling was the last great bastion of those who don’t want to end up as post-editors, and Google/YouTube announces automatic captioning of video content. As Gizmodo points out, this is definitely a step toward speech-to-speech translation. Check the three updates on the first link above for excellent points:

having more people come in and upload their correct versions of captions helps Google learn and improve their system faster, which helps all their speech-to-text services….
…watching the students from the California School For the Deaf talk about how the auto-captioning will improve their lives is kinda making me tear up.

Recently I was asked to help supervise and provide tech support for a student who wanted to take part in the TED Open Translation Project, which is based on the DotSub subtitling project. Unfortunately, it would seem that it’s less a subtitling project and more a translation project – volunteers are provided with the source text to translate, rather than using a subtitle editor. I presume the slicing and dicing of content is done at a later date. Of course, this may depend on skill levels in any particular language.

A confluence of these two projects would lead to better captions and better subtitles, if the two groups share the information. I don’t think there has ever been a more urgent need to ask Google to open-license its translation software and data sets. People will feel less confronted if the knowledge is available to all for free than if it’s hidden away in Google’s vaults. I also think translators will be more accepting of each new iteration of products and projects that Google rolls out if they feel that they can be part of it and contribute without feeling ripped off.

Of course, that’s my opinion as a non-translator – I expect translators will be less enthused.