The new age of cyberlinguistics

The Conversation has a post about a new smartphone app that makes collecting, saving and interpreting languages significantly easier.

"Recording the world’s vanishing voices" describes the work of Steven Bird of the University of Melbourne’s Language Technology Group:

Of the 7,000 languages spoken on the planet, Tembé is at the small end with just 150 speakers left. In a few days, I will head into the Brazilian Amazon to record Tembé – via specially-designed technology – for posterity. Welcome to the world of cyberlinguistics.

Our new Android app Aikuma is still in the prototype stage. But it will dramatically speed up the process of collecting and preserving oral literature from endangered languages.

The app was developed primarily for ‘saving’ languages whose last speakers are dying out. One difficulty was obtaining the mandatory informed consent to record the voices of people who have little to no contact with, or understanding of, computers or the internet.

Participants will try out the latest version which includes voice-activated translation: while listening to a recording, the user can interrupt to give a simultaneous interpretation of the recording in another language (in this case, Portuguese).

This interpretation is captured by the phone and linked back to the original recording, phrase by phrase. In this way, the collected recordings are guaranteed to be interpretable even once the language is no longer spoken. This interpretability is what gives the recordings their archival value.
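
To make the phrase-by-phrase linking concrete, here is a minimal Python sketch of how such a link could be modelled. This is not Aikuma's actual storage format, which the article does not describe; the class names, field names and file names below are all hypothetical, and representing phrases as time spans in seconds is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PhraseLink:
    """Ties one span of the original recording to one spoken interpretation."""
    source_start: float        # seconds into the original recording (assumed unit)
    source_end: float
    interpretation_audio: str  # hypothetical file name of the interpretation clip


@dataclass
class Recording:
    """An original recording plus its phrase-level interpretations."""
    audio: str                     # hypothetical file name of the original
    language: str
    interpretation_language: str
    links: List[PhraseLink] = field(default_factory=list)

    def add_interpretation(self, start: float, end: float, audio: str) -> None:
        """Link an interpretation clip to the span [start, end) of the original."""
        if end <= start:
            raise ValueError("a phrase must end after it starts")
        self.links.append(PhraseLink(start, end, audio))
        # Keep links ordered by their position in the original recording.
        self.links.sort(key=lambda link: link.source_start)

    def interpretation_at(self, t: float) -> Optional[PhraseLink]:
        """Return the link covering time t, or None if that span is uninterpreted."""
        for link in self.links:
            if link.source_start <= t < link.source_end:
                return link
        return None
```

With a structure along these lines, a listener at any point in the original could be played the interpretation of the phrase being heard, which is the property the quoted passage relies on for archival value.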

All materials we collect in this way will be left for the community and also lodged with the Museu Goeldi, a local research centre where they will be permanently available to the community.

The application's support for almost simultaneous interpreting greatly enhances the value of the collected data:

If enough people use Aikuma we will accumulate a large number of recordings from the world’s small languages, including Usarufa and Tembé. The result promises to be a digital-audio Rosetta Stone.

With permission, we will store the recordings and translations in the Internet Archive, a digital repository that has been preserving snapshots of the web since its inception in the early 1990s, and which is the most credible place to store digital content in perpetuity.

Cyberlinguists of the future may be able to discover the words and structures of dead languages from this data, and even construct dictionaries and grammars.
