YouTube’s Audio Transcription

Given that the subject of my last post was Google and voice recognition, I thought this video was timely. Titled Caption Fail 2 (the original Caption Fail is also available), it uses YouTube’s Auto Transcription mechanism to “play Telephone” (aka Chinese Whispers), the game in which a message is passed from person to person in series and is transformed into something completely new, and often surreal, just like a bad machine translation.

Personally, I believe that this game is always affected by the fact that the subjects’ knowledge of the study or outcome causes them to alter their behaviour; in this case, to deliberately change what they have heard (like the Hawthorne Effect, but not quite). While YouTube’s software isn’t doing this deliberately, the performers, Rhett and Link, have obviously chosen scripts that are a mouthful in order to trip the software up. I wonder how many takes and script rewrites it took to get a sufficiently entertaining result?