YouTube’s Audio Transcription

Given that my last post was about Google and voice recognition, I thought this video was timely. Titled Caption Fail 2 (the original Caption Fail is also available), it uses YouTube’s auto-transcription mechanism to “play Telephone” (also known as Chinese Whispers), the game in which a message is passed serially from person to person and is transformed into something completely new, and often surreal, just like a bad machine translation.

Personally, I believe that this game is always affected by the fact that the subjects’ knowledge of the study or outcome causes them to alter their behaviour: in this case, to deliberately change what they have heard (like the Hawthorne Effect, but not quite). While YouTube isn’t doing this to the performers, Rhett and Link, they have obviously chosen scripts that are a mouthful in order to trip the software up. I wonder how many takes and script rewrites it took to get a sufficiently entertaining result.

One thought on “YouTube’s Audio Transcription”

  1. YouTube audio transcription is not at all accurate. This voice recognition technology has not matured yet. For accurate YouTube audio transcription, it is better to go for manual transcription.
