I eagerly await the day that computerized transcription becomes cheap and reliable enough that human-transcribed audio disappears. Google is getting close. I have my reasons, but succinctly: podcasts need to be transcribed to be searched by text-search engines.
In the meantime, tools for human transcription can make life slightly easier. I read a Boing Boing review of a typical solid-state recorder, which led me to the existence of Listen & Type, a Mac transcription application so simple it barely seems worth paying for unless you need it, in which case it's worth far more than its $20 asking price. And it's an app whose shareware demo is a nearly mandatory sales pitch: I didn't understand the app at all until I tried using it to transcribe a short, hard-to-understand video.
Fortuitously enough, that video is hosted on YouTube, which means that by transcribing it I can take advantage of Google's new semi-auto-captioning feature: I provide the transcript, and Google generates appropriate timecodes and publishes the result.
It took me a fair amount of time to write the transcript out, but Listen & Type definitely sped up the process a ton. The fact that it lets you run back and forth through your media file with keyboard shortcuts while keeping your text editor as your foreground app is, well, that's the whole of the app, but that's exactly what you need. The learning curve took about 5 minutes.
You'll notice that the captions in this video have name tags, among other inconsistencies. The time-sync is also not perfect. But they're good enough. For comparison, I'll attach (see below) both my original transcription and the downloaded Google version, with synthetically-generated timecodes. But here's an excerpt:
Original
AL: It's Al! And how are you?
R: Not too bad. So what category are ya, Al
AL: Category? Oh I don't ride...I just came across this group of guys and I knocked one of them off and stole his jersey?
R: So state your name and category, and who you think should win the crash prize this year.
Googleized with timecodes
0:00:41.969,0:00:44.219
AL: It's Al! And how are you?
0:00:44.219,0:00:47.059
R: Not too bad. So what category are ya, Al
0:00:47.059,0:00:51.000
AL: Category? Oh I don't ride...I just came across this group of guys and I knocked one
0:00:51.000,0:00:52.870
of them off and stole his jersey?
0:00:52.870,0:00:59.870
R: So state your name and category, and who you think should win the crash prize this
0:01:01.249,0:01:01.899
year.
I'm pretty keen on this. Refining the Google timecodes would be easy enough: move "year" back onto the previous line and tweak a few other timecodes by a second or two. Even with that cleanup, it's still much easier than starting from scratch.
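For anyone who would rather script that cleanup than do it by hand, here is a rough Python sketch of folding a stray fragment like "year." back into the caption before it. It assumes the downloaded file looks like the excerpt above: a line holding a start and end timecode separated by a comma, followed by one or more lines of caption text. The function names (parse_cues, merge_into_previous, format_cues) are my own inventions for illustration, not part of any Google tool, and the blank line written between captions on output is an assumption about what the caption file expects.

```python
import re

# Matches a timecode line such as "0:00:41.969,0:00:44.219".
TIMECODE = re.compile(r"^(\d+:\d{2}:\d{2}\.\d{3}),(\d+:\d{2}:\d{2}\.\d{3})$")


def parse_cues(text):
    """Split caption text into [start, end, caption] lists."""
    cues = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank separator lines, if any
        match = TIMECODE.match(line)
        if match:
            cues.append([match.group(1), match.group(2), ""])
        elif cues:
            # Caption text belongs to the most recent timecode line.
            cues[-1][2] = (cues[-1][2] + " " + line).strip()
    return cues


def merge_into_previous(cues, index):
    """Fold cue `index` into the one before it, keeping the later end time."""
    if index < 1 or index >= len(cues):
        raise ValueError("index must point at a cue that has a predecessor")
    prev, cur = cues[index - 1], cues[index]
    merged = [prev[0], cur[1], prev[2] + " " + cur[2]]
    return cues[:index - 1] + [merged] + cues[index + 1:]


def format_cues(cues):
    """Write cues back out, one blank line between them (assumed layout)."""
    return "\n\n".join(f"{start},{end}\n{caption}" for start, end, caption in cues)


# The last two captions from the excerpt above.
sample = """0:00:52.870,0:00:59.870
R: So state your name and category, and who you think should win the crash prize this
0:01:01.249,0:01:01.899
year."""

cues = merge_into_previous(parse_cues(sample), 1)
print(format_cues(cues))
```

The merged caption keeps the first caption's start time and the fragment's end time, so the rejoined sentence stays on screen for the whole span. Nudging individual timecodes by a second or two would still be a manual job.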