After recording last week’s interview, I was left with a 36-minute MP3 and a profound feeling of dread. You see, I hate transcribing audio. I used to transcribe interviews in high school, and it was always tedious, taking upwards of eight times the length of the clip itself.

Bracing for a good four or five hours of rewinding and writing and rewinding, I remembered that this is The Future! So I tossed the job over to the global anonymous workforce at Amazon Mechanical Turk instead.

The result: my 36-minute recording was transcribed while I slept, in less than three hours, for a grand total of $15.40.

This is a fraction of the cost and time of any other transcription service online, including the Turk-driven Casting Words, although you potentially sacrifice some quality. In my experience, though, there were virtually no errors.

Here’s how to do it yourself, with no programming knowledge required. The instructions below are verbose, but using my template, it shouldn’t take you more than five minutes of setup per job.



Step 1: Prepare your audio.

First, I split my 36-minute audio file into seven MP3s of roughly five minutes each. Why? Mechanical Turk workers all work in parallel, so the more discrete tasks, the faster the job gets done. This also diminishes the risk of one bad worker ruining your whole job. (Though you’re always allowed to reject bad submissions, and you’ll never have to pay for those.)

I used the open-source Audacity to split the files, but you could just as easily use any audio utility or editing software. Optionally, you might want to make each clip overlap by a few seconds, so you’ll be able to easily recognize where each segment of the transcript starts and stops.
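If you'd rather work out the cut points before opening your audio editor, here's a minimal Python sketch. It only does the arithmetic and doesn't touch the audio itself. The function name, the five-second overlap, and the choice to split into equal-length pieces are all my own assumptions for illustration.

```python
def clip_boundaries(total_seconds, clips, overlap_seconds=5):
    """Return a list of (start, end) times in seconds, one pair per clip.

    The recording is divided into `clips` pieces of near-equal length.
    Every clip after the first starts `overlap_seconds` early, so that
    neighbouring clips share a short stretch of audio.
    """
    if clips < 1:
        raise ValueError("clips must be at least 1")
    if overlap_seconds < 0:
        raise ValueError("overlap_seconds cannot be negative")
    # Cut points without overlap, rounded to whole seconds.
    cuts = [round(i * total_seconds / clips) for i in range(clips + 1)]
    return [
        (max(0, cuts[i] - overlap_seconds), cuts[i + 1])
        for i in range(clips)
    ]


def as_timestamp(seconds):
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # A 36-minute recording split into seven clips.
    for n, (start, end) in enumerate(clip_boundaries(36 * 60, 7), start=1):
        print(f"clip {n}: {as_timestamp(start)} to {as_timestamp(end)}")
```

Use the printed start and end times as the selection boundaries when you export each clip.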

Name the files sequentially. In my case, they were phonecall_1.mp3 through phonecall_7.mp3. When you’re done, upload the files somewhere they can be downloaded publicly. You’ll need the full URLs later.

Step 2: Design your HIT template.

Mechanical Turk jobs are called HITs — short for the dystopic-sounding “Human Intelligence Tasks.” After you’ve signed up as a new Requester on Mechanical Turk, you can design a new template from the homepage using one of several samples. Choose the Default Template.

On the Properties screen, we’ll write a short description of the task, define how many people we want to work on it, and set how much we’re willing to pay them.

For a five-minute MP3, I think allotting two hours per assignment is ample time, and I set the entire HIT to expire in 12 hours because I was in a hurry. As for pay rate, you’ll need to determine the “Reward per Assignment” based on the difficulty of the task and what you think is fair. I chose $2.00 per five-minute MP3, or $0.40/minute. Depending on the difficulty, you might want to try going higher or lower.

I only wanted one worker to attempt each clip, so I changed the “number of assignments per HIT” to 1. (If you want redundant transcripts for each clip, change this to 2 or 3… But be aware your costs will double or triple!)
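To see how the reward and the number of assignments add up, here's a rough Python sketch of the arithmetic. One loud assumption: the 10% commission on top of the rewards is something I've inferred, because it's what reconciles seven clips at $2.00 with the $15.40 total. Check Amazon's current pricing before relying on it. The function name is made up for this example.

```python
from decimal import Decimal


def batch_cost(clips, reward, assignments=1, fee_rate=Decimal("0.10")):
    """Estimate the total cost of a batch, rounded to the cent.

    clips       -- number of HITs (one per audio clip)
    reward      -- reward per assignment, as a Decimal
    assignments -- how many workers transcribe each clip
    fee_rate    -- commission added on top of rewards (ASSUMED to be 10%)
    """
    rewards = Decimal(clips) * Decimal(assignments) * reward
    return (rewards * (1 + fee_rate)).quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(batch_cost(7, Decimal("2.00")))                 # one worker per clip
    print(batch_cost(7, Decimal("2.00"), assignments=3))  # three workers per clip
```

Asking for three transcripts of each clip triples the cost exactly, since the fee scales with the rewards.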

After entering all this information, here’s what my finished Properties screen looked like:

On the Design Layout screen, you design the template that gets displayed to each worker, using basic variables that will be substituted later. For this template, we need only one variable, named “${url}.” You can call it anything you like.

The basics you’ll need are a title, some simple rules, the link to the audio file with a substitution variable, and a text form for the worker to type the transcription into. If you’d like to use my template HTML, here it is. (Make sure you change the path to your own audio files!)

Two things to notice in my example. First, the “${url}” variables will be substituted with values in the “url” column of the spreadsheet we’ll create in the next step. Second, any form element you create will end up in your final output from Mechanical Turk, so don’t worry about the naming. I called mine simply “transcription.” Here’s what the relevant part looks like in the final template:

Please transcribe this five-minute MP3: <a href="${url}">${url}</a>
Enter your transcription below:
<textarea name="transcription" cols="80" rows="30"></textarea>
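If you want to check the substitution before uploading anything, you can imitate it locally. This is only a sketch of the idea using Python's own string.Template, which happens to share the ${name} syntax. It is not how Mechanical Turk implements substitution, and the render_hit helper is a name I made up.

```python
import html
from string import Template

# The relevant part of the HIT template, with one ${url} placeholder
# used twice: once as the link target and once as the link text.
HIT_TEMPLATE = Template(
    'Please transcribe this five-minute MP3: <a href="${url}">${url}</a>\n'
    "Enter your transcription below:\n"
    '<textarea name="transcription" cols="80" rows="30"></textarea>\n'
)


def render_hit(url):
    """Fill the placeholder with one URL, escaped for use inside HTML."""
    return HIT_TEMPLATE.substitute(url=html.escape(url, quote=True))


if __name__ == "__main__":
    print(render_hit("http://waxy.org/temp/phonecall_1.mp3"))
```

A misspelled placeholder makes substitute() raise an error, which is a quick way to catch a template whose variable doesn't match your spreadsheet column.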

For the worker’s convenience, I also added an embedded Flash player for the MP3, but this is entirely optional. When you’re done designing your template, it should look something like this:

On the next screen, make sure it looks the way you like, and click “Preview and Finish” to save the HIT template.

Step 3: Upload the data for your HITs.

Once we’re done designing our template, we can select it to create a new HIT batch. We’ll be creating a simple comma-separated file (.CSV) filled with the data that will be substituted into our template.

On the Publish tab, select the template you just created by clicking the “Select” button:

Now, Amazon generates a sample CSV for you to put the URLs to your MP3s in. Click the link to “Download a sample input file” and open the downloaded CSV in a text editor. If you’ve done everything right, it should look like this:

url
Hit1_url_data
Hit2_url_data
Hit3_url_data

Replace the “Hit1_url_data” lines with the full URLs to your own MP3 files. For me, this looked something like:

url
http://waxy.org/temp/phonecall_1.mp3
http://waxy.org/temp/phonecall_2.mp3
http://waxy.org/temp/phonecall_3.mp3

And so on. Save the CSV file, and upload it to Amazon. When you’re done, your uploaded file should appear, with the number of input lines.
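Editing the file by hand is fine for seven clips, but if you have dozens, a few lines of Python can write it for you. This is a minimal sketch that assumes your files follow a prefix_N.mp3 naming pattern under a single base URL. The function name and the output filename are my own inventions.

```python
import csv
import io


def build_input_csv(base_url, prefix, count):
    """Return the text of an input file: a 'url' header, then one URL per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url"])  # must match the variable name in the template
    for n in range(1, count + 1):
        writer.writerow([f"{base_url.rstrip('/')}/{prefix}_{n}.mp3"])
    return buffer.getvalue()


if __name__ == "__main__":
    text = build_input_csv("http://waxy.org/temp", "phonecall", 7)
    # Hypothetical output filename; name it whatever you like.
    with open("hit_input.csv", "w", newline="") as handle:
        handle.write(text)
    print(text)
```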

Step 4: Publish your HITs.

Select your uploaded input file, and preview the finished batch of HITs. You’ll be able to page through each HIT, seeing exactly what workers will see. Use this opportunity to test that your audio files can be downloaded and heard properly. If it all looks good, click “Next” to confirm and publish your batch. This is what the final screen looks like:

If you don’t have any money in your Amazon Payments account, you’ll be prompted to fund it with a credit card. After you’ve paid, click “Publish HITs” and you’re done!

Your HITs will publish out to the Mechanical Turk workers, who will find and work on your task. Depending on the length and number of your MP3s, expect some work back within an hour.

As they’re working, you can browse and approve the results. The final output is an exported CSV, a spreadsheet of all the finished work that can be opened in Excel for your review.
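Once you have the results file, you can stitch the pieces back into one transcript with a short script instead of copying and pasting from Excel. A sketch, with assumptions: I'm assuming the export has an “Input.url” column holding the clip URL and an “Answer.transcription” column holding the text from the form field, so check the header row of your own file and adjust the names. The sample rows in the example are invented.

```python
import csv
import io
import re


def stitch_transcript(results_csv_text,
                      url_column="Input.url",
                      answer_column="Answer.transcription"):
    """Join the transcriptions in clip order, separated by blank lines.

    The column names are ASSUMED defaults; pass your own if they differ.
    Clip order is taken from the number before ".mp3" in each URL.
    """
    rows = list(csv.DictReader(io.StringIO(results_csv_text)))

    def clip_number(row):
        match = re.search(r"_(\d+)\.mp3$", row[url_column])
        return int(match.group(1)) if match else 0

    rows.sort(key=clip_number)
    return "\n\n".join(row[answer_column].strip() for row in rows)


if __name__ == "__main__":
    # Invented sample data: results usually arrive out of order.
    sample = (
        "Input.url,Answer.transcription\n"
        "http://example.com/clip_2.mp3,second part\n"
        "http://example.com/clip_1.mp3,first part\n"
    )
    print(stitch_transcript(sample))
```

If you asked for more than one assignment per clip, each clip will appear several times in the output, and you'll still need to pick the best version by hand.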

Conclusion

You’d be insane not to use this for your own transcription projects. Absolutely nothing else comes close in price and speed.

One thought: I suspect it’d get even faster if you split clips into more pieces. I’d bet that splitting into one-minute segments would reduce the time by at least half. I’ll bet you’d be able to command lower rates with smaller MP3s too, since the time commitment would be lower, driving more competition for the tasks. If anyone experiments along these lines, please let me know!