You can transcribe an audio file automatically with Python.

If you have an audio file with spoken words, the program will output a transcription of that audio file completely automatically.

This example uses English as the input language for the audio file, but any language can be used as long as the speech recognition engine supports it.

Example

Start off by creating an audio file with some speech. This can be any audio file with English words. Save the file as transcript.mp3.

If you are unsure where to get a spoken-word audio file, you can use Bluemix to generate one.

Install prerequisites

To run the app you need several things installed:

Python 3

the module pydub

the program ffmpeg

the module SpeechRecognition

You can install the Python modules with pip; the packages are named pydub and SpeechRecognition. ffmpeg can be installed with your package manager (apt-get, emerge, yum, pacman).
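Before running the example it can help to confirm that everything in the list above is actually available. The following is a minimal sketch using only the standard library; the function name missing_prerequisites is made up for this illustration, and it assumes ffmpeg is expected on the PATH.

```python
import importlib.util
import shutil


def missing_prerequisites():
    """Return the names of required components that cannot be found."""
    missing = []
    # Import names: pydub -> "pydub", SpeechRecognition -> "speech_recognition".
    for module in ("pydub", "speech_recognition"):
        if importlib.util.find_spec(module) is None:
            missing.append(module)
    # pydub calls the ffmpeg program to decode mp3 files, so look for it on the PATH.
    if shutil.which("ffmpeg") is None:
        missing.append("ffmpeg")
    return missing


print("Missing:", missing_prerequisites() or "nothing")
```

An empty list means the two modules can be imported and an ffmpeg executable was found.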

Transcribe

Audio transcription works in a few steps:

mp3 to wav conversion, loading the audio file, and feeding the audio to a speech recognition system.
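The first of these steps, the mp3 to wav conversion, can be wrapped in a small helper. This is only a sketch: the function names wav_path_for and mp3_to_wav are invented here, and it assumes the wav file should be written next to the mp3 file under the same name.

```python
from pathlib import Path


def wav_path_for(audio_path):
    """Return the path of the .wav file that sits next to audio_path."""
    return Path(audio_path).with_suffix(".wav")


def mp3_to_wav(mp3_path):
    """Convert an mp3 file to wav with pydub and return the new path."""
    # Imported here so wav_path_for can be used without pydub installed.
    from pydub import AudioSegment

    target = wav_path_for(mp3_path)
    sound = AudioSegment.from_mp3(str(mp3_path))  # decoding relies on ffmpeg
    sound.export(str(target), format="wav")
    return target


# Example call (needs pydub, ffmpeg and an existing transcript.mp3):
# mp3_to_wav("transcript.mp3")
```

Calling mp3_to_wav("transcript.mp3") would write transcript.wav into the same directory.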

Copy the program below and save it as transcribe.py

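import speech_recognition as sr
from pydub import AudioSegment

# Convert the mp3 file to wav, a format that sr.AudioFile can read
sound = AudioSegment.from_mp3("transcript.mp3")
sound.export("transcript.wav", format="wav")

# The converted file is the input for the recognizer
AUDIO_FILE = "transcript.wav"

# Load the audio file and feed it to the speech recognition engine
r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:
    audio = r.record(source)  # read the entire audio file

print("Transcription: " + r.recognize_google(audio))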
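The program above assumes recognition always succeeds. In practice recognize_google can raise sr.UnknownValueError when no speech can be made out, and sr.RequestError when the web service cannot be reached. A hedged sketch of how these cases might be handled follows; the function name transcribe_safely and the choice to return None on failure are assumptions for illustration, not part of the original program.

```python
def transcribe_safely(wav_path):
    """Return the transcript of wav_path, or None if recognition fails."""
    # Imported inside the function so a missing module only matters when it is called.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the entire audio file
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # The engine could not make out any speech in the audio.
        print("Speech was unintelligible")
    except sr.RequestError as error:
        # The web API was unreachable or rejected the request.
        print("Request failed:", error)
    return None


# Example call (needs SpeechRecognition, network access and transcript.wav):
# print(transcribe_safely("transcript.wav"))
```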



Run the program with:

python3 transcribe.py



It will output the transcription of the original audio file.
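To transcribe a language other than English, recognize_google accepts a language keyword that defaults to "en-US". The sketch below shows one way to pass it through; the function name transcribe and the tag "fr-FR" in the example call are illustrative assumptions, and whether a given tag works depends on the recognition engine.

```python
def transcribe(wav_path, language="en-US"):
    """Transcribe wav_path, telling the engine which language to expect."""
    # Imported inside the function so a missing module only matters when it is called.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the entire audio file
    # language is a tag such as "en-US" or "fr-FR".
    return recognizer.recognize_google(audio, language=language)


# Example call for a French recording (hypothetical file name):
# print(transcribe("transcript.wav", language="fr-FR"))
```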
