We are living in an age in which deep learning is transforming one field after another. Artificial intelligence is making its presence felt everywhere, be it medicine, industry, or media. Much of it comes down to machine learning, where it is possible to make a machine learn by itself. I wanted to do something interesting in this area, so I decided to build something with speech recognition.

Speech Recognition:

So what is speech recognition? It is the ability of a machine or program to identify words and phrases in spoken language and convert them to a machine-readable format. In other words, when you speak, the software recognises what you say and converts it to text. For this project I planned to use the Orange Pi Zero. Why the Orange Pi Zero? Because the Orange Pi Zero has an interface board with an on-board microphone and a 3.5mm audio jack. This makes things easier compared with other Pi boards, where you need extra hardware.

Hardware:

The following hardware is required for this project.

https://www.banggood.com/Orange-Pi-Zero-H2-Quad-Core-Open-source-512MB-Development-Board-p-1110210.html?p=W214159476515201703B

https://www.banggood.com/Orange-Pi-Zero-Expansion-Board-Interface-Board-Development-Board-p-1115982.html?p=W214159476515201703B

Getting started:

If you are looking for how to set up the Orange Pi, refer to the links below; skip them if you already know how.

Getting started with Orange Pi Zero

Introduction to Orange Pi Zero Interface board

How to flash new image for Orange Pi Zero board

We are going to use SpeechRecognition (https://pypi.python.org/pypi/SpeechRecognition/) as our speech recognition framework. It supports both offline and online speech recognition through the following engines.

CMU Sphinx (works offline)
Google Speech Recognition
Google Cloud Speech API
Wit.ai
Microsoft Bing Voice Recognition
Houndify API
IBM Speech to Text
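As a rough guide to how these engines show up in code, the sketch below maps each one to the Recognizer method that, as far as I know, the SpeechRecognition 3.x releases expose for it. The dictionary ENGINE_METHODS and the helper offline_engines are names made up for this illustration, and the method names may differ in other releases of the library, so check the version you install.

```python
# Illustrative mapping, not part of the library:
# engine name -> (assumed Recognizer method name, works offline?)
ENGINE_METHODS = {
    "CMU Sphinx": ("recognize_sphinx", True),
    "Google Speech Recognition": ("recognize_google", False),
    "Google Cloud Speech API": ("recognize_google_cloud", False),
    "Wit.ai": ("recognize_wit", False),
    "Microsoft Bing Voice Recognition": ("recognize_bing", False),
    "Houndify API": ("recognize_houndify", False),
    "IBM Speech to Text": ("recognize_ibm", False),
}


def offline_engines():
    """Return the names of the engines that do not need a network connection."""
    return [name for name, (_, offline) in ENGINE_METHODS.items() if offline]


print("Offline engines:", offline_engines())
```

Only CMU Sphinx is marked offline here, which is why it is a natural choice for a board that may not always have a network connection.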

Code:

We are going to use the CMU Sphinx and Microsoft Bing Voice Recognition engines. We will install the Python packages in a local path using virtualenv to keep the system Python undisturbed.

apt-get install python-pip
apt-get install virtualenv

If you want to know about virtualenv refer to this link.

virtualenv audiopy
source audiopy/bin/activate
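If you are not sure whether the environment is really active before you start installing packages, a small check like the one below can help. It is only a sketch: the function name in_virtualenv is my own, and it relies on sys.real_prefix (set by older virtualenv releases) and sys.base_prefix (available in Python 3.3 and later).

```python
import sys


def in_virtualenv():
    """Return True when the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix inside the environment.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv make sys.base_prefix differ from sys.prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("virtualenv active:", in_virtualenv())
```

Run it with the same python that pip will use; if it reports False, the activate step did not take effect in the current shell.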

pip --no-cache-dir install SpeechRecognition

The reason I am using --no-cache-dir is explained here.

apt-get install python-dev
apt-get install portaudio19-dev
pip install PyAudio

To access the microphone of the Orange Pi Zero, you need PyAudio.
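Once PyAudio is installed, it is worth checking which device index the on-board microphone ends up at, since the Microphone class takes that index as its argument. The sketch below uses Microphone.list_microphone_names() from SpeechRecognition; the wrapper function list_microphones is a hypothetical helper of mine, and it simply returns an empty list when the library, PyAudio, or the audio backend is not available.

```python
def list_microphones():
    """Return microphone names ordered by device index, or [] if they cannot be listed."""
    try:
        import speech_recognition as sr
        return list(sr.Microphone.list_microphone_names())
    except (ImportError, AttributeError, OSError):
        # ImportError: SpeechRecognition is not installed.
        # AttributeError: the library could not find PyAudio.
        # OSError: the audio backend failed to enumerate devices.
        return []


for index, name in enumerate(list_microphones()):
    print("device {0}: {1}".format(index, name))
```

The index printed next to the on-board microphone is the number to pass to sr.Microphone().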

apt-get install flac
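With everything installed, a quick pre-flight check can confirm that the pieces are visible from inside the virtualenv. This is a minimal sketch assuming Python 3.4 or later; check_dependencies is a name I made up, and it only looks for the speech_recognition and pyaudio modules and a flac executable on the PATH, without testing that they actually work.

```python
import importlib.util
import shutil


def check_dependencies():
    """Report whether each dependency installed above can be found."""
    return {
        # Python packages: look the modules up without importing them.
        "SpeechRecognition": importlib.util.find_spec("speech_recognition") is not None,
        "PyAudio": importlib.util.find_spec("pyaudio") is not None,
        # The flac command-line encoder installed with apt-get.
        "flac": shutil.which("flac") is not None,
    }


for name, found in check_dependencies().items():
    print("{0}: {1}".format(name, "ok" if found else "missing"))
```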

Testing SpeechRecognition:

python -m speech_recognition

#!/usr/bin/env python3
# NOTE: this example requires PyAudio because it uses the Microphone class

import speech_recognition as sr

r = sr.Recognizer()
#r.energy_threshold = 500
with sr.Microphone(0) as source:
    r.adjust_for_ambient_noise(source)
    print("Say something!")
    audio = r.listen(source)

print("Processing !")

# recognize speech using Microsoft Bing Voice Recognition
# Enter your BING API Key here
BING_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"  # Microsoft Bing Voice Recognition API keys 32-character lowercase hexadecimal strings
try:
    speech_str = r.recognize_bing(audio, key=BING_KEY)
    print("Microsoft Bing Voice Recognition thinks you said " + speech_str)
except sr.UnknownValueError:
    print("Microsoft Bing Voice Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Microsoft Bing Voice Recognition service; {0}".format(e))
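The comment in the script says Bing keys are 32-character lowercase hexadecimal strings, so a pasted key can be sanity-checked before any audio is sent. The sketch below does only that format check; the helper name looks_like_bing_key is hypothetical, and passing the check does not mean the service will accept the key.

```python
import re

# Format taken from the comment in the script: 32 lowercase hexadecimal characters.
_BING_KEY_PATTERN = re.compile(r"[0-9a-f]{32}")


def looks_like_bing_key(key):
    """Return True if key has the expected 32-character lowercase hex format."""
    return isinstance(key, str) and _BING_KEY_PATTERN.fullmatch(key) is not None


# The placeholder from the script fails the check, which is a useful reminder
# that it still needs to be replaced with a real key.
print(looks_like_bing_key("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"))
```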