How to set up SpeechRecognition on the Orange Pi Zero using Python

We are living in an age where deep learning is going through a transformation. Artificial intelligence is making its presence felt in every field, be it medicine, industry or media. It all comes down to machine learning, where it is possible to make a machine learn by itself. I wanted to do something interesting in this area, so I decided to build something with speech recognition.

Speech Recognition:

So what is speech recognition? It is the ability of a machine or program to identify words and phrases in spoken language and convert them to a machine-readable format. In other words, when you speak, the software recognises what you say and converts it to text. For this project I planned to use the Orange Pi Zero. Why the Orange Pi Zero? Because it has an interface board with an on-board microphone and a 3.5mm audio jack. This makes things easier than with other Pi boards, where you need extra hardware.


The hardware required for this project is as follows.

Getting started:

If you are looking for how to set up the Orange Pi, refer to the links below; skip them if you already know this.

Getting started with Orange Pi Zero

Introduction to Orange Pi Zero Interface board

How to flash new image for Orange Pi Zero board

We are going to use SpeechRecognition as our speech recognition framework. It works with both offline and online speech recognition, and supports the following engines.

  1. CMU Sphinx (works offline)
  2. Google Speech Recognition
  3. Google Cloud Speech API
  4. Wit.ai
  5. Microsoft Bing Voice Recognition
  6. Houndify API
  7. IBM Speech to Text


We are going to use the CMU Sphinx and Microsoft Bing Voice Recognition engines. We will install the Python packages in a local path using virtualenv, so that the system Python stays undisturbed.

apt-get install python-pip
apt-get install virtualenv

If you want to know more about virtualenv, refer to this link.

virtualenv audiopy
source audiopy/bin/activate
pip --no-cache-dir install SpeechRecognition

The reason I am using the --no-cache-dir option is explained here.
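Before installing more packages, it can help to confirm that the audiopy environment is really active, so that nothing ends up in the system Python by accident. The following is a minimal sketch of such a check; the helper name in_virtualenv is made up for illustration and is not part of virtualenv or SpeechRecognition.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases instead make sys.base_prefix differ from sys.prefix.
    base_prefix = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base_prefix != sys.prefix


print("virtualenv active: {0}".format(in_virtualenv()))
```

If this reports False after running source audiopy/bin/activate, the pip commands are probably still targeting the system interpreter.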


apt-get install python-dev
apt-get install portaudio19-dev
pip install PyAudio

PyAudio is needed to access the microphone of the Orange Pi Zero.
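The script later in this post opens the microphone as device index 0, which may not be the right device on every image. A rough sketch for finding the index is shown here. It relies on Microphone.list_microphone_names() from SpeechRecognition; the helper names and the "audiocodec" keyword in the usage note are assumptions for illustration, so use whatever name your board actually reports.

```python
def list_microphones():
    """Return the audio device names reported through PyAudio."""
    # Imported lazily so the helper below can be used without audio hardware.
    import speech_recognition as sr
    return sr.Microphone.list_microphone_names()


def find_device_index(names, keyword):
    """Return the index of the first device whose name contains keyword, or None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # Entries can be None when a device name could not be retrieved.
        if name is not None and keyword in name.lower():
            return index
    return None
```

On the board you would call find_device_index(list_microphones(), "audiocodec") and pass the result to sr.Microphone(device_index=...).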

apt-get install flac

Testing SpeechRecognition

python -m speech_recognition

#!/usr/bin/env python3

# NOTE: this example requires PyAudio because it uses the Microphone class

import speech_recognition as sr

# Enter your Bing API key here.
# Microsoft Bing Voice Recognition API keys are 32-character lowercase hexadecimal strings.
BING_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

r = sr.Recognizer()
#r.energy_threshold = 500

# record a single phrase from the microphone with device index 0
with sr.Microphone(device_index=0) as source:
  print("Say something!")
  audio = r.listen(source)
  print("Processing !")

# recognize speech using Microsoft Bing Voice Recognition
try:
  speech_str = r.recognize_bing(audio, key=BING_KEY)
  print("Microsoft Bing Voice Recognition thinks you said " + speech_str)
except sr.UnknownValueError:
  print("Microsoft Bing Voice Recognition could not understand audio")
except sr.RequestError as e:
  print("Could not request results from Microsoft Bing Voice Recognition service; {0}".format(e))
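The script above only exercises the Bing engine, although CMU Sphinx was also named as an engine for this project. One possible way to combine the two is to try Bing first and fall back to offline Sphinx when the request fails. The sketch below assumes the pocketsphinx package is installed (pip install pocketsphinx, not covered in this post) and that your SpeechRecognition version provides recognize_bing as used above. The function names are hypothetical, and the broad except is a simplification; real code would catch sr.UnknownValueError and sr.RequestError specifically.

```python
def first_transcript(engines):
    """Try each (name, recognize) pair in order; return (name, text) from the first success."""
    errors = []
    for name, recognize in engines:
        try:
            return name, recognize()
        except Exception as exc:  # simplification: hides unrelated bugs as well
            errors.append("{0}: {1!r}".format(name, exc))
    raise RuntimeError("all engines failed: " + "; ".join(errors))


def recognize_with_fallback(recognizer, audio, bing_key):
    """Try the online Bing engine first, then offline CMU Sphinx."""
    engines = [
        ("bing", lambda: recognizer.recognize_bing(audio, key=bing_key)),
        ("sphinx", lambda: recognizer.recognize_sphinx(audio)),
    ]
    return first_transcript(engines)
```

With the variables from the script above, engine, speech_str = recognize_with_fallback(r, audio, BING_KEY) returns both the engine that answered and the recognized text.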

7 thoughts on “How to setup SpeechRecognition in Orange Pi Zero using python”

  • September 11, 2017 at 7:36 pm

    I’m a newbie and have a problem with this example. It doesn’t work with DietPi nor with – experimental (mainline kernel). The stable kernel doesn’t boot :(. I’m looking for any working img….

    • September 12, 2017 at 6:06 pm

      Sounds like a hardware defect

      • September 13, 2017 at 12:43 pm

        With the stable kernel it works just fine… exactly like in this tutorial. The problem is with DietPi and the experimental kernel.

        • September 17, 2017 at 2:13 pm

          Ok, I am yet to try with Diet Pi and experimental kernel. I will update the post once I try them.

  • March 30, 2018 at 10:11 am

    how can i define a particular task for a voice command??? please help

    • March 31, 2018 at 3:02 pm

      from pyA20.gpio import port ;
      Is the above line there?

    • March 31, 2018 at 3:06 pm

      In this line
      speech_str = r.recognize_bing(audio, key=BING_KEY)

      You get the text of what you said in the speech_str variable. You can use a simple Python string comparison to detect your commands.
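As a rough illustration of that reply, the sketch below maps spoken phrases to Python functions using plain substring matching. The phrases and the led_on / led_off actions are invented placeholders, not something from the original post; on a real board they would call whatever GPIO or application code you want to trigger.

```python
def led_on():
    # Placeholder action; replace with real hardware or application code.
    return "LED on"


def led_off():
    # Placeholder action; replace with real hardware or application code.
    return "LED off"


# Hypothetical phrases mapped to the functions that should run.
COMMANDS = {
    "light on": led_on,
    "light off": led_off,
}


def handle_command(speech_str):
    """Run the action for the first known phrase found in speech_str, else return None."""
    text = speech_str.strip().lower()
    for phrase, action in COMMANDS.items():
        if phrase in text:
            return action()
    return None
```

Calling handle_command(speech_str) right after the recognize_bing line would run the matching action. Substring matching is deliberately naive: it ignores word boundaries and misrecognized words, so it only suits a small, distinct set of commands.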

