Voice commands are the future. Science fiction has had them for decades, and yet we still have to reach for the remote to turn on the TV or set an alarm. Our mission is to change this. Adding a voice interface to an app or device should be simple.

Turning speech into actionable data

Today, we’re very excited to announce our new “Speech to JSON” API, four months after the launch of the “Text to JSON” API.

From now on, your app, device or even your website can stream audio to our server, and get actionable data in return.

See it in action for home automation:

How does it work?

Behind the scenes, Wit combines various state-of-the-art Natural Language Processing techniques and several speech recognition engines to achieve low latency and high robustness to both surrounding noise and paraphrastic variations (there are millions of ways to say the same thing).

Fortunately, you don’t need to care about all this machinery. We focus all our energy on creating the simplest developer experience possible. You can be up and running in a few minutes using our website. Wit will adapt to your domain over time, from ice-cream distribution to space missions. Wit makes no assumptions and remains 100% configurable.

It will take you 5 minutes to build your own Wit configuration:

Consuming the API

Then, calling the API is simple. We provide client-side SDKs that handle audio recording and streaming for iOS, Android or even a simple webpage like this one. You can also use the HTTP interface to stream live audio or post a sound file:

Let’s take this sound (recorded from a celebrity in the valley — do you know who?):

Submit it to the Wit API with a POST request:

curl -XPOST 'https://api.wit.ai/speech' \
  -i -L \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: audio/wav" \
  --data-binary "@sample.wav"
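If you would rather make the call from code than from a terminal, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the endpoint and headers are copied from the curl command, while the helper names `build_speech_request` and `send_speech` are ours and purely illustrative.

```python
import urllib.request

# Endpoint taken from the curl example; everything else here is illustrative.
WIT_SPEECH_URL = "https://api.wit.ai/speech"


def build_speech_request(token: str, wav_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying raw WAV audio."""
    return urllib.request.Request(
        WIT_SPEECH_URL,
        data=wav_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "audio/wav",
        },
    )


def send_speech(token: str, wav_path: str) -> bytes:
    """Read a WAV file, post it, and return the raw response body."""
    with open(wav_path, "rb") as audio_file:
        request = build_speech_request(token, audio_file.read())
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Calling `send_speech(token, "sample.wav")` would post the file; building the request in a separate function keeps the headers easy to inspect without touching the network.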

You’ll get this in return:

{
  "msg_id" : "6a84eae3-969c-41ad-94d9-85076fbbdc99",
  "msg_body" : "set the kitchen table on fire",
  "outcome" : {
    "intent" : "set_fire",
    "entities" : {
      "object" : {
        "value" : "kitchen table",
        "body" : "kitchen table"
      }
    },
    "confidence" : 0.997
  }
}
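Turning that response into something your app can act on takes only a few lines. The sketch below assumes the response has exactly the shape shown above; the function name `parse_outcome` and the 0.8 confidence cutoff are our own illustrative choices, not part of the API.

```python
import json

# The sample response shown above, reproduced as a string.
RESPONSE = """
{
  "msg_id": "6a84eae3-969c-41ad-94d9-85076fbbdc99",
  "msg_body": "set the kitchen table on fire",
  "outcome": {
    "intent": "set_fire",
    "entities": {
      "object": {"value": "kitchen table", "body": "kitchen table"}
    },
    "confidence": 0.997
  }
}
"""


def parse_outcome(raw: str, min_confidence: float = 0.8):
    """Return (intent, {entity name: value}), or None if confidence is too low.

    The default threshold is an arbitrary example value; tune it for your app.
    """
    outcome = json.loads(raw)["outcome"]
    if outcome["confidence"] < min_confidence:
        return None
    entities = {name: entity["value"] for name, entity in outcome["entities"].items()}
    return outcome["intent"], entities


print(parse_outcome(RESPONSE))  # ('set_fire', {'object': 'kitchen table'})
```

Returning `None` below the threshold lets the caller ask the user to repeat the command instead of acting on a shaky guess.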

Interested in building your own voice interface? Sign up here!

Team Wit

@WitNL