Last week I was invited to attend Amazon Voxcon, Amazon’s conference about its voice technologies. It was the first of its kind, and a big reason for the timing was the launch of Alexa in Hindi.

The day-long conference had keynotes and workshops on design, tech and hardware for voice interfaces running in parallel.

The equivalent of a smartphone app is called a skill in the Alexa ecosystem. By naming it this way, Amazon has tried to make an inanimate technology sound more human-like. Making Alexa sound more and more human-like was also a big focus of all the workshops.

Of the three workshop tracks, I found the design workshop by Paul Cutsinger the most interesting. At Dost, I have been working on our voice product for the last year, and I was looking forward to seeing how Amazon designs its voice tech. The workshop introduced us to various design patterns for voice interfaces and explained why a flowchart-style design doesn’t work in voice land. I found this particularly interesting because we have always been taught to build software from a flowchart, but in a voice app the flowchart fails. In the voice ecosystem, the top level should have all the information the user wants. For example, let’s say we are making a banking app and the user wants to see their bank account number. In a smartphone app you could have an interface where they go from menu > account details > account number. This puts the onus on the user to learn the interface. In a voice app, on the other hand, the user can just walk in and say “what is my account number”. For this to work, all the information has to be available at the top level. This makes sense: on the web and on smartphones we have always been limited by screen size, so it is not possible to surface all the information at the top level. There is no such limit in the voice ecosystem. In voice we use a different kind of design method called a storyboard. In a storyboard you imagine the conversations users would have with your app, write down all those paths, and start building the app from there.
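To make the contrast concrete, here is a minimal Python sketch of the banking example. It is not Alexa Skills Kit code and not something shown at the workshop; the account data, the sample phrases and the function names are all made up for illustration, and the phrase matching is a plain string comparison rather than real speech understanding.

```python
# Hypothetical banking data; every name and value here is an illustrative assumption.
ACCOUNT = {"account_number": "XXXX-1234", "balance": "Rs. 5,000"}

# Screen-style design: information sits at the leaves of a menu tree,
# so the user has to learn the path that leads to it.
MENU = {
    "menu": {
        "account details": {
            "account number": ACCOUNT["account_number"],
            "balance": ACCOUNT["balance"],
        }
    }
}


def navigate(path):
    """Walk the menu tree one step at a time, the way a user taps through screens."""
    node = MENU
    for step in path:
        node = node[step]
    return node


# Voice-style design: every request is answerable from the top level.
# Each storyboarded phrase maps straight to an intent, with no navigation in between.
INTENTS = {
    "account_number": ["what is my account number", "tell me my account number"],
    "balance": ["what is my balance", "how much money do i have"],
}

FALLBACK = "Sorry, I did not get that. You can ask for your account number or balance."


def answer(utterance):
    """Match the whole utterance against the sample phrases of every intent."""
    text = utterance.lower().strip(" ?.!")
    for intent, phrases in INTENTS.items():
        if text in phrases:
            return ACCOUNT[intent]
    return FALLBACK


if __name__ == "__main__":
    print(navigate(["menu", "account details", "account number"]))
    print(answer("What is my account number?"))
```

In the menu version the user supplies the path; in the voice version the designer supplies the phrases, which is roughly what the storyboard exercise produces.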

One of the activities we did was to design an interface using a storyboard.

They also talked about various technologies that Amazon has built to make the voice sound more conversational and colloquial to the user.

Amazon had also invited a few organisations that had built products on top of its technology. One of them was mybox, which has used Alexa to convert low-end set-top boxes into smart set-top boxes. This was particularly interesting because it makes voice tech much more accessible to the low-end market as well.

Voice is the most natural interface and doesn’t require one to learn anything. It is supposed to be communal and conversational. It can really help break the barrier to accessing technology, especially in India where literacy levels aren’t so high.

A decade ago smartphones entered the market with the same promises, and with the introduction of cheap internet it seemed that they were finally being fulfilled. But they weren’t, at least not fully. The reach of smartphones, and subsequently of the internet, has increased significantly in the last couple of years, but not everyone has benefitted equally. Most low-income homes have only one smartphone, which is controlled by the men[1]. Even where women have access to a smartphone, they struggle to operate it. The current smartphone interface has a steep learning curve for someone who hasn’t used any technology before. All this has led to a 33% gender gap in mobile phone ownership in India[2].

Voice tech has promised to give people easy access to quality content. The skeptic in me wants this to be true, and hopefully it will soon reach the people at the bottom of the pyramid too. Let’s hope that voice is the prince that was promised.

[1]: https://www.gsma.com/mobilefordevelopment/wp-content/uploads/2018/04/GSMA_The_Mobile_Gender_Gap_Report_2018_32pp_WEBv7.pdf

[2]: https://www.indiaspend.com/wide-gender-gap-in-mobile-phone-access-is-hurting-indias-women/