Expedia released the first version of our Alexa Skill last December at the AWS re:Invent event.

Since then, a team of developers in the San Francisco and Bellevue offices has been hard at work adding features, refactoring, and modernizing the code to adopt Amazon's new JavaScript SDK.

The main challenge during the refactoring was: how can we make sure we are not breaking existing functionality? That is, how do we avoid introducing regressions in our code?

The answer, of course, was "we need some tests" (yes, we didn't have time to write tests for our first release; sorry, TDD).

For unit tests we knew what to do: since we develop in Node, we settled on the standard solution of Mocha plus assertion and stubbing libraries.

The problem was the integration tests (also called functional tests). Manual testing was not practical: a full pass of our test plan took anywhere between one and two hours of someone's time. Thus, we had to search for libraries to streamline our functional testing.

After some research, we found bst (Bespoken Tools), a set of tools to develop, test, and deploy Alexa Skills, to be the closest thing to a gold standard at the moment. We gave it a try, but found a couple of drawbacks for our use cases. First, it is too heavyweight for our taste, as we would probably not use many of its features. Second, it is not easy to follow BDD practices with it out of the box, which we wanted for our tests.

Given that we couldn’t find a functional test library to fit our needs, we decided to create our own: alexa-conversation (GitHub repo).

Design and architecture

We had the following goals in mind when writing the alexa-conversation library:

Follow a conversational question-and-answer model, like you would if you were testing the skill manually

Support BDD practices as much as possible, to make the tests accessible to non-technical stakeholders

Avoid reinventing the wheel

Make it easy to integrate with any CI/CD pipeline

Use an easy, self-explanatory syntax

The result was a lightweight library that uses Mocha as a test runner. It allows Alexa skill developers to write conversation-like functional tests by specifying intents (and slots) as inputs and executing assertions against their skill’s response. All of this happens without starting any server or proxy, and the tests can run in any Node environment.

Using the library

First, install it:

$> npm install --save-dev alexa-conversation

Also install Mocha if you haven’t yet:

$> npm install --save-dev mocha

Here is an example of how you can define integration tests for your Alexa Skill with this framework:
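The original embedded code sample is not shown here, so the following is a sketch based on the library's public API; the entry-point path, appId value, intent names, slot values, and expected responses are all placeholders for illustration.

```javascript
// test-conversation.js — a functional test sketch using alexa-conversation.
const conversation = require('alexa-conversation');
const app = require('../index.js'); // placeholder path to your skill's handler

const opts = {
  name: 'Example Conversation', // name of the generated Mocha test suite
  appId: 'your-app-id',         // should match your skill's application id
  app: app                      // the skill handler under test
};

conversation(opts)
  .userSays('LaunchIntent')
    .plainResponse
      .shouldContain('Welcome')
  .userSays('AnotherIntent', { slotName: 'slotValue' }) // intent plus slots
    .plainResponse
      .shouldEqual('Expected response text')
  .end(); // registers the conversation as Mocha tests
```

Each `userSays` step sends an intent (with optional slots) to the skill, and the chained assertions run against the response before the conversation moves on to the next turn.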

To use this library you need Mocha installed, either globally or locally. With a global installation of Mocha you can just run tests with:

$> mocha test-conversation.js

Or, if you prefer a local installation of Mocha, you can use npm’s package.json file to define a script that executes all your functional tests under a certain folder (./funtests in our case). npm makes the local version of Mocha available on the PATH, so you can just add this to your package.json:

"scripts": {
  "funtests": "mocha --recursive ./funtests"
}

And run it like this:

$> npm run funtests

Once the execution finishes, the process exits with status code 0 if all the tests passed, or with a non-zero status code if there were any errors, following the UNIX convention. This makes it easy to plug into any existing pipeline.

Drawbacks

I want to note that even when using this library, manual testing is still highly recommended (if not necessary) to guarantee the quality of your skill: the only way to test how Alexa matches the user’s input (voice or text) to your intents is through the Amazon Developer Console or on a real device.

Another drawback you might face is that, depending on how you build your outputSpeech, you might have a hard time getting the spaces between words right in your assertions. To mitigate this we introduced fixSpaces as an instantiation option on the conversation object, but be advised that it is far from perfect.
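The fixSpaces option is passed when the conversation is created; the exact normalization it applies may vary by version, and the other option values below are placeholders. A minimal sketch:

```javascript
// Sketch: enabling fixSpaces to tolerate whitespace quirks in outputSpeech.
const conversation = require('alexa-conversation');

conversation({
  name: 'Whitespace-tolerant conversation',
  app: require('../index.js'), // placeholder path to your skill handler
  appId: 'your-app-id',
  fixSpaces: true // normalize spacing before comparing responses
})
  .userSays('LaunchIntent')
    .plainResponse.shouldContain('Welcome')
  .end();
```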

Finally, if your output contains variable phrases (such as dates or times), hard-coded assertions may produce false negatives. As a workaround, we allow you to compare the output using regular expressions, defining wildcards for these variable parts.
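A sketch of the regex workaround, assuming a `shouldMatch`-style assertion in the library's API (check the version you install); the intent name and response text are hypothetical:

```javascript
// Sketch: matching variable output (here, a time of day) with a regex.
const conversation = require('alexa-conversation');

conversation({
  name: 'Variable output conversation',
  app: require('../index.js'), // placeholder path to your skill handler
  appId: 'your-app-id'
})
  .userSays('GetArrivalTimeIntent')
    .plainResponse
      // the \d{1,2}:\d{2} wildcard absorbs the variable time in the response
      .shouldMatch(/Your flight arrives at \d{1,2}:\d{2} (AM|PM)/)
  .end();
```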

Feedback and contributions

This is a very early version of the library. There is huge room for improvement, so we would love feedback and contributions from other Alexa Skill developers.

Please head to the issues section of the repo if you want to leave some ideas or comments.

This is a cross-post from the Expedia Engineering Blog