Posted by Aza Tulepbergenov, Developer Programs Engineer

Testing is an important part of any development process. Without testing, you risk releasing code that results in a frustrating user experience.

Actions on Google are no different. It’s crucial to test all the pieces of your Action to make sure your users will have success and want to come back to your Action.

There are three main components that need to be tested:

Natural language understanding (NLU), which is how we know what the user wants, or their intent

Intent handling logic, such as your webhook code that implements response building

Business logic, such as talking to a database or making external API calls
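As a rough illustration of why this split helps, the sketch below keeps the business logic and the intent handling logic in separate plain functions, so each can be exercised on its own. The names FACTS, pickFact and handleChooseFact and the sample data are hypothetical and are not taken from any real Action.

```javascript
// Business logic: a plain function with no dependency on any conversation
// library, so it can be unit tested directly.
// FACTS is hypothetical sample data used only for this sketch.
const FACTS = {
  history: ['Google was founded in 1998.'],
  headquarters: ['Google is headquartered in Mountain View, California.'],
};

function pickFact(category) {
  const facts = FACTS[category];
  return facts && facts.length > 0 ? facts[0] : null;
}

// Intent handling logic: turns the parameters extracted by the NLU layer
// into a description of the response. It calls the business logic but knows
// nothing about how the user's words were parsed.
function handleChooseFact(parameters) {
  const fact = pickFact(parameters.category);
  if (fact === null) {
    return {
      speech: 'Sorry, I have no facts about that.',
      expectUserResponse: false,
    };
  }
  return {
    speech: `${fact} Would you like another one?`,
    expectUserResponse: true,
  };
}
```

With this shape, pickFact can be tested without any conversation at all, and handleChooseFact can be tested by passing in the parameters the NLU layer is expected to produce.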

Testing Actions on Google is a tricky problem because developing an Action spans multiple platforms.

Without test tools, I’ve often had to dig through logs just to discover that an intent name I used in Dialogflow didn’t match the name I specified in the webhook implementation. In one instance, when trying to figure out why my Action wasn’t working, I realized that the intent I had specified in Dialogflow (“play super fun cats game”) was different from the name I’d put in my webhook (“super cool cats game”). Another time, I just wanted to test a specific intent handler, but because there is no direct way to trigger a specific intent, I had to play through the whole conversation and manually observe the behavior in the Actions simulator.
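A name mismatch like the one above can be caught mechanically. The sketch below is a hypothetical helper, not a feature of Dialogflow or of the client library: it assumes you have already collected the intent names from your agent (for example from an exported agent) and from your webhook code, and it reports names that appear on only one side.

```javascript
// Returns the intent names that appear in only one of the two lists.
// Both lists are assumed to be gathered by you; this helper only compares.
function findIntentMismatches(agentIntents, webhookIntents) {
  const agent = new Set(agentIntents);
  const webhook = new Set(webhookIntents);
  return {
    missingInWebhook: agentIntents.filter((name) => !webhook.has(name)),
    missingInAgent: webhookIntents.filter((name) => !agent.has(name)),
  };
}

// Example using the two names from the anecdote above.
const result = findIntentMismatches(
  ['play super fun cats game', 'choose_fact'],
  ['super cool cats game', 'choose_fact']
);
console.log(result);
```

Not every agent intent needs a webhook handler (some are answered with static responses), so entries in missingInWebhook are candidates to review rather than certain bugs. An entry in missingInAgent is more suspicious, since that handler can never be reached.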

We’ve heard from developers that the testing experience could be better, so we at Developer Relations have been looking into how to improve it. We recently added some testing best practices to our documentation; they summarize our collective findings and are written to help you test your Actions.

The main insight we discovered is that it’s easier to think through your Action if you look at each of the programming layers separately: natural language understanding, intent handling and business logic. This provides a nice mental framework to partition your Action into separate Subjects Under Test (SUT). The diagram below illustrates the layered model.

Testing “Facts about Google”

At Google, we’ve used this mental framework for a few projects already, and in this post I’ll describe the thinking process I used when testing “Facts about Google”. Despite being an example without significant business logic, “Facts about Google” has non-trivial requirements for NLU: it needs to parse custom entities from user speech.

These custom entities play a crucial role in the Action’s behavior as the Action takes different conversational directions based on those entities. The webhook code has similar requirements and returns complex responses, which are important to how the Action behaves. Those requirements were our base testing requirements.

Dialogflow

First, I’ll look at how the testing requirements are implemented for Dialogflow. Because the Action uses Dialogflow to implement NLU, the tests can use the Dialogflow API to check those requirements against the Dialogflow agent. Consider the following snippet that uses Dialogflow’s detectIntent method:

test.serial('choose_fact', async function(t) {
  const resJson = await dialogflow.detectIntent(
    'Tell me about the history of Google');
  expect(resJson.queryResult).to.include.deep.keys('parameters');
  // check that Dialogflow extracted required entities from the query.
  expect(resJson.queryResult.parameters).to.deep.equal({
    'category': 'history',
    // put any other parameters you wish were extracted
  });
  expect(resJson.queryResult.intent.displayName).to.equal('choose_fact');
  t.pass();
});

This test case asserts that Dialogflow correctly matches a query to an intent and extracts the correct entities (here, entity “category” has value “history”).

Aside: If you’re curious or want additional context, you can refer to df-test.js for the full source code.
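The dialogflow.detectIntent helper used in the test is defined in the full source. For orientation only, here is a minimal sketch of what such a helper could look like. It assumes a sessions client with a detectIntent(request) method that resolves to an array whose first element is the response, which is how the official Node.js Dialogflow client behaves; the function names buildDetectIntentRequest and createDetectIntent are hypothetical, and the real helper may be wired differently.

```javascript
// Builds a detectIntent request for a plain-text query.
function buildDetectIntentRequest(sessionPath, query, languageCode = 'en-US') {
  return {
    session: sessionPath,
    queryInput: {
      text: {text: query, languageCode},
    },
  };
}

// Wraps a sessions client so a test can simply call detectIntent('query').
// `sessionsClient` is injected: in a real test it could be a SessionsClient
// from the official Node.js Dialogflow package (an assumption), and in a
// unit test of this helper it can be a small fake object.
function createDetectIntent(sessionsClient, sessionPath) {
  return async function detectIntent(query) {
    const [response] = await sessionsClient.detectIntent(
      buildDetectIntentRequest(sessionPath, query)
    );
    return response;
  };
}
```

Injecting the client keeps the helper itself testable without network access, while the NLU tests can still pass in a real client pointed at the agent.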

I found it useful to map each Dialogflow intent to a test handler (in the snippet above, this is test.serial), which includes assertions applicable for that intent. I recommend testing your NLU for the following:

Setting entities correctly

Setting contexts correctly

Matching difficult queries correctly

“Facts about Google” uses Dialogflow as an implementation of the NLU layer. However, if you’re an Actions SDK developer, you can apply our recommendations to the NLU implementation of your choice, provided it returns data in a similarly structured format.

Webhook

Your webhook plays an important role in controlling a good user experience: it is responsible for conversational responses your Action returns and controls the flow of the conversation.

“Facts about Google” returns complex responses that include suggestion chips, cards, and text responses. Google’s conversational design guidelines stress how important visual responses are for guiding the user through the conversation, so suggestion chips are an important piece of the Action.

Additionally, a common bug among Actions on Google developers comes from misusing the client library by mixing up conv.ask and conv.close.

The snippet below tests for both. In the code, expect(jsonRes.payload.google.richResponse.suggestions).to.have.deep.members checks that the suggestion chips are present, and expect(jsonRes.payload.google.expectUserResponse).to.be.true checks that your Action doesn’t close the mic in the middle of the conversation with the user.

test.serial('yes-history', async function(t) {
  const jsonRes = await getAppResponse('yes-history');
  expect(jsonRes.payload).to.have.deep.keys('google');
  expect(jsonRes.payload.google.expectUserResponse).to.be.true;
  expect(jsonRes.payload.google.richResponse.items).to.have.lengthOf(3);
  expect(jsonRes.payload.google.richResponse.suggestions).to.have
      .deep.members([
        {'title': 'Sure'}, {'title': 'No thanks'},
      ]);
  t.pass();
});

The most important piece of the snippet is the getAppResponse function, which sends a synthetic payload to an instance of your Dialogflow app and receives a response. This response is the main SUT. I encourage you to take a look at the detailed implementation in the official repo.
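To make the idea concrete without reproducing the repo’s code, here is a minimal sketch of such a helper. It builds a small request in the shape of a Dialogflow v2 webhook request and hands it to an injected handler function. The names buildWebhookRequest and createGetAppResponse, the placeholder IDs, and the handler signature are all assumptions made for illustration; an app built with the client library will likely need a fuller payload (for example, surface capabilities) than the bare one used here.

```javascript
// Builds a minimal synthetic webhook request for a given intent.
// All IDs and the query text are placeholders.
function buildWebhookRequest(intentName, parameters = {}) {
  return {
    responseId: 'test-response-id',
    session: 'projects/test-project/agent/sessions/test-session',
    queryResult: {
      queryText: 'synthetic query',
      parameters,
      intent: {displayName: intentName},
      languageCode: 'en-us',
    },
    originalDetectIntentRequest: {source: 'google', payload: {}},
  };
}

// Wraps a handler so tests can call getAppResponse('intent-name').
// `handler` is any function that takes the request body and returns
// (or resolves to) the JSON response body of the webhook.
function createGetAppResponse(handler) {
  return async function getAppResponse(intentName, parameters) {
    return handler(buildWebhookRequest(intentName, parameters));
  };
}

// Usage with a fake handler that keeps the mic open for every intent.
const getFakeResponse = createGetAppResponse(async (body) => ({
  payload: {
    google: {
      expectUserResponse: true,
      richResponse: {items: [], suggestions: []},
    },
  },
  handledIntent: body.queryResult.intent.displayName,
}));
```

Because the intent name is set directly in the synthetic request, a test can trigger one specific handler without playing through the conversation that normally leads to it.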

Integration

The tests I wrote for the NLU and intent handling layers give me some confidence that my Action works as expected, because each unit is tested. However, to boost my confidence even more, I decided to implement an integration test that checks how those two units work together.

A good way to come up with a test case is to look at the unit tests done for Dialogflow and webhook, and combine the scenarios. For example, the snippet below combines the test cases we did for Dialogflow and webhook:

test.serial('tell me about cats', async function(t) {
  const jsonRes = await dialogflow.detectIntent(
    'Tell me about cats'
  );
  const payload = jsonRes.queryResult.webhookPayload;
  expect(payload).to.have.deep.keys('google');
  expect(payload.google.expectUserResponse).to.be.true;
  expect(payload.google.richResponse.items)
      .to.have.lengthOf(3);
  expect(payload.google.richResponse.suggestions).to.have
      .deep.members([
        {'title': 'Sure'}, {'title': 'No thanks'},
      ]);
  t.pass();
});

To recap, we used the methodology described in the Testing Best Practices page of the Actions on Google documentation to provide test coverage for “Facts about Google”. One of the goals I had when writing this post was to give insight into my thinking process when coming up with test cases. I encourage you to apply a similar process in your development and provide robust coverage for your Actions.

Thanks for reading! To share your thoughts or questions, join us on Reddit at r/GoogleAssistantDev.

Follow @ActionsOnGoogle on Twitter for more of our team’s updates, and tweet using #AoGDevs to share what you’re working on. Can’t wait to see what you build!