I was interested in trying out the Interactive Adventure Game Tool that Amazon released for building text adventure style games on Alexa. The skill that I ended up building with the tool is called Kitten Cafe, and you can have a play with that by saying “Alexa, open kitten cafe”, or by checking out the link on the amazon site:

https://www.amazon.com/Rogan-Josh-Kitten-Cafe/dp/B078HTZ66X/

The tool itself is on github here:

https://github.com/alexa/interactive-adventure-game-tool

The main thing I wanted to try was building a skill where the user is relatively free to say anything in order to explore the world of the game. In other skills, most notably The Magic Door, the user is presented with two or three options and is prompted with the specific phrase that will pick each one. I was keen to see how it would feel to have an audio adventure where you are told “You are in a room, there is a loaf of bread here” and to see how people would then puzzle out what they were supposed to do to progress. I’ll try to say more about that once I’ve had a chance to see how more people use the skill, but in general I don’t think people are comfortable with that kind of approach, partly because it’s unlike their interactions with other Alexa skills, but also because of the restricted amount of information developers get back about user interactions at runtime.

Those problems aside, I also wanted to see how quickly I could produce a complete skill using the tool without having to add too much custom code. What I’d like to write about first of all are the gotchas and tricks that I found along the way.

Getting Started

To get started it’s worth mentioning that there are a few little bugs in the tool, some of which have already been fixed in the various pull requests that are waiting to be merged into the main project. I’m not sure if anyone at Amazon is planning on looking at those pull requests, but there is really useful code in there. Just to get running on a Mac I needed this one:

Bump require-dir to 0.3.2 by lazerwalker

https://github.com/alexa/interactive-adventure-game-tool/pull/43

With that in place the instructions in the README should get you up and running. If you’re on windows then you might want to take a look at this pull request:

Update readme with windows install instructions by Web-Cam

https://github.com/alexa/interactive-adventure-game-tool/pull/55

I also chose to bring in the code from the fork that the BBC started, since it allows you to create links to existing scenes. This is pretty useful if you want to have multiple converging paths in the story meet up again down the line. You don’t need to use their version of the code, but it is useful. The forked code is here, though you’ll still need the require-dir fix mentioned above:

https://github.com/bbc/interactive-adventure-game-tool

As I say, the instructions in the README are pretty good, so I won’t try to go over those again. I will just list a few gotchas that you should look out for:

Gotchas:

Utterances all have to be in lowercase.

You can create scenes that don’t themselves have any options, but instead offer the previous scene’s options. This is useful if you just want to respond to something the user said without moving the story on. Maybe they’re in a room and you want them to be able to say “Pick up the keys”, but you only want to respond with “You don’t see any keys here” and leave them with the same options they had before. The “Prompt with previous scenes” tickbox is great for this kind of interaction, but unfortunately there’s a bug in the code where the skill response will sometimes be an empty speech object. This is very frustrating because the only feedback you get when this occurs is silence.
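One way to defend against that silent failure is to guard the response before it goes out: if a scene ends up with no voice output, fall back to re-reading something rather than returning nothing. The sketch below is illustrative only — the property names (`voice`, etc.) are assumptions, not the tool’s actual response shape:

```javascript
// Hypothetical guard against the empty-speech bug: if the scene output
// has no spoken text, fall back to a previous prompt (or a generic
// reprompt) so the user never gets dead silence.
function ensureSpeech(sceneOutput, previousPrompt) {
  const text = sceneOutput && sceneOutput.voice ? sceneOutput.voice.trim() : '';
  if (text.length > 0) {
    return sceneOutput; // normal case: the scene has something to say
  }
  // Empty speech object: reuse the last prompt so the user hears something.
  return {
    voice: previousPrompt ||
      'Sorry, I did not catch that. What would you like to do?'
  };
}
```

You would call this as the last step before handing the response back to the Alexa runtime, so every code path is covered by the same check.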

The first utterance that launches a scene will be used to generate the name of an intent. You can then add more utterances to that scene (newline-delimited), and they will also be added to the same intent. There are some nice side effects of this, and also some really annoying quirks that can cause bugs. The nice side effect is that, if you frequently want to use the same set of utterances, for example “positive, yes, yeah, okay, sure, great”, then you can define them on one scene and then reuse them in any other scene just by listing the first one (“positive”). Because of the way the intents are generated, all of those utterances are valid for the “PositiveIntent”.
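The naming convention at work here — camel-casing the first utterance and appending “Intent” — can be sketched in a few lines. This mirrors the behaviour I observed, not the tool’s actual code:

```javascript
// Illustrative sketch of the intent naming convention: the first
// utterance of a scene is camel-cased into an intent name, e.g.
// "eat the food" becomes "EatTheFoodIntent".
function intentNameFor(firstUtterance) {
  const words = firstUtterance.toLowerCase().split(/\s+/).filter(Boolean);
  const camel = words
    .map(w => w.charAt(0).toUpperCase() + w.slice(1))
    .join('');
  return camel + 'Intent';
}

// intentNameFor('positive') → 'PositiveIntent'
// intentNameFor('eat the food') → 'EatTheFoodIntent'
```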

The downside is that if you aren’t careful when adding multiple intents, you will start to get clashes between the utterances you are using. For example, if you have a scene with the utterances “eat the food, eat the bread” and later a different scene with the utterances “eat the cheese, eat the food”, then two intents get generated, “EatTheFoodIntent” and “EatTheCheeseIntent”, but both will accept “eat the food” as a valid utterance, and which one is active at runtime depends on which is defined first. Also, when you come to get your skill certified, Amazon will reject it because “two or more intents in your skill include the same sample utterance(s)” (and they won’t tell you which two).
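Since Amazon won’t tell you which intents clash, it’s worth running your own check over the generated model before submitting for certification. Here is a small sketch; the input format (a map of intent name to utterance list) is an assumption for illustration, so you’d need to adapt it to however you extract utterances from the generated skill:

```javascript
// Report any sample utterance that is claimed by more than one intent.
// Amazon rejects skills with duplicate sample utterances but does not
// say which intents are involved, so this surfaces them up front.
function findClashes(intents) {
  const seen = {}; // utterance → list of intent names that claim it
  for (const [intent, utterances] of Object.entries(intents)) {
    for (const u of utterances) {
      const key = u.toLowerCase().trim();
      (seen[key] = seen[key] || []).push(intent);
    }
  }
  // Keep only utterances owned by two or more intents.
  return Object.fromEntries(
    Object.entries(seen).filter(([, owners]) => owners.length > 1)
  );
}
```

Running it on the example above would flag “eat the food” as belonging to both “EatTheFoodIntent” and “EatTheCheeseIntent”.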

I’ll continue this in a future post, but I just wanted to get some first impressions down. If you’re also using the tool then good luck, and if anything here wasn’t clear then do let me know. I’ll try to get into the details of the code that the tool generates next time.