…or: What does it mean to be writing interactive fiction?

When asked about outreach prospects for IF at PAX East, I said this:

We have a two-part accessibility problem.

One part is the interpreter: people don’t want to download separate files and don’t want to have to figure out file formats. That structure is unattractive and increasingly out of step with the way casual players play games — and especially with the way that they’re persuaded to try new work. We (as a community) are working on that by developing better browser-based interpreters and making it easier for people to publish material to websites. It’s not true that there’s been no Java Glulx terp at all, but it required its own downloading and does not offer the option of creating an attractive game display within a browser window. Just the last couple of months have seen major strides on this front, with both Quixe and ZMPP reaching the point where they can play Glulx games in a browser window. Zifmia is a project to present games from a server interpreter, while FyreVM is an experiment in letting authors customize their output with channel IO. For TADS 2 there is Jetty, and Mike Roberts is actively working on changes to TADS 3 that would make it possible to do web service of those games. So we’re making a lot of progress here.

The other problem is the parser. When you look at novice reactions to IF (found in responses to IF games posted on indie gaming sites, or in student reactions to playing IF for the first time), the initial reaction is often enraged frustration with the parser. The first few (or few dozen) moves of a new player’s interaction with the game often consist of many, many failed attempts that do not move the game forward in any way.

This is alien to most gamers these days. These days, even fairly difficult console games usually guarantee that at the beginning of the experience it’s just about impossible for the player to do something wrong or to fail meaningfully. Interaction options are introduced gradually. By contrast, most IF games are not designed with any kind of tutorial mode or game-opening section, instead offering (at best) a lengthy menu of instructions. There are exceptions (Dreamhold, Blue Lacuna). My own recent games have included an optional tutorial mode (which I think of like training wheels) that gives turn-by-turn contextual advice to the player based on what’s currently happening.

It’s not clear to me how well those games have worked in attracting novices and making them comfortable with IF, however. (I just don’t know: I’d love to hear about it if, e.g., there were a bunch of Blue Lacuna players who got acclimatized to IF through that approach.)

Fundamentally, however, we’ve got a bigger problem, which is that the command prompt is a lie. It tells the player “type something, and I’ll understand you.” Which it won’t.

It wouldn’t necessarily make for better games to have a looser parser.

Adrift is partly wild-card based, allowing more keyword-based parsing than TADS or Inform use by default, but the result is often comical misunderstanding when the game fails to consider important elements. More advanced natural language processing approaches have tended to disappoint their users by not accomplishing what’s actually desired or by making it very hard to tell how the player’s input is affecting the results (Starship Titanic, Façade). Brian Moriarty and I (and various other community members) argued about the need for NLP for interactive fiction a few months ago. Since then my own ideas about it have been shifting a bit, though it’s not the case (as Moriarty claims) that the IF community isn’t interested in parser improvements, and I still mostly agree with what I said about the failings of NLP as an interface for games; of course people are nonetheless trying. Even a parser that is genuinely better at understanding input creates some significant problems for a game designer, if it starts to accept adverbs or statements that would require more complete world modeling than the game otherwise needs.

This is why parser improvement work has mostly focused on a few areas: better guessing at what the player means when the useful information does exist in the world model (so we’re not asking the player whether he wants to open the door with the black key or the black forest cake); better identification of player errors so we can offer more guided error messages. Aaron Reed has done some research in this area, and offered his correctives for Inform at least as a set of extensions duplicating much of the behavior he implemented into Blue Lacuna.
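A minimal Python sketch of the first kind of improvement, assuming a toy world model: when a word matches several objects, prefer the one that makes sense for the verb, and only ask the player when that still leaves a tie. The `Thing` class, the `useful_for` property, and `pick_instrument` are invented for illustration; this is not how Inform or TADS actually resolves ambiguity.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    name: str
    # Hypothetical property: actions this object can sensibly be used for.
    useful_for: set = field(default_factory=set)


def pick_instrument(verb, word, in_scope):
    """Return the single best match for `word`, or None if the game should ask."""
    matches = [t for t in in_scope if word in t.name.split()]
    sensible = [t for t in matches if verb in t.useful_for]
    if len(sensible) == 1:
        return sensible[0]   # the world model settles it
    if len(matches) == 1:
        return matches[0]    # only one object has that word anyway
    return None              # genuinely ambiguous: fall back to asking


key = Thing("black key", {"open", "unlock"})
cake = Thing("black forest cake", {"eat"})

# OPEN DOOR WITH BLACK: the key is the only sensible instrument.
print(pick_instrument("open", "black", [key, cake]).name)
```

The point of the design is that the question to the player is the last resort, used only when the world model has nothing to say.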

But at the end of the day, I agree with Mike Roberts that the trick isn’t to make the parser understand whatever a novice might type, and that the average novice user would actually be happier with a smaller vocabulary that has been spelled out in full.

It’s a matter of making the game better at communicating to the player what kinds of things are valid actions in the first place — indicating the affordances of the system, in other words.

That will also help with the other problem that novices often report: a kind of paralysis of choice. If you can do anything at the command prompt, where do you start?

Yeah. That command prompt is a problem.

While I’m on the topic, I should mention that the parser isn’t a picnic for authors either. It’s thanks to that pesky command prompt that so much development time goes into implementing feedback for completely stupid and inane actions; and who among us hasn’t left in a bad response to >TOUCH MOON or >RUB PARROT, just because the number of possible combinations of verbs and nouns in our game world was too enormous to think through properly?

These days there are some tools to help with that, and it’s often possible to tighten up the simulation in general rather than deal with every single annoying case individually — for instance, people making far-away things in Inform 7 could do worse than to check out Jon Ingold’s Far Away extension, which would let you designate the moon “far away” and then cope sensibly with all possible moon-fingering behavior.

Still, a huge amount of creative overhead goes into the not-always-thrilling task of creating responses to commands that aren’t sensible, aren’t relevant to advancing the game or story, give the player no interesting character notes about the viewpoint character, deliver no jokey zingers, and are flat-out unlikely to be typed by anyone who isn’t actively trying to break the game.

If you don’t work on that stuff, it makes the game look buggy and unpolished. Because it is buggy, because the player can make things happen that tear a giant hole in the illusion.

But I’d be lying if I said I’ve never wondered why I was bothering, sometime during hour seven of implementing responses to the highly-unlikely instead of working on stuff the player is definitely going to see.

So what then? Do we pitch out the parser and go to a system in which the player’s options are clearly enumerated at all times?

People have played with that idea too. Sometimes the approach is to fill a screen with lots of helps: a compass rose showing directions, a map, sometimes an image of the location, plus a menu or set of buttons representing all the major verbs available at any given moment.

This can get overwhelming. If I’m playing a largely text-oriented game, I prefer not to have the window of text forced into a small corner of the screen while the rest of the territory is taken up with interaction helps. It’s unattractive — and it also represents a style of UI that’s rapidly becoming obsolete. More and more games in the commercial sector streamline away as much as they can in order not to crowd the player’s field of vision with things other than the screen in which the action is happening.

The other way to go is CYOA: not creating lots of helps to deal with a complex interface, but narrowing the choices themselves down to a few narrowly-defined options.

When people talk about CYOA, they tend to mean a format in which the player is offered a handful of explicit options (“To go through the door on the left, turn to page 51. To go right, turn to page 75. To jump out the window, turn to page 9.”).

CYOA is also often associated with its early implementation in book form — Choose Your Own Adventure, the name from which the acronym is taken. Book CYOA either has to make the player keep notes (and trust him to be honest), or skip having any kind of world model or state other than what can be indicated by the page number that the reader is on. Some computer-based CYOA follows the same approach, providing an experience without stats or a model.

That makes for big problems with content generation and plot structure.

If all state is expressed by being on a given page, you either force the storyline to be all branches (every choice the player makes creates a new parallel universe, basically) or you have to rejoin the branches but have the choices before that rejoin be irrelevant forever after, because you can no longer tell which thing the player did. If you want to get a good sense for how that works, structural analyses of many classic CYOA books are online. The state problem explains why the choices in those books were so often arbitrary and unfair: going through the wrong door was more likely to lead to sudden death than to an interesting variation in the late story, because most of the time it was easier just to prune the branch than to continue it.
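The rejoin problem can be shown with a toy example. The sketch below is in Python, and the page table and function names are made up for illustration. When the page number is the whole state, two different routes to the same page are indistinguishable; keeping even a tiny record of visited pages is enough to tell them apart later.

```python
# Stateless CYOA: the page number is the entire state.
PAGES = {
    1: {"left door": 2, "right door": 3},
    2: {"go on": 4},   # reached via the left door
    3: {"go on": 4},   # reached via the right door
    4: {},             # the branches rejoin here
}


def play_stateless(choices, start=1):
    """Follow the choices and return only the final page."""
    page = start
    for choice in choices:
        page = PAGES[page][choice]
    return page


def play_with_state(choices, start=1):
    """Same walk, but remember which pages were visited on the way."""
    page, visited = start, {start}
    for choice in choices:
        page = PAGES[page][choice]
        visited.add(page)
    return page, visited


left = play_stateless(["left door", "go on"])
right = play_stateless(["right door", "go on"])
print(left == right)   # True: on page 4 the earlier choice is lost

_, seen = play_with_state(["left door", "go on"])
print(2 in seen)       # True: the record still knows about the left door
```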

Structurally, this is no different from the most basic kinds of hypertext; literary hypertext often features a similar statelessness and lack of narrative arc, though it tends to handle that in terms of non-linear time or post-modern narrative structures. And even some literary hypertext tools go further, making it possible to (say) provide guarded links that you can only pass through after seeing other content.

I’m fairly comfortable saying that CYOA without a world model is too limited to express the range of things I’m interested in expressing in interactive storytelling.

Enumerated-choice games with world modeling do exist too. My (evidence-free) impression is that recently they’ve become a bigger part of the gaming scene than they had been for quite a while. (In fact, I got a request to review another such site during the course of writing this post: it’s Unknown Tales, and I can offer no insight on quality, because I’ve been busy writing this…)

Visual novels in Ren’Py are structured to include explicit choices, but it’s possible to set variables and keep statistical information around.

The all-text “Choice of…” series has been enjoying some popularity with the casual games crowd, using a model where the protagonist develops personality stats during play, but the game play is all about making (as their own manifesto indicates) “interesting choices”. (I think Chris Crawford would approve of that part, if not of the non-procedural nature of the rest of the game structure.)

Echo Bazaar flips that model: it’s heavily stats-based and requires the player to grind a lot — that is, to do repetitive actions in order to raise the stats — but offers a wide range of storylets that the player can choose to participate in (and some storylets themselves include multiple choices about what to do inside the story). Your chance of success at a given attempt depends on your stats, and the game retains information about major story threads that you’re part of. Echo Bazaar also features inventory items, money, and limited interaction with other players, making for something that feels a bit more gamelike than the “Choice of…” stories. The tradeoff is that it weakens the narrative arc substantially and forces players to do tedious actions some of the time.

Neither the “Choice of…” games nor Echo Bazaar quite achieves what I’d like to be able to do with interactive stories. EB is too slow and grinding, and I’d like a stronger narrative arc most of the time. The “Choice of…” games weren’t as immersive as I’d like, because there was no chance to engage with the settings or explore small-scale decision-making in the environments. But they’re both far, far closer to ideal than CYOA without a world model.

Maybe as a result of the popularity of these games, possibly in response to some more general zeitgeist, there’s a larger interest in enumerated-choice games within the IF community than I recall seeing for a while.

That’s not to say that there hasn’t been some CYOA following for quite a while. IFDB lists 43, and that’s probably not a complete list of works that were created with that kind of interface and presented to the IF community and/or implemented in an IF language or virtual machine. (It’s certainly not a complete collection of all computer-run CYOA ever, but the IF community tends not to hear of or pay attention to it to the same degree if it’s neither using IF tools nor submitted to an IF competition.)

Jon Ingold’s Adventure Book system provided for CYOA with a world model that tracked inventory and allowed for “magic words”, and there is now an Inform 7 extension that replicates and expands on its functionality.

For a few years the community also ran Lotech comp, a competition explicitly for CYOA-style games. Two entries were particularly interesting with respect to the current issue: Papillon’s One Week and Kingdom Without End by Shannon Cochran.

One Week is a resource-management game with a strong world model, similar to dating sims and visual novels, in which the player’s choices determine how much time the protagonist spends earning money, making friends, and studying, with effects on the outcome of the story.

Kingdom Without End uses Adventure Book to create something that feels very similar to conventional parser-based IF — the options are to do things like move around and pick up objects, creating puzzles that feel like conventional IF puzzles, but with a significantly reduced range of possible actions.

One significant objection to the enumerated-command approach is the concern that it will eliminate puzzles. If the player is always choosing from a list of options, then where does the figuring out come in?

In practice, I don’t think that is as big a concern as some believe (we talked about this at the April Seattle meet). If the choices the player makes are fairly granular and tied to a predictably-behaving world model, then it’s possible to express many of the same things that a parser-based IF already has in it: making choices to set up a world state in which new choices will predictably become available.

A more significant problem for me is the narrowing of the command space. The average IF game has dozens of available verbs and hundreds of available nouns, and that means there’s lots and lots that’s possible to do. It’s not possible to enumerate that many commands in a way that doesn’t just look like a hideous list. And while I was earlier bemoaning the amount of work that goes into correcting for player input that’s simply useless and meaningless, I also absolutely don’t want to collapse the range of possible action down so far.

Having a lot of verbs is part of what distinguishes IF and provides for its narrative richness. The number of verbs you can perform in, say, Halo is tiny — and that has a direct effect on the kinds of stories you can tell with such a system. Console games especially tend to be very verb-focused in the way they express their affordances to the player — you’re always holding this controller with a relatively small number of buttons on it, and there are often indicators on the screen, or a training process, to help you remember which button means which action.

But a narrow verb set (here I go sounding like Chris Crawford again) means you’re forced to focus all your interactions around a fairly tight set of possibilities. Sometimes that results in a crisp, interactively focused short story experience, but there are a lot of stories for which it’s not so good.

IF is closer in this respect to Sims 3: lots of different objects (nouns) are to be found in the world, and they all have their own affordances. In IF, doors can be open, lamps lit; in Sims 3, the TV can be used for exercise videos or watching movies.

Another problem is that if I’m just clicking or typing numerical choices, I feel less involved in the action, for some reason. I don’t know why that is, but it’s definitely true. I’m fine with clicking on images and radial menus in games like Sims 3, but the interface is also graphical; there’s something appealing about communicating to the computer in the same idiom that the output is using.

So I personally am not ready to ditch the parser entirely, though I’m also interested in games that use enumerated choices over the top of a consistent world model — especially if those games manage to achieve my own aesthetic preferences of allowing some exploration and variability of pacing alongside the narratively important choices.

But I do think we need better ways to communicate the affordances of IF to players.

For some games, a graphical layout with menus or buttons for verbs might work (Dave Cornelson has very recently been proposing a variation on this), but in many cases I find that approach ugly, and it tends to take up an unpleasant amount of screen real-estate. I start to be uncomfortable when the readable text part of the screen is squeezed down into a tiny corner. Moreover, it’s hard to imagine how that kind of layout would transfer to a mobile device. (Edited to add: Dave points out that we don’t know how these systems would play out with a test market, and this is of course true. Like everything I’m saying here, this is my own opinion and not meant to be prescriptive about what other people ought to try out or test. But for my own purposes, I’m not satisfied by this solution.)

Another approach is to retain visual cues but embed them in the main text itself. Bronze and Blue Lacuna both use the technique of highlighting (in bold or in color) important nouns in the text; Blue Lacuna goes on to let the player type just the noun names in order to perform the most obvious interaction with them (such as examining objects and passing through doors). Walker and Silhouette goes the extra step of being largely keyword-driven, though it does allow the player to type something other than the highlighted keywords if he wishes.

One could also provide help at the command line itself, for instance by offering autocompletion lists that would pop up a list of viable conclusions once you’d typed part of a command. One of the things that Ruben Ortega suggested at the Seattle IF meet was the idea of probabilistically generated auto-completion, taking into account the commands given by other players previously trying the same game. That seems like it might often have drawbacks, though — give away puzzle solutions, or else reveal that other players are strangely fond of typing obscenities.
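A rough Python sketch of that kind of frequency-ranked completion, under stated assumptions: `history` stands in for a log of commands typed by earlier players, and `valid_now` is a hypothetical author-supplied set of commands the game is willing to suggest at this point. Filtering against that set is one possible way to keep obscenities out of the list; it does nothing by itself about spoiling puzzle solutions, which an author would still have to exclude by hand.

```python
from collections import Counter


def suggest(prefix, history, valid_now, limit=3):
    """Rank completions of `prefix` by how often earlier players typed them."""
    counts = Counter(cmd.lower() for cmd in history)
    prefix = prefix.lower()
    candidates = [c for c in counts if c.startswith(prefix) and c in valid_now]
    # Most frequent first; alphabetical order breaks ties.
    candidates.sort(key=lambda c: (-counts[c], c))
    return candidates[:limit]


history = ["open door", "open door", "open window", "look", "open safe"]
valid_now = {"open door", "open window", "look"}

print(suggest("op", history, valid_now))   # ['open door', 'open window']
```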

Another approach is to provide suggestions but allow the player to type things besides the suggested command. That’s the approach that Bronze’s tutorial mode takes; it’s also, in a slightly mutated form, what Jon Ingold’s Dead Cities does, together with providing a hyperlink to click to perform the next suggested action. The difference between the two (aside from the screen layout) is that Dead Cities’ implementation offers something like a clickable walkthrough, making it possible for the player to minimize interaction and treat the game almost like a book (though I suspect it may be impossible to get to some of the most interesting content this way). (Ferrous Ring explicitly goes all the way with this, letting the user select a mode anywhere from full parser involvement to essentially watching the walkthrough go by.)

Yet another approach would be to offer something that looks like TADS 3/Alabaster-style conversation hints, only for all turns. Unfortunately, that disrupts the flow of the rest of the prose with a lot of mechanically generated content, and with options that will look a lot less diverse than conversation options generally do. I don’t imagine that the results of that would be acceptable.

I don’t have a complete solution to this problem, but here are some passing thoughts.

Hinting to the player about possible actions without listing all of the possibilities is an interesting tactic, but it’s been tried several times and hasn’t exactly set the world on fire yet.

On the other hand, it isn’t possible to list all the possibilities during a given turn of a typical IF game without making the interface incredibly ugly. Even listing all the verbs is unattractive — and, as I’ve tried to suggest, probably not the best way of organizing things, because the interesting possibility space in IF is better organized in terms of the nouns and the interactions they afford than by the verb set. (There are a few exceptions — LOOK, INVENTORY — but most such things could be treated as actions on the player or the room he’s in. Inform already internally translates an object-less LISTEN or SMELL command as LISTEN TO {the room I’m in} or SMELL {the room I’m in}.)

If we had a system where the player could select a noun, see what he could do with it, and select one of those options, that would (Sims-like) both clarify the possibility space for the player and eliminate the TOUCH MOON implementation problem.

Implementing that graphically presents some serious problems, and raises some accessibility issues as well. It’s one thing to click a picture of a television while playing Sims and get a radial menu. It’s considerably uglier to click on a noun in a page of text and get a radial menu of more text. And what happens if a noun is technically in scope, but no text referring to that noun currently appears on the screen? It forces the player to type LOOK and INVENTORY far more often than he’d otherwise have to, which is not good for gameplay or narrative flow. Finally, this is not something that current interpreters are designed to do (so it would require significant tool development), and it’s not clear to me how it could be made accessible for visually impaired users.

Implementing it at the command line seems more doable, maybe, though it has the disadvantage that typing, say, DOOR and getting a prompt with suggested door actions again clutters the transcript with ugliness.

More possible, I think: do like Jon Ingold’s Novel Mode did in My Angel, and move the command prompt down into its own separate window, which gets emptied and refreshed every turn. Keep the space above for game output text. Typing the name of a noun (or perhaps clicking it in the text, if hyperlinks are included) then brings up the list of relevant verbs down in that lower window, which means that at any given time you’re only seeing one such cluttery list, not the last half dozen that occurred during play.

Multi-window modes might be challenging for accessibility too, but that seems less dramatically hard than something that relies primarily on visual cues. One could also create an option to revert to a single-window system for those who wanted to use a screenreader on it in a more classic style.

Doing this well would require language support, because you don’t want the author to have to explicitly indicate every verb that applies to each noun. There would, I think, need to be some kind of “no defaults” system: in other words, the game would need to be able to work out whether a special response has been coded that’s relevant to this noun and verb. If so, the verb is available; if not, not.
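A minimal sketch of the “no defaults” idea in Python, assuming the author’s special responses live in a table keyed by noun and verb. The table, its contents, and `verbs_for` are invented for illustration; a real system would derive this from the game’s rules rather than a hand-written dictionary.

```python
# Hypothetical table of author-written responses, keyed by (noun, verb).
RESPONSES = {
    ("door", "open"): "The door swings open.",
    ("door", "knock on"): "Nobody answers.",
    ("lamp", "light"): "The lamp flickers on.",
}


def verbs_for(noun, responses=RESPONSES):
    """Offer a verb only if a specific response was written for this noun."""
    return sorted(verb for (n, verb) in responses if n == noun)


print(verbs_for("door"))   # ['knock on', 'open']
print(verbs_for("moon"))   # []: nothing was written, so nothing is offered
```

Because the verb list is computed from what has actually been implemented, the author never declares it separately, and a noun with no written responses simply offers nothing.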

Two-object verbs would be more complex, but I think not impossible; the player would pick a noun, get the verb list, and then be invited to specify the second noun.

One could further refine the system by ordering the verb list affiliated with each noun to present the most likely options first. Maybe also allow typing a verb and getting a list of nouns that it could reasonably work on, so that a practiced player who prefers old-style command entry could still choose to type OPEN DOOR instead of DOOR OPEN — it would just reject attempts to pick a noun (or verb) that wasn’t currently available.
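As a sketch of how both entry styles might coexist (Python, with an invented `AVAILABLE` table listing each noun’s verbs in most-likely-first order), the lookup below accepts either DOOR OPEN or OPEN DOOR and rejects anything not currently on offer. The names and data are assumptions made for the example.

```python
# Hypothetical per-noun verb lists, most likely action first.
AVAILABLE = {
    "door": ["open", "knock on", "examine"],
    "lamp": ["light", "examine"],
}


def parse(command, available=AVAILABLE):
    """Accept NOUN VERB or VERB NOUN; return (verb, noun), or None if unavailable."""
    text = command.lower().strip()
    for noun, verbs in available.items():
        for verb in verbs:
            if text in (f"{verb} {noun}", f"{noun} {verb}"):
                return verb, noun
    return None


def nouns_for(verb, available=AVAILABLE):
    """Typing a bare verb lists the nouns it currently applies to."""
    return [noun for noun, verbs in available.items() if verb in verbs]


print(parse("OPEN DOOR"))     # ('open', 'door')
print(parse("door open"))     # ('open', 'door')
print(parse("touch moon"))    # None
print(nouns_for("examine"))   # ['door', 'lamp']
```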

This could be combined with text highlighting as well, so that the player would be able to see by looking at the screen which nouns were implemented as interactive — just as long as the player doesn’t have to be able to click on the noun in order to activate it, we avoid the accessibility problems and the need to type LOOK over and over in order to get access to items in the current room.

There’s a (very rough) prototype of what I mean here.

Something like this would permanently eliminate guess-the-verb, though at the cost of removing those rare occasions where we… well, want to make the player guess the verb, because the action is something cool but optional (like TWIRL MUSTACHE). Unquestionably something would be lost.

But think of it: no more TOUCH MOON. Lots less monkeying around with providing intelligent disambiguation and defaulting. The ability to smoothly prompt new-to-IF types of action, like BLACKMAIL or ANALYZE or ANTAGONIZE. The ability to back off from the strictly physical focus of IF by highlighting abstract nouns or ideas in the text as interactive instead. Easier avenues to fully systematic beta-testing. Many fewer pitfalls for novice authors. More scalable implementation, making really long games less onerous to take on.

I dunno, am I crazy?