May 12, 2009 8:00 AM

['Homer in Silicon' is a biweekly GameSetWatch-exclusive column by Emily Short. It looks at storytelling and narrative in games of all flavors, including the casual, indie, and obscurely hobbyist. This week she looks at conversation design in support of narratively compelling, well-paced scenes.]

Last year, Brent Ellison published a Gamasutra article on dialogue systems in games. A little while later, Jonathan Blow posted on his blog an open-ended query about designing conversation in games. Then, this February, Krystian Majewski posted about dialogue choices in Emerald City Confidential, speculating about the challenges of presenting decision-filled conversation to casual players.

These three posts -- all interesting -- and the followup comments suggested that there's relatively little public discussion of core methodologies for conversation design.

In particular, much of the existing analysis conflates user interface (how are dialogue options presented? when do they appear on the screen? is the player offered full text of the next sentence, or a truncated version of what his character will say?) with the underlying model (how does the game decide which options are available when? how are the player's options restricted or made open? what controls when and whether the NPC speaks on his own?).

User interface is the more visible and thus the better understood of the two. It's easy to play a game and see how the dialogue choices are being presented, but much harder to guess what code lies beneath.

Yet the underlying model does matter a great deal.

In practice there is already an assortment of conversation models at work in commercial and indie games. The world of gaming dialogue is not quite divided into "Façade" and "everything else", and sometimes games that seem inexplicably more effective are so because they are relying on logic that has not received much public analysis.

If we're to understand how those things work, we need to start talking about conversation at a level beyond "what kind of menu does the player see?" -- because, as important as that question is, it only touches the surface.

There's a huge amount to look into here: how is conversation information stored? Is it merely dialogue text, or are abstract values associated with conversation content -- for instance, to indicate that one statement is more offensive than another? What is the relationship between what the PC says and the NPC's response? Does saying the same thing always garner the same reply? How is the idea of context handled in the code? What about repetition of dialogue?

What follows is a description of one model that I have found particularly useful and extensible. It has been refined in the course of writing for about a dozen conversation-centric projects since 2000, and also thanks to the influence of other conversation theorists, especially in the interactive fiction community.

It's absolutely not the only way I can imagine attacking the problem, but of the things I've tried, it is the most versatile.

Conversation Structure

Ellison identifies two fairly common kinds of dialogue design which he labels "Branching" and "Hub-and-Spokes". He writes:

"Though ultimately a variation of [Branching], Hub-and-Spokes Dialogue creates a very different conversation flow compared to basic Branching Dialogue. The player listens to the NPC's lines and then chooses their response from the main "hub" of the conversation... After hearing the NPC's response, the player either returns to the main hub, from which they can ask the same question again or inquire about another topic, or enters a deeper hub with more options to choose from."

As Ellison points out, this kind of conversation dynamic allows for a lot of player freedom to explore different topics, but can lead to repetition and give the impression of an implausibly patient NPC.

Both of these designs arise from a fundamentally tree-structured model of conversation. Let's call the lines of dialogue the player is allowed to speak, and their attached NPC responses, quips. We might picture a hub-and-spokes conversation that looks a bit like this:

Hub
    Quip about origin of werewolves -> automatic return to Hub
    Quip about wolfsbane -> automatic return to Hub
    Quip about Lord Fangclaw
        Quip about Fangclaw's bride
            Quip about making cake for the wedding -> automatic return to Hub
            Quip about bridal registry -> automatic return to Hub
            Return to Hub
        Quip about Fangclaw's castle
            Quip agreeing to storm the castle with pitchforks -> automatic return to Hub
            Quip refusing to storm the castle with pitchforks -> automatic return to Hub
        Return to Hub

Each quip in this dialogue has a place in the hierarchy. When we reach the end of a productive branch, we automatically go back to the hub. Perhaps some of the quips are only conditionally available -- we might not be able to ask about Fangclaw's bride until after we've seen the wedding invitations, for instance. Sometimes, too, we're forced to stick to a conversation thread until we've made a choice (as in the case where we either agree or refuse to storm the castle).
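
As a rough illustration, a tree like this could be sketched in Python as follows. The names (`Node`, `requires`, `forced`, `options`) and the `seen_invitation` flag are hypothetical, chosen only for this example, and not taken from any particular engine.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, bool]  # hypothetical world-state flags


@dataclass
class Node:
    """One quip in the tree; a node with children acts as a hub or sub-hub."""
    label: str
    children: List["Node"] = field(default_factory=list)
    # Condition on the world state; by default the quip is always available.
    requires: Callable[[State], bool] = lambda state: True
    # If True, the player must pick a child: no "Return to Hub" is offered.
    forced: bool = False


def options(node: Node, state: State, is_root: bool = False) -> List[str]:
    """Menu labels shown while the conversation sits at `node`."""
    labels = [c.label for c in node.children if c.requires(state)]
    if not is_root and not node.forced:
        labels.append("Return to Hub")
    return labels


castle = Node("Fangclaw's castle", forced=True, children=[
    Node("Agree to storm the castle"),
    Node("Refuse to storm the castle"),
])
bride = Node("Fangclaw's bride",
             requires=lambda s: s.get("seen_invitation", False),
             children=[Node("Making cake for the wedding"),
                       Node("Bridal registry")])
fangclaw = Node("Lord Fangclaw", children=[bride, castle])
hub = Node("Hub", children=[Node("Origin of werewolves"),
                            Node("Wolfsbane"),
                            fangclaw])
```

Every quip's position is fixed by its parent, so the only way to reach the bridal registry is to walk down through Lord Fangclaw and his bride each time.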

The advantage of a tree-based structure is that it's (relatively) easy to code, understand, and debug. The disadvantage is that it tends toward a certain design rigidity, and the resulting conversation flow is (as Ellison notes) seldom very realistic.

A range of other options open up if, instead, we regard quips not as branches of a tree but as atomic entities whose behavior is governed by prerequisite rules, and each of which is associated with one or more subjects. These prerequisites might be based on conversation state, the mood of the NPC, or details from the surrounding world model.

For instance:

Quip about origin of werewolves:
    subjects: Fangclaw
    prerequisite: none

Quip about wolfsbane:
    subjects: Fangclaw
    prerequisite: NPC trusts the player

Quip about Fangclaw's bride:
    subjects: marriage, Fangclaw
    prerequisite: player has seen the wedding invitation

Quip about making cake:
    subjects: marriage, Fangclaw
    prerequisite: immediately follows the quip about Fangclaw's bride

Quip about bridal registry:
    subjects: marriage, Fangclaw
    prerequisite: follows the quip about Fangclaw's bride, not necessarily immediately

Now we offer the player whichever quips are both relevant to the most recent subject(s) of conversation and currently available (according to their prerequisite rules). Saying a quip whose subjects are "marriage, Fangclaw" might lead to other quips about marriage and other quips about Fangclaw.

Conversation can now flow naturally from one idea to other related ideas, just as it does in real life.
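
One way to picture the selection rule is the Python sketch below. It is an illustration under stated assumptions, not a record of how any shipped game does it: `Context`, `Quip`, `offered`, `say`, the helper predicates, and the flag names are all hypothetical. It also hides quips that have already been spoken, which is only one of several possible repetition policies.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class Context:
    """Hypothetical snapshot of conversation and world state."""
    spoken: FrozenSet[str] = frozenset()    # names of quips already said
    last: Optional[str] = None              # most recent quip, if any
    subjects: FrozenSet[str] = frozenset()  # subjects of the most recent quip
    flags: FrozenSet[str] = frozenset()     # world or NPC facts


@dataclass(frozen=True)
class Quip:
    name: str
    subjects: FrozenSet[str]
    prerequisite: Callable[[Context], bool] = lambda ctx: True


# Small prerequisite builders (assumed helpers, not from the article).
def immediately_after(name: str) -> Callable[[Context], bool]:
    return lambda ctx: ctx.last == name

def after(name: str) -> Callable[[Context], bool]:
    return lambda ctx: name in ctx.spoken

def flag(name: str) -> Callable[[Context], bool]:
    return lambda ctx: name in ctx.flags


BOTH = frozenset({"marriage", "Fangclaw"})
QUIPS = [
    Quip("origin of werewolves", frozenset({"Fangclaw"})),
    Quip("wolfsbane", frozenset({"Fangclaw"}), flag("npc_trusts_player")),
    Quip("Fangclaw's bride", BOTH, flag("seen_invitation")),
    Quip("making cake", BOTH, immediately_after("Fangclaw's bride")),
    Quip("bridal registry", BOTH, after("Fangclaw's bride")),
]


def offered(quips: List[Quip], ctx: Context) -> List[str]:
    """Unspoken quips that share a subject with the last quip and pass
    their prerequisite. Hiding spoken quips is a simplifying assumption."""
    return [q.name for q in quips
            if q.name not in ctx.spoken
            and q.subjects & ctx.subjects
            and q.prerequisite(ctx)]


def say(quip: Quip, ctx: Context) -> Context:
    """Return the context that results from the player speaking `quip`."""
    return Context(spoken=ctx.spoken | {quip.name}, last=quip.name,
                   subjects=quip.subjects, flags=ctx.flags)
```

Because "making cake" must immediately follow the bride quip while "bridal registry" only needs to follow it at some point, the registry remains reachable after a detour to another Fangclaw topic and the cake does not.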

Restricting Choices in the Atomic Model

There will of course be times when the dialogue needs to be restricted a bit, as when we are going to ask the player to commit to one of a finite set of options (storm the castle or not?). It is still possible to enforce strict tree-like branching if we want it, by making a prerequisite that the quip immediately follows some other specific quip, and by placing special restrictions on what can follow the initial question quip:

Quip about Fangclaw's castle:
    subjects: Fangclaw
    prerequisite: none
    followups: only quips that immediately follow this one

Quip agreeing to storm the castle with pitchforks:
    subjects: Fangclaw
    prerequisite: immediately follows quip about Fangclaw's castle

Quip refusing to storm the castle with pitchforks:
    subjects: Fangclaw
    prerequisite: immediately follows quip about Fangclaw's castle

When the player asks about Fangclaw's castle, he will be forced to choose one of the two answers before again getting unrestricted access to other quips.
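
The restriction can be sketched in a few lines of Python. This is a simplified illustration: subjects are left out, and the field names `directly_follows` and `restricts_followups` are hypothetical stand-ins for the "prerequisite" and "followups" entries above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Quip:
    name: str
    # Name of the quip this one must immediately follow, if any.
    directly_follows: Optional[str] = None
    # If True, only quips that immediately follow this one may come next.
    restricts_followups: bool = False


QUIPS = [
    Quip("Fangclaw's castle", restricts_followups=True),
    Quip("agree to storm the castle", directly_follows="Fangclaw's castle"),
    Quip("refuse to storm the castle", directly_follows="Fangclaw's castle"),
    Quip("origin of werewolves"),
]


def available(quips: List[Quip], last: Optional[Quip]) -> List[str]:
    """Names of quips the player may choose after `last` was spoken."""
    result = []
    for q in quips:
        if q.directly_follows is not None:
            # Strict follow-ups are only offered right after their parent.
            ok = last is not None and last.name == q.directly_follows
        else:
            # Free quips are blocked while a restricting quip awaits an answer.
            ok = last is None or not last.restricts_followups
        if ok and q is not last:
            result.append(q.name)
    return result
```

Right after the castle quip, `available` returns only the two answers; once either is spoken, the unrestricted quips come back.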

This still leaves the question of what the player will see when he first starts a conversation with a given NPC, or when he has exhausted all the conversation on a given subject so that no natural transitions remain.

There are several possible approaches to this. One is simply to show all the currently available quips; another is to have select quips marked as conversation-openers, so the player can choose from those; a third might be to invite the player to start by choosing a general subject of conversation and narrow his quip options that way.

The best design will depend very much on how many quips there are in total for a given NPC. A sparse design with few quips can afford to show the player most of his options at once, while a dense one will have to prune heavily.
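
The three approaches, and the sparse-versus-dense tradeoff, might look like this in Python. The data layout, the function names, and the `max_items` threshold are assumptions made for the sketch; a real game would tune the cutoff to its own interface.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical data: quip name -> (subjects, is_conversation_opener)
QuipTable = Dict[str, Tuple[Set[str], bool]]

QUIPS: QuipTable = {
    "origin of werewolves": ({"Fangclaw"}, True),
    "wolfsbane": ({"Fangclaw"}, False),
    "Fangclaw's bride": ({"marriage", "Fangclaw"}, False),
    "bridal registry": ({"marriage"}, True),
}


def all_quips(quips: QuipTable) -> List[str]:
    """Approach 1: show every currently available quip."""
    return sorted(quips)


def openers(quips: QuipTable) -> List[str]:
    """Approach 2: show only quips marked as conversation-openers."""
    return sorted(name for name, (_, opener) in quips.items() if opener)


def subjects(quips: QuipTable) -> List[str]:
    """Approach 3, step one: let the player pick a general subject."""
    return sorted(set().union(*(subj for subj, _ in quips.values())))


def quips_for(quips: QuipTable, subject: str) -> List[str]:
    """Approach 3, step two: narrow the quips to the chosen subject."""
    return sorted(name for name, (subj, _) in quips.items() if subject in subj)


def opening_menu(quips: QuipTable, max_items: int = 3) -> List[str]:
    """Show everything in a sparse design; fall back to openers when dense."""
    names = all_quips(quips)
    return names if len(names) <= max_items else openers(quips)
```

With four quips and a cutoff of three, `opening_menu(QUIPS)` prunes down to the two openers, while a larger cutoff would show all four.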

Managing Knowledge and Repetition

Compared with hub-and-spoke dialogue, the use of atomic quips means that the structure of the dialogue becomes less repetitive -- once the player has made the quip about the bridal registry available, for instance, he doesn't necessarily have to re-ask the questions that led him to that conversation option the first time.

But we are still left with a choice of how to handle quips after the player has used them once:

- remove them from play so that they're never seen again (fine if they're decisions to make; not so good if they contain vital information the player might need to review, unless that information is getting recorded in a game journal)

- leave them in place to be revisited (stiff and unrealistic)

- provide alternate "repetition" versions of the dialogue so that on subsequent occasions when the player tries this quip, the NPC shows he knows he's repeating himself (doubles [or worse] the amount of writing and voice acting required)

- remove them, but offer a general consolidated quip on this subject, so that the player can say something like "remind me of what you told me about Fangclaw's wedding" and get a summary of the already-spoken quips on this topic (requires dynamic generation of the summary dialogue; may not be suitable for voice recording at all)

Again, the selection of the best approach will depend a great deal on what kind of work the game is and how the NPCs are intended to function. Easy recall of past conversation is most important in games where NPCs are primarily dispensers of information and quests; unrepetitive and realistic dialogue is most important where relationships and emotional states are the center of gameplay.
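
The four options listed above can be treated as a per-quip policy. The Python sketch below is one hedged way to express that; the `Repeat` enum, the `Quip` fields, and `reminder` are hypothetical names, and the bracketed strings are placeholders rather than real dialogue. The "summary" option here simply joins pre-written summary lines, which sidesteps true dynamic generation.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional


class Repeat(Enum):
    REMOVE = auto()     # never offered again
    KEEP = auto()       # offered again, same reply every time
    ALTERNATE = auto()  # offered again, with a "repetition" reply
    SUMMARIZE = auto()  # removed, but folded into a "remind me" quip


@dataclass
class Quip:
    name: str
    reply: str
    policy: Repeat
    repeat_reply: Optional[str] = None  # used only by ALTERNATE
    summary: Optional[str] = None       # used only by SUMMARIZE
    times_used: int = 0

    def offered(self) -> bool:
        """Should this quip still appear in the player's options?"""
        if self.times_used == 0:
            return True
        return self.policy in (Repeat.KEEP, Repeat.ALTERNATE)

    def speak(self) -> str:
        """Record one use and return the NPC's reply for it."""
        self.times_used += 1
        repeating = self.times_used > 1
        if repeating and self.policy is Repeat.ALTERNATE and self.repeat_reply:
            return self.repeat_reply
        return self.reply


def reminder(quips: List[Quip]) -> Optional[str]:
    """Text for a consolidated 'remind me' quip, or None if nothing to recall."""
    parts = [q.summary for q in quips
             if q.policy is Repeat.SUMMARIZE and q.times_used > 0 and q.summary]
    return " ".join(parts) if parts else None


registry = Quip("bridal registry", "[full answer]", Repeat.ALTERNATE,
                repeat_reply="[shorter 'as I said' answer]")
storm = Quip("agree to storm the castle", "[decision made]", Repeat.REMOVE)
bride = Quip("Fangclaw's bride", "[full answer]", Repeat.SUMMARIZE,
             summary="[one-line recap of the bride]")
```

Keeping the policy on the quip rather than on the whole conversation lets decisions disappear after use while informational quips stay reviewable.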

Modeling NPC Initiative

Part of what makes NPCs feel shallow and non-human is their lack of initiative.

Games partially overcome that by providing them with goal-seeking behavior, the ability to traverse a map intelligently, and cut-scenes in which they start new action.

Conversations, however, too often remain purely reactive, with the player asking questions and the NPC responding. Occasionally, to shake things up, an NPC interrogates the player character -- an improvement, perhaps, but one that simply reverses the dominance of the conversation.

Either way, a 1:1 ratio of comment and response makes for an exchange that feels a bit mechanical and lacks the dynamic richness of real conversation.

In real dialogue, people convey a great deal not only by what they say but by their use of conversational pragmatics. Do they insist on getting in the last word? Interrupt constantly? Come back over and over to the same tired topics? Avoid answering questions?

Well-observed characters in books and movies have these characteristics, but game conversation systems are rarely equipped to provide this type of characterization.

One way to enliven NPC behavior and supply better guidance to conversation scenes is to have the NPCs recognize when a given line of conversation has reached an end and suggest a new subject themselves. Perhaps the player has just spoken the last available quip on the topic of Lord Fangclaw's wedding. The NPC might answer, but then (before the player has a chance to select another option) change the subject to the castle. Occasionally, as a special effect, the NPC might even refuse to answer one question and deflect it with another.

This makes a much more natural transition than forcing the player to choose to go back to a conversation hub. It allows the conversation system to model different kinds of characteristic behavior. A pushy NPC might obsess or nag about a single topic. A flighty one might change the subject frequently. A reticent one might volunteer less information than others.
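
A small Python sketch of that decision follows. It is illustrative only: `next_subject`, `preferences`, and `patience` are hypothetical names, and the idea of expressing "flighty" as a low patience value is an assumption of this example rather than something the column prescribes.

```python
from typing import Dict, List, Optional


def next_subject(current: str,
                 unspoken: Dict[str, List[str]],
                 preferences: List[str],
                 turns_on_current: int = 0,
                 patience: Optional[int] = None) -> Optional[str]:
    """Return a subject the NPC raises unprompted, or None to let the
    player keep the floor.

    unspoken    -- subject -> quips not yet said on that subject
    preferences -- subjects this NPC likes to steer toward, in order
    patience    -- turns tolerated on one subject (None = no limit)
    """
    exhausted = not unspoken.get(current)
    bored = patience is not None and turns_on_current >= patience
    if not (exhausted or bored):
        return None
    for subject in preferences:
        if subject != current and unspoken.get(subject):
            return subject
    return None  # nothing left to raise; a reticent NPC always ends up here


unspoken = {
    "marriage": [],                          # wedding talk has run dry
    "castle": ["Fangclaw's castle"],
    "werewolves": ["origin of werewolves"],
}
```

A pushy NPC would list a single subject in `preferences`, a flighty one would set `patience=1`, and a reticent one would pass an empty list.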

If we allow NPCs to volunteer information and change the subject, we must also decide how the model will choose new directions of discussion.

Some of my games have used very explicit queuing -- e.g., if the player talks about Lord Fangclaw's bride, the NPC will as soon as possible ask about the bridal registry. In others, the NPC has a list of topics he wants to hit before the end of the scene, and will choose the next unexplored one each time the conversation seems about to grind to a halt. In still others, the NPC's motives and interests develop over the course of a scene, depending on mood or narrative developments.

In fact, these methods can be combined to provide both macroscopic and microscopic control over conversation flow.
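
As a rough sketch of how two of these methods might combine, the Python below gives explicit queuing priority (the microscopic control) and falls back to a scene agenda when the conversation stalls (the macroscopic control). The class and method names are hypothetical, and mood-driven motives are left out to keep the example short.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Set, Tuple


class Initiative:
    """Decides what, if anything, the NPC says on its own initiative."""

    def __init__(self, triggers: Dict[str, str], agenda: List[str]) -> None:
        self.triggers = triggers           # player quip -> NPC quip to queue
        self.agenda = list(agenda)         # topics to cover before scene end
        self.queue: Deque[str] = deque()   # pending NPC quips, oldest first
        self.covered: Set[str] = set()     # topics already discussed

    def player_said(self, quip: str, subject: str) -> None:
        """Record the player's quip and queue any triggered NPC follow-up."""
        self.covered.add(subject)
        if quip in self.triggers:
            self.queue.append(self.triggers[quip])

    def npc_move(self, stalled: bool) -> Optional[Tuple[str, str]]:
        """Queued follow-ups fire first; agenda topics only when talk stalls."""
        if self.queue:
            return ("say", self.queue.popleft())
        if stalled:
            for topic in self.agenda:
                if topic not in self.covered:
                    self.covered.add(topic)
                    return ("raise", topic)
        return None


npc = Initiative(
    triggers={"Fangclaw's bride": "ask about the bridal registry"},
    agenda=["marriage", "castle"],
)
```

After `npc.player_said("Fangclaw's bride", "marriage")`, the first call to `npc.npc_move(stalled=False)` yields the queued registry question; later, when the conversation stalls, the NPC raises the castle because marriage has already been covered.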

It may seem dangerously open-ended to step away from a rigorously pre-programmed tree of options. In practice, though, it reclaims a great deal of control for the author. The player has the freedom to drive the conversation and doesn't experience an NPC's script as quite so mechanical as in a tree system -- but the author is free to manage the pacing by minimizing dull re-navigation of the conversation tree, adjusting how proactive the NPC is, and offering longer or shorter sequences of related quips.

Once some of these pacing issues are solved, the resulting conversation feels much more like a coherent scene -- and the NPC feels less like a machine.

Modality

Discussions of Façade's conversation system often look at the interface (parsing of typed commands) and at the underlying drama management, but pay less attention to the game's lack of a separate conversation mode. Speech mingles freely with the other kinds of action available to the player, such as movement towards or away from other characters, embraces, and object manipulation.

This makes a subtle but critical difference in the way the game feels and the way the non-player characters inhabit their space. When all conversation takes place in its own interaction mode and is separate from the other action of the game (which stops for the duration), the NPCs come to seem as though they primarily inhabit a different plane of existence from the player character. Non-modal conversation, on the other hand -- however it is handled -- allows characters to react to all the player's behavior as though it in some way contributes to the dialogue.

This kind of interpenetration between game world and dialogue world is not always possible. Regular game play may simply be too different from the interface used for conversation. Making NPCs aware of what the player is doing to the surrounding environment may also require extra content-generation, and sometimes that burden may be too great.

When it works, though, this method does a great deal to elevate NPCs from the status of objects to the status of fellow-participants in the game world.
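
In code terms, non-modal conversation roughly means that speech and physical actions pass through the same channel to the NPC. The Python sketch below assumes a hypothetical `Action` type and `NPC.observe` method, with bracketed placeholder lines instead of real dialogue; it is not a description of how Façade is implemented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Action:
    kind: str    # "say", "move", "take", ...
    target: str  # a quip name, a location, an object


class NPC:
    def __init__(self, reactions: Dict[Tuple[str, str], str]) -> None:
        # Authored reactions, keyed by (kind, target). Each entry is
        # extra content someone has to write.
        self.reactions = reactions
        self.witnessed: List[Action] = []

    def observe(self, action: Action) -> Optional[str]:
        """Every player action, spoken or not, reaches the NPC here."""
        self.witnessed.append(action)
        return self.reactions.get((action.kind, action.target))


npc = NPC({
    ("say", "Fangclaw's bride"): "[reply to the quip]",
    ("take", "wedding invitation"): "[remark on the player taking the invitation]",
})
```

The size of the `reactions` table is where the content burden shows up: every non-speech action the NPC should notice needs its own authored response.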

Not all of these suggestions will be appropriate for every game, and I've really only scratched the surface of the possibilities in conversation modeling.

Notice, for instance, the immense assumption I started with, that a quip is an atomic unit made up of PC speech and NPC response. It doesn't have to be so at all -- as Façade demonstrates -- but working with a model where the input and output are not strongly coupled introduces a lot of additional complexity, and the narrative and game-play advantages are not as immediately obvious.

Still -- these are things we should be talking about. The craft of dialogue writing for games is only partly about choosing the right words. It's also about choosing the right procedures.

[Emily Short is an interactive fiction author and part of the team behind Inform 7, a language for IF creation. She also maintains a blog on interactive fiction and related topics. She can be reached at emshort AT mindspring DOT com.]