Personified user interfaces, like chat bots or agents, are the new thing once again. But despite advances in artificial intelligence, they still have many issues and drawbacks compared to direct-manipulation interfaces. There was a debate around these interfaces in the 1990s, and it seems bound to repeat itself.

In the last few weeks, Facebook unveiled a new push to use chat bots on Messenger, and Microsoft has a new platform for building bots. These bots are supposed to be the new way of doing everything, from delivering news (like Quartz’ recent iPhone app) to letting you order pizza, flowers, and everything else.

These bots act like people, in that they talk to you via text chat and try to understand free-form text. They don’t attempt to pass the Turing Test, but they promise to be smarter (and more attentive) than an overworked call center worker tending to a dozen people at the same time.

Bots’ Problems

So what’s not to like? Bots have a number of problems that have been solved, or that are at least understood well, in graphical user interfaces.

Bots are not discoverable. It is very difficult to find out what a bot will respond to, and doing so takes a lot of trial and error. Making it easy to find things is the first task in designing graphical user interfaces (GUIs). Even if you can’t tell what something does in a GUI, you can see that it’s there and can try it. With a bot, you don’t know what sorts of constructs it will understand.

Bots hide information. A graphical user interface lets you browse. You don’t have to specify exactly what you want; you can look at pictures or a list and figure it out as you go. A bot can list things, but it can’t easily present an endlessly scrolling list with lots of items. Getting a long list from a bot is about as useful as having a call center agent list your options on the phone. By the time they’re done reading them all out, you’ve forgotten half of them.

Bots require syntax. You need to specify your operations in a sort of syntax. That can either be a single sentence (if you know how to construct that so it’ll be understood), or a back-and-forth between you and the bot where it asks you for missing information. In graphical user interfaces, you can drag and drop, select from lists, swipe, shake, and use lots of other gestures.
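To make the two modes concrete, here is a minimal sketch of how a bot might collect what it needs, either from one well-formed sentence or through follow-up questions. Everything in it is an assumption made up for illustration: the pizza-ordering scenario, the slot names, the tiny vocabulary, and the helper functions parse and next_prompt. It does not describe how any real bot platform works.

```python
import re

# Hypothetical slots a pizza-ordering bot needs before it can act.
REQUIRED_SLOTS = ("size", "topping")

# Hypothetical vocabulary; anything outside it is simply not understood.
PATTERNS = {
    "size": re.compile(r"\b(small|medium|large)\b"),
    "topping": re.compile(r"\b(mushroom|pepperoni|cheese)\b"),
}


def parse(message, slots=None):
    """Fill in whatever slots the message mentions, keeping earlier ones."""
    slots = dict(slots or {})
    for name, pattern in PATTERNS.items():
        match = pattern.search(message.lower())
        if match:
            slots[name] = match.group(1)
    return slots


def next_prompt(slots):
    """Return the bot's follow-up question, or None when nothing is missing."""
    for name in REQUIRED_SLOTS:
        if name not in slots:
            return f"What {name} would you like?"
    return None


# One well-formed sentence fills every slot at once.
complete = parse("I'd like a large pepperoni pizza")

# A vaguer request triggers the back-and-forth.
partial = parse("pizza please")
question = next_prompt(partial)
partial = parse("medium", partial)
```

Even in this toy version, the user has to guess which words the bot knows. Ask for an "extra large" pizza and it only picks up "large", without ever telling you that it ignored the rest.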

Bots are never as smart as you think. We tend to assume that because we can talk to a thing, it will understand us and have all sorts of external knowledge that it can’t possibly have. That is not an issue with graphical UIs. Assuming the bot will understand and then having it come back with what is basically “Huh?” is frustrating, especially because it’s often hard, if not impossible, to figure out what you could have done instead.

The Shneiderman/Maes Debate of 1997

The teaser image above is the opener of an article from the November/December 1997 issue of ACM’s interactions magazine. It summarizes the debate between Ben Shneiderman and AI researcher Pattie Maes, which they had at two conferences that year. It seems Ben had been going after agents for a while before then, including on a 1992 panel with the great title Anthropomorphism: From ELIZA to Terminator 2.

There are a number of points in that article, not all of which apply to chat bots. But the gist is that Shneiderman wants people to see as much data as possible and be able to touch it directly (using direct manipulation), whereas Maes argues that for large amounts of information, you need agents to sort through it all.

The problem is that despite Maes’ insistence that agents learn and understand what the user wants, they are still extremely dumb. All I need to do to see the state of the art is to look at my Amazon recommendations: Oh, you bought a battery recently? You must be into batteries! Here, look at all these other batteries we have! The same is true of things that you don’t typically buy very often, like cameras. Look at all these other cameras! This one would go great with the one you just bought! Also this completely incompatible lens!

Maes also closes by saying that agents are not an alternative to direct manipulation, but require it as their own user interfaces. That may be the case for some of them, but especially with chat bots, the bot is there instead of the graphical interface.

There is also a strangely elitist vibe to these bots that bothers me tremendously. The people creating these things assume that we like to think of computers as our servants. They ask what we wish them to do, how we’re doing, etc. I don’t want a servant, and I don’t care if it’s a person or a machine. I don’t want to have to go through some real or pretend intelligence when performing a task that could just as easily be done with a few mouse clicks. That whole notion of a personal assistant just rubs me the wrong way.

It also takes control out of my hands. Instead of being able to just perform a task, I have to go through an intermediary. Why? All the uses I’ve seen so far could have just as easily (and much better) been done with a simple app or website. The indirection a bot creates interferes with my sense of control and involvement. I’m no longer the active party; I’m just yelling instructions. That seems like an odd way of interacting with a machine.

Bot Niches

Chat bots are a reasonable solution when all you have to talk to a service is a text connection, which is why they are marketed as a great way to reach people in the developing world who have cell phones and can text, but whose connections are slow and unreliable. But with a reasonably good connection and screen, it’s much more comfortable to use a well-designed user interface than to talk to a bot.

I can also see bots being useful for people with vision or other impairments, since bots can help guide them when they can’t easily see the interface or interact with it.

Outside of these niches, I really don’t understand the appeal of bots. The way they act and force me to interact, and the way they restrict my access to information, in no way make up for whatever advantages they might have. More than that, they strike me as disempowering the user and thus taking many steps back from modern graphical user interfaces that literally let you touch the actions and data you’re dealing with.