How we decode 'noisy' language in daily life

Suppose you hear someone say, "The man gave the ice cream the child." Does that sentence seem plausible? Or do you assume it is missing a word, as in "The man gave the ice cream to the child"? A new study by MIT researchers indicates that when we process language, we often make these kinds of mental edits. Moreover, it suggests that we use specific strategies for making sense of confusing information -- the "noise" interfering with the signal conveyed in language, as researchers think of it.

"Even at the sentence level of language, there is a potential loss of information over a noisy channel," says Edward Gibson, a professor in MIT's Department of Brain and Cognitive Sciences (BCS) and Department of Linguistics and Philosophy.

Gibson and two co-authors detail the strategies at work in a new paper, "Rational integration of noisy evidence and prior semantic expectations in sentence interpretation," published today in the Proceedings of the National Academy of Sciences.

"As people are perceiving language in everyday life, they're proofreading, or proof-hearing, what they're getting," says Leon Bergen, a PhD student in BCS and a co-author of the study. "What we're getting is quantitative evidence about how exactly people are doing this proofreading. It's a well-calibrated process."

Asymmetrical strategies

The paper is based on a series of experiments the researchers conducted, using the Amazon Mechanical Turk survey system, in which subjects were presented with a series of sentences -- some evidently sensible, and others less so -- and asked to judge what those sentences meant.

A key finding is that people are more likely to think something is amiss in a sentence with only one apparent problem than in a sentence where two edits may be needed. In the latter case, people seem to assume not that the sentence is more thoroughly flawed, but that it has an alternate meaning entirely.

"The more deletions and the more insertions you make, the less likely it will be you infer that they meant something else," Gibson says. When readers have to make one such change to a sentence, as in the ice cream example above, they think the original version was correct about 50 percent of the time. But when people have to make two changes, they think the sentence is correct even more often, about 97 percent of the time.

Thus the sentence, "Onto the cat jumped a table," which might seem to make no sense, can be made plausible with two changes -- one deletion and one insertion -- so that it reads, "The cat jumped onto a table." And yet, almost all the time, people will not infer that those changes are needed, and assume the literal, surreal meaning is the one intended.

This finding interacts with another one from the study: Listeners treat insertions and deletions in a systematically asymmetrical way.

"People are much more likely to infer an alternative meaning based on a possible deletion than on a possible insertion," Gibson says.

Suppose you hear or read a sentence that says, "The businessman benefitted the tax law." Most people, it seems, will assume that sentence has a word missing from it -- "from," in this case -- and fix the sentence so that it now reads, "The businessman benefitted from the tax law." But people are less likely to judge sentences containing an extra word, such as "The tax law benefitted from the businessman," to be incorrect, implausible as they may seem.
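The interplay between the number of edits and the deletion/insertion asymmetry can be sketched as a toy Bayesian calculation in Python. This is an illustration only, not the model from the paper: the function name, the simplification that an intact sentence is heard exactly as spoken, and every probability below are assumptions chosen to make the pattern visible.

```python
def literal_posterior(prior_literal, n_deletions, n_insertions,
                      p_deletion=0.01, p_insertion=0.001):
    """Toy estimate of how strongly a listener favors the literal reading.

    prior_literal: assumed prior that a speaker intends the odd, literal meaning.
    n_deletions / n_insertions: how many words noise would have had to drop or
    add to turn the plausible alternative into the sentence that was heard.
    p_deletion / p_insertion: invented per-word noise rates (deletions are
    assumed to be more common than insertions).
    """
    prior_alternative = 1.0 - prior_literal
    # Chance that noise corrupted the plausible sentence into the heard one.
    noise_likelihood = (p_deletion ** n_deletions) * (p_insertion ** n_insertions)
    # Simplifying assumption: a literal sentence arrives intact with likelihood 1.
    literal_score = prior_literal
    alternative_score = prior_alternative * noise_likelihood
    return literal_score / (literal_score + alternative_score)


PRIOR_LITERAL = 0.01  # hypothetical: odd meanings are rarely intended

one_deletion = literal_posterior(PRIOR_LITERAL, n_deletions=1, n_insertions=0)
one_insertion = literal_posterior(PRIOR_LITERAL, n_deletions=0, n_insertions=1)
two_edits = literal_posterior(PRIOR_LITERAL, n_deletions=1, n_insertions=1)

print(f"one deletion:  {one_deletion:.3f}")
print(f"one insertion: {one_insertion:.3f}")
print(f"two edits:     {two_edits:.3f}")
```

With these made-up numbers, the literal reading is least favored when a single dropped word would explain the sentence, more favored when a single added word would, and most favored when both kinds of edit are needed.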

The researchers also found that when people are presented with an increasing proportion of seemingly nonsensical sentences, they actually infer lower amounts of "noise" in the language. In other words, people adapt when processing language: If every sentence in a longer sequence seems silly, people are reluctant to think all the statements must be wrong, and hunt for a meaning in those sentences. By contrast, they perceive greater amounts of noise when only the occasional sentence seems obviously wrong, because the mistakes so clearly stand out.
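One way to picture this adaptation is a short Python sketch in which the listener's prior for odd meanings rises with the share of odd sentences encountered so far. That mechanism is an assumption made here for illustration, as are the function name and the noise rate; the article itself does not spell out the underlying model.

```python
def literal_posterior(share_of_odd_sentences, p_edit=0.01):
    """Toy posterior that an odd sentence was meant literally.

    share_of_odd_sentences: fraction of recent sentences that seemed
    nonsensical, used here as the prior that an odd meaning is intended
    (a hypothetical stand-in for whatever listeners actually track).
    p_edit: invented probability that noise produced the odd sentence
    from a plausible one.
    """
    literal_score = share_of_odd_sentences
    alternative_score = (1.0 - share_of_odd_sentences) * p_edit
    return literal_score / (literal_score + alternative_score)


shares = [0.01, 0.10, 0.50]
posteriors = [literal_posterior(share) for share in shares]

for share, posterior in zip(shares, posteriors):
    print(f"odd sentences: {share:.0%}  literal reading favored: {posterior:.3f}")
```

In this sketch, the more often odd sentences appear, the more the listener takes each one at face value instead of blaming noise.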

"People seem to be taking into account statistical information about the input that they're receiving to figure out what kinds of mistakes are most likely in different environments," Bergen says.

Reverse-engineering the message

Other scholars say the work helps illuminate the strategies people may use when they interpret language.

"I'm excited about the paper," says Roger Levy, a professor of linguistics at the University of California at San Diego who has done his own studies in the area of noise and language.

According to Levy, the paper posits "an elegant set of principles" explaining how humans edit the language they receive. "People are trying to reverse-engineer what the message is, to make sense of what they've heard or read," Levy says.

"Our sentence-comprehension mechanism is always involved in error correction, and most of the time we don't even notice it," he adds. "Otherwise, we wouldn't be able to operate effectively in the world. We'd get messed up every time anybody makes a mistake."

The study was supported by a grant from the National Science Foundation.