Next, He staged a pun contest, pitting the AI against (human) humorists. According to the crowdworkers who rated the puns, the results were … not great for the machines, at least by human standards. While He’s system produced puns that were much funnier than a previous AI-driven attempt, it only beat the humans 10 percent of the time. Plus, the puns were stuck in a rather rudimentary structure (and struggled at times with grammar). Some examples:

That’s because negotiator got my car back to me in one peace.

Even from the outside, I could tell that he’d already lost some wait.

Well, gourmet did it, he thought, it’d butter be right.

“We’re nowhere near solving this,” He says.

Still, Roger Levy, director of MIT’s computational psycholinguistics lab, says the approach is a promising step toward building AI with a bit more personality. “Humor is an intrinsically challenging aspect of studying the mind. But it’s also fundamental to what makes us human,” he says. Four years ago, Levy described a computational approach to predicting whether a pun is funny, work that would eventually become the foundation for He’s joke-generation method. Levy says he had planned on testing something like the local-global surprisal principle, which is more fine-grained than the theories used in his paper. The concept made sense, intuitively, but he didn’t yet have the data to prove it. “It’s really cool to see that actually pan out,” he says.


More broadly, the humor research highlights the need to bring more human intelligence to neural nets, Levy says. Recently, he’s been using surprise as a way to study other aspects of how AI understands language. “Surprisal is one of the most central concepts in both AI and cognitive science,” Levy says. In humans, it reflects when we encounter new or unexpected information, and can be measured with a proxy, like tracking eye movements as we read. In machines, it’s measured with probabilities: a word with lower probability in a given context is more surprising.
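In information-theoretic terms, a word’s surprisal is the negative log of its probability in context. Here is a minimal sketch of that calculation; the probability values below are made up for illustration, not drawn from any real language model:

```python
import math

def surprisal(prob):
    """Surprisal in bits: -log2 of a word's probability in its context."""
    return -math.log2(prob)

# Hypothetical conditional probabilities for the next word after
# "I could tell that he'd already lost some ..." (illustrative numbers only).
context_probs = {"weight": 0.6, "money": 0.3, "wait": 0.001}

for word, p in context_probs.items():
    print(f"{word}: {surprisal(p):.2f} bits")
```

Under these toy numbers, the expected word “weight” carries less than a bit of surprisal, while the pun word “wait” carries nearly ten, which is the sense in which a low-probability word is “more surprising” to the model.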

That makes surprise a handy way of comparing how human brains and machines reason their way through language, a way of probing the inner workings of our respective black boxes. By subjecting neural networks to a set of psycholinguistic tests designed to study how humans handle ambiguous language, Levy found he could begin to see where the machines were unexpectedly thrown off-kilter or blew past challenges in un-humanlike ways. Adjusting for those differences, he says, could be the key to designing AI with more humanlike behavior.

In the meantime, He says she hopes to apply her general pun approach to more difficult creative tasks, like storytelling. The idea, she says, is to let the neural network do what it’s good at and then edit the result with human intelligence. A neural network could be trained to generate a dull string of perfectly coherent sentences, for example, and then learn to edit that output into a creative short story based on theories of narrative. “The goal is to make stories that are more creative and interesting,” He says. “I want AI to write stories about things humans wouldn’t think to write about.”
