the sun is a beautiful thing

in silence is drawn

between the trees

only the beginning of light

Was that poem written by an angsty middle schooler or an artificially intelligent algorithm? Is it easy to tell?

Yeah, it’s not easy for us, either. Or for poetry experts, for that matter.

A team of researchers from Microsoft and Kyoto University developed a poet AI good enough to fool online judges, according to a paper published Thursday on the preprint site arXiv. It’s the latest step towards artificial intelligence that can create believable, human-passing language, and, man, it seems like a big one.

In order to generate something as esoteric as a poem, the AI was fed thousands of images paired with human-written descriptions and poems. This taught the algorithm associations between images and text. It also learned the patterns of imagery, rhymes, and other language that might make up a believable poem, as well as how certain colors or images relate to emotions and metaphors.

Once trained, the AI was given an image and tasked with writing a poem that was not only relevant to the picture but also, you know, read like a poem instead of algorithmic nonsense.

And to be fair, some of the results were pretty nonsensical, even beyond the sorts of nonsense you’d find in a college literary magazine.

this realm of rain

grey sky and cloud

it’s quite and peaceful

safe allowed

And, arguably, worse:

I am a coal-truck

by a broken heart

I have no sound

the sound of my heart

I am not

You could probably (we hope) pick those out of the crowd as machine-written. But while the AI is no Kendrick Lamar, many of the resulting poems actually did look like poems.

Next, the researchers had to see whether the average person could tell the difference. That meant a Turing test of sorts.

The researchers found their judges on Amazon Mechanical Turk, an online marketplace where people are paid to complete small tasks that still require human intelligence, and divided them into general users and “experts,” who had some sort of background in literary academia. These judges were then presented with poem after poem, sometimes with the associated picture and sometimes without. They had to guess whether a human or the AI had written each one.

The experts were better at identifying machine-written poems when they were given the image, while general users were better without it. But both groups were better at picking out the human-written poems than they were at identifying the ones written by the new AI.

That is to say, the machines had them fooled more often than not.