Out of this world (Stanford University and Intel)

Take a look at the above image of a German street. At a glance it could be a blurry dashcam photo, or a snap that’s gone through one of those apps that turns photos into paintings.

But you won’t find this street anywhere on Google Maps. That’s because it was generated by an imaginative neural network, stitching together its memories of real streets it was trained on.

Nothing in the image actually exists, says Qifeng Chen at Stanford University, California, and Intel. Instead, his AI works from rough layouts that tell it what should be in each part of the image. The centre of the image might be labelled “road” while other sections are labelled “trees” or “cars” – it’s painting by numbers for an AI artist.
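To make the painting-by-numbers idea concrete, here is a minimal sketch of what such a layout could look like as data. The class names, the tiny grid and the `one_hot` helper are all illustrative assumptions, not taken from Chen's system; splitting the layout into one binary mask per label is simply a common way of feeding this kind of map to a neural network.

```python
# Hypothetical label set; the article does not list the classes the real system uses.
CLASSES = ["road", "trees", "cars", "sky"]

# A tiny 4x6 layout: each cell names what should appear in that part of the image.
layout = [
    ["sky",   "sky",   "sky",  "sky",  "sky",   "sky"],
    ["trees", "trees", "sky",  "sky",  "trees", "trees"],
    ["trees", "cars",  "road", "road", "cars",  "trees"],
    ["road",  "road",  "road", "road", "road",  "road"],
]

def one_hot(layout, classes):
    """Turn a grid of label names into one binary mask per class."""
    index = {name: i for i, name in enumerate(classes)}
    masks = [[[0] * len(row) for row in layout] for _ in classes]
    for y, row in enumerate(layout):
        for x, name in enumerate(row):
            masks[index[name]][y][x] = 1
    return masks

masks = one_hot(layout, CLASSES)
road_mask = masks[CLASSES.index("road")]  # 1 wherever the layout says "road"
```

Each cell belongs to exactly one class, so the masks never overlap: the "road" mask marks the centre and bottom of the grid, and the other masks cover the rest.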


Chen says the technique could eventually create game worlds that truly resemble the real world. “Using deep learning to render video games could be the future,” he says. He has already experimented with using the algorithm to replace the game world in Grand Theft Auto V.

Realism is tricky

Noah Snavely at Cornell University, New York, is impressed. Generating realistic-looking artificial scenes is a tricky problem, he says, and even the best existing approaches can’t do it. Chen’s system creates the largest and most detailed examples of their kind he has seen.

Snavely says that the technology could allow people to describe a world, and then have an AI build it in virtual reality. “It’d be great if you could conjure up a photorealistic scene just by describing it aloud,” he says.

Chen’s system starts with a photo of a real street that it hasn’t seen before, but which has been labelled so the AI knows which bits are supposed to be cars, people, roads and so on. The AI then uses this layout as a guide to generate a completely new image.

The AI was trained on 3000 images of German streets, so when it comes across part of the photo labelled “car” it draws on its existing knowledge to generate a car there in its own creation. “We want the network to memorise what it’s seen in the data,” Chen says.
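The train-then-generate loop can be illustrated with a deliberately crude stand-in. The sketch below is not a neural network and not Chen's method: it just memorises the average colour seen under each label and paints that colour back wherever the label appears in a new layout. The function names `train` and `generate` and the made-up pixel values are assumptions for illustration only.

```python
from collections import defaultdict

def train(pairs):
    """Memorise the average RGB colour seen under each label.

    pairs: list of (layout, image), where layout is a grid of label names
    and image is a same-sized grid of (r, g, b) tuples.
    """
    sums = defaultdict(lambda: [0, 0, 0])
    counts = defaultdict(int)
    for layout, image in pairs:
        for label_row, pixel_row in zip(layout, image):
            for label, pixel in zip(label_row, pixel_row):
                for c in range(3):
                    sums[label][c] += pixel[c]
                counts[label] += 1
    # Integer average per colour channel for every label seen in training.
    return {label: tuple(s // counts[label] for s in sums[label]) for label in sums}

def generate(layout, palette):
    """Paint a new image by looking up the memorised colour for each label."""
    return [[palette[label] for label in row] for row in layout]

# Made-up training data: two 1x2 "photos" with their hand-labelled layouts.
training = [
    ([["road", "car"]], [[(100, 100, 100), (200, 0, 0)]]),
    ([["road", "car"]], [[(120, 120, 120), (100, 0, 0)]]),
]
palette = train(training)

# A layout never seen in training still yields an image built from memory.
new_image = generate([["car", "road", "road"]], palette)
```

A real network learns far richer structure than a single colour per label, which is why its output looks like a street rather than flat patches, but the division of labour is the same: the training images supply the appearance and the layout decides where each thing goes.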

Intel researchers will present the work at this year’s International Conference on Computer Vision, which takes place in Venice, Italy, in late October.

Dreamlike quality

The algorithm was also trained and tested on a smaller database of photos of domestic interiors, but Snavely says that to realise its potential it needs a data set that captures the true diversity of the world. That’s easier said than done, however, as each component in the training images needs to be labelled by hand, and creating a data set with that level of detail is extremely labour-intensive.

Chen says his system still has a long way to go before it can build truly photorealistic worlds. The images it produces right now have a blurry, dreamlike quality, as the network isn’t able to fill in all the details we expect in photos. He is already working on a larger version of the system that he hopes will be much more capable.

But when it comes to building worlds in virtual reality, that dreamlike nature might not be such a bad thing, says Snavely. We’re used to seeing super-slick and realistic worlds on film and in video games, but there’s not quite that level of expectation when it comes to VR. “You don’t need total photorealism,” he says.

Reference: arxiv.org/abs/1707.09405