Computers can now daydream, and what better place to start than a bedroom? Researchers from machine learning firm indico in Boston, Massachusetts, and Facebook’s AI research lab in New York have developed an artificial neural network that can conjure up realistic-looking but imagined photographs on demand.

Neural networks work by analysing training data like photos and learning to recognise certain properties, from simple geometric shapes to complex figures like cats or faces.

Because the workings of these systems are rather opaque, researchers are trying to understand exactly how the different components of a network learn to represent particular objects. Google’s psychedelic DeepDream landscapes are a recent example of these efforts.

Now, Alec Radford of indico and his colleagues have looked at a particular type of network called a generative adversarial network, in which one part of the system, the generator, tries to invent fake data that fools another part, the discriminator, into mistaking it for training data. The idea is that by repeatedly pitting the network against itself, it will learn to produce better images.
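That adversarial tug-of-war can be sketched in a few lines. The toy below is not the team’s network: it uses plain NumPy, a linear “generator” that maps noise to points on a line, and a logistic-regression “discriminator”, with all names and numbers invented for illustration. The generator is nudged to fool the discriminator, while the discriminator is nudged to tell real samples from fakes.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data: samples from a 1-D Gaussian the generator must imitate.
real_data = rng.normal(loc=4.0, scale=0.5, size=1000)

w_g, b_g = 1.0, 0.0   # generator: fake = w_g * z + b_g
w_d, b_d = 0.1, 0.0   # discriminator: p(real) = sigmoid(w_d * x + b_d)
lr = 0.05

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for step in range(2000):
    z = rng.normal(size=64)
    fake = w_g * z + b_g                       # generator invents samples
    real = rng.choice(real_data, size=64)

    # Discriminator step: push p(real) toward 1 and p(fake) toward 0
    # (gradients of the usual cross-entropy loss w.r.t. w_d and b_d).
    p_real = sigmoid(w_d * real + b_d)
    p_fake = sigmoid(w_d * fake + b_d)
    w_d -= lr * (np.mean((p_real - 1) * real) + np.mean(p_fake * fake))
    b_d -= lr * (np.mean(p_real - 1) + np.mean(p_fake))

    # Generator step: push p(fake) toward 1, i.e. fool the discriminator,
    # back-propagating through the discriminator into w_g and b_g.
    p_fake = sigmoid(w_d * (w_g * z + b_g) + b_d)
    upstream = (p_fake - 1) * w_d
    w_g -= lr * np.mean(upstream * z)
    b_g -= lr * np.mean(upstream)
```

With each pair of updates the generator’s outputs drift toward the real distribution; once the discriminator can no longer tell the two apart, its gradients shrink and training settles.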

The team trained the system on a database of photos of bedrooms, then asked it to produce its own photos. To prove it wasn’t just copying the original data, they asked the network to generate a series of related pictures of the same scene, such as a bedroom with or without a window, or one in which a TV morphs into a window. This shows that the network has figured out how certain features fit into a bedroom scene, they say. “We can reason that the model has learned relevant and interesting representations.”
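Those smoothly morphing scenes come from walking through the generator’s input space: every image corresponds to a random noise vector, and blending between two vectors yields a series of in-between scenes. A minimal sketch, assuming a hypothetical trained generator (not shown) that would turn each vector into an image:

```python
import numpy as np

def interpolate_latents(z_start, z_end, steps=9):
    """Linearly blend between two latent noise vectors.

    Feeding each intermediate vector to a trained generator would give
    a smooth series of scenes, e.g. a bedroom whose TV gradually turns
    into a window.
    """
    alphas = np.linspace(0.0, 1.0, steps)
    return np.array([(1 - a) * z_start + a * z_end for a in alphas])

rng = np.random.default_rng(42)
z_a = rng.normal(size=100)   # latent code for, say, a room with a TV
z_b = rng.normal(size=100)   # latent code for a room with a window
frames = interpolate_latents(z_a, z_b)
# frames[0] is exactly z_a, frames[-1] is exactly z_b,
# with seven blended vectors in between
```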

It also meant they could ask the network to generate images without particular features. The team drew boxes around windows in some of the training data, then told the network to ignore them. When this filter was active, it generated bedroom scenes in which windows were replaced with similar objects, like doors or mirrors.
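The idea of switching off a learned feature can be illustrated with a toy model. This is a simplification of what the paper actually does (identifying and dropping the internal filters that respond to windows); here the generator’s final stage is pretended to be a simple sum of named feature maps, with all names and values invented:

```python
import numpy as np

# Hypothetical per-concept feature maps in the generator's last layer.
feature_maps = {
    "bed":    np.full((4, 4), 2.0),
    "window": np.full((4, 4), 1.0),
    "wall":   np.full((4, 4), 0.5),
}

def render(maps, drop=()):
    """Sum the feature maps, skipping any whose name is in `drop`.

    Skipping a map mimics switching off the filters identified as
    "window detectors": the rest of the scene is rendered as usual,
    and in a real network other features fill the gap.
    """
    total = np.zeros((4, 4))
    for name, fmap in maps.items():
        if name not in drop:
            total += fmap
    return total

with_window = render(feature_maps)
without_window = render(feature_maps, drop=("window",))
```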

The same principle can be applied to photos of other things, like faces. In another experiment, the team trained the network on pictures of faces, then tried out a kind of visual arithmetic. They gave the network pictures of smiling women, then told it to “subtract” pictures of women with neutral expressions and “add” men with neutral expressions. The goal was to extract the concept of “smiling” and combine it with the concept of “man”.

The results were entirely imagined pictures of smiling men. Trying the same thing just by adding or subtracting pixels in the images resulted in a blurred mess, showing that the network really was learning how to create its own photos.
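The arithmetic happens on the latent vectors, not the pixels. A hedged sketch, with random stand-ins for the averaged vectors (in the real experiment each would be the mean of the noise vectors that produced images matching the description):

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-in averaged latent vectors for three face concepts
# (random here purely for illustration).
z_smiling_woman = rng.normal(size=100)
z_neutral_woman = rng.normal(size=100)
z_neutral_man = rng.normal(size=100)

# "smiling woman" - "neutral woman" + "neutral man" ~ "smiling man":
# subtracting the neutral woman isolates "smiling", adding the neutral
# man recombines it with "man".
z_smiling_man = z_smiling_woman - z_neutral_woman + z_neutral_man
# Decoding z_smiling_man with the trained generator would yield an
# entirely imagined picture of a smiling man.
```

Doing the same sum on raw pixels averages unrelated images together, which is why the pixel-space version came out as a blurred mess.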

For now the images are limited to just 32 x 32 pixels, which makes the data simpler to crunch through and any errors harder for humans to notice. But scaling up the system could lead to a kind of Google image search for pictures that don’t actually exist: you’d simply write a description and the computer would generate a matching image for you. The team say they are also interested in applying the same model to video and audio.

Reference: arXiv, http://arxiv.org/abs/1511.06380

Image credits: Alec Radford, Luke Metz, Soumith Chintala