The problem is that to train an AI to make this sort of inference, researchers have to feed it tremendous amounts of carefully labeled data. And neural networks often have trouble applying lessons learned in one scene to another. The key, then, was creating a neural network that could understand its surroundings on its own.

Enter the Generative Query Network (GQN) from DeepMind. This neural network differs from others in that it trains only on observations it gathers from its surroundings, not on labeled data supplied by researchers. As a result, GQN learns to make sense of the world and applies those observations to new scenes it encounters.

After exposing GQN to controlled environments, the researchers exposed it to randomly generated ones. It was able to imagine a scene from different angles and create a three-dimensional rendering from 2D images. It was also able to identify and classify objects without pre-supplied labels, and to make inferences from what it could see to figure out what it couldn't.
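Conceptually, GQN pairs a representation network, which encodes each observed image together with the viewpoint it was taken from, with a generation network, which renders a prediction of the scene from a new, unseen viewpoint. The following is a minimal numpy sketch of that idea only; the dimensions, weights, and function names are placeholders, not DeepMind's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: flattened image, camera viewpoint, scene representation.
IMG_DIM, VIEW_DIM, REP_DIM = 64, 7, 32

# Random matrices standing in for trained networks (illustration only).
W_enc = rng.standard_normal((IMG_DIM + VIEW_DIM, REP_DIM)) * 0.1
W_gen = rng.standard_normal((REP_DIM + VIEW_DIM, IMG_DIM)) * 0.1

def encode(image, viewpoint):
    """Representation network: one (image, viewpoint) pair -> one embedding."""
    return np.tanh(np.concatenate([image, viewpoint]) @ W_enc)

def render(scene_rep, query_viewpoint):
    """Generation network: scene representation + query viewpoint -> predicted image."""
    return np.tanh(np.concatenate([scene_rep, query_viewpoint]) @ W_gen)

# A few observations of the same scene, each from a different viewpoint.
observations = [(rng.standard_normal(IMG_DIM), rng.standard_normal(VIEW_DIM))
                for _ in range(3)]

# Summing the embeddings gives an order-invariant scene representation,
# so the model can accept any number of observations in any order.
scene_rep = np.sum([encode(img, vp) for img, vp in observations], axis=0)

# Query the scene from a viewpoint the model has never observed.
prediction = render(scene_rep, rng.standard_normal(VIEW_DIM))
print(prediction.shape)  # one predicted image vector
```

The key design point this sketch captures is the split of labor: everything the network knows about a scene must pass through the single aggregated representation, which is what pushes it to learn a compact, viewpoint-independent description of its surroundings.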

The findings were published in the journal Science, and you can read the full PDF here. The researchers note that there are, of course, limitations to GQN. So far, it has only been trained on synthetic scenes; it's unclear how it would fare with real-world images. DeepMind's blog post notes, "While there is still much more research to be done before our approach is ready to be deployed in practice, we believe this work is a sizeable step towards fully autonomous scene understanding."