Why can’t computers watch the Earth from above and automatically map our roads, buildings, and trash heaps? Satellite operator DigitalGlobe is teaming up with Amazon, the venture arm of the CIA, and chipmaker Nvidia to try to make it happen.

In a joint project, DigitalGlobe today released satellite imagery depicting the whole of Rio de Janeiro at a resolution of 50 centimeters. The outlines of 200,000 buildings inside the city's roughly 1,900 square kilometers have been manually marked on the photos. The SpaceNet data set, as it is called, is intended to spark efforts to train machine-learning algorithms to interpret high-resolution satellite photos on their own.

DigitalGlobe says the SpaceNet data set should eventually include high-resolution images of half a million square kilometers of Earth, and that it will add annotations beyond just buildings. DigitalGlobe's data is much more detailed than publicly available satellite data such as NASA's, which typically has a resolution of tens of meters. Amazon will make the SpaceNet data available via its cloud computing service, and Nvidia will provide tools to help machine-learning researchers train and test algorithms on the data. CosmiQ Works, a space-focused division of In-Q-Tel, the CIA’s venture arm, is also supporting the project.
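
To put those resolution figures in perspective, here is a rough back-of-the-envelope sketch in Python. The 10-meter building footprint and the 30-meter value standing in for "tens of meters" are illustrative assumptions, not figures from DigitalGlobe or NASA, and the function names are made up for this example.

```python
def pixels_covering(footprint_side_m: float, resolution_m: float) -> float:
    """Approximate number of pixels covering a square footprint.

    resolution_m is the ground distance spanned by one pixel side.
    """
    return (footprint_side_m / resolution_m) ** 2


def pixels_for_area(area_km2: float, resolution_m: float) -> float:
    """Approximate number of pixels needed to image an area once."""
    area_m2 = area_km2 * 1_000_000  # 1 square kilometer = 1,000,000 square meters
    return area_m2 / resolution_m ** 2


# Hypothetical building, 10 meters on a side (an assumption for illustration).
print(pixels_covering(10, 0.5))   # at 50-centimeter resolution
print(pixels_covering(10, 30))    # at an assumed 30-meter resolution

# The roughly 1,900 square kilometers of Rio de Janeiro at 50 centimeters.
print(pixels_for_area(1900, 0.5))
```

Under these assumptions, the building spans hundreds of pixels in the 50-centimeter imagery but only a fraction of a single pixel at 30 meters, which is why coarser public data is of little use for outlining individual structures.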

Software will be trained to label buildings in satellite images using a data set of images like this one.

“We need to develop new algorithms for this data,” says Tony Frazier, a senior vice president at DigitalGlobe. The company operates four imaging satellites and provides data to U.S. intelligence agencies, humanitarian agencies, and other organizations that today mainly rely on humans to extract data from images.

Frazier says it should be possible to train software to do things like map the roads and buildings of shanty towns, track changes to urban infrastructure such as park benches and stop signs, and measure the materials used in roofs and other structures. That kind of information could be commercially valuable and could help inform health and aid programs, he says.

Mark Johnson, CEO of Descartes Labs, a startup that predicts crop yields from public satellite images, says the new data should be welcome to startups and researchers. Potential applications could include estimating economic output from activity in urban areas, or guiding city governments on how to improve services such as trash collection, he says.

SpaceNet is modeled on ImageNet, a collection of more than 1 million labeled photos that has underpinned image recognition research for years, including the recent, huge jumps in the accuracy of image recognition software (see “The Revolutionary Technique That Quietly Changed Machine Vision Forever”). Companies such as Google and Facebook use image recognition technology built on ideas first tested against ImageNet.