This story has been updated with additional information and corrections provided by Google after the interview.

In May, Google unveiled Earth Engine, a set of technologies and services that combine Google's existing global mapping capabilities with decades of historical satellite data from both NASA and the US Geological Survey (USGS). One of the first products emerging from Earth Engine is Timelapse—a Web-based view of changes on the Earth's surface over the past three decades, published in collaboration with Time magazine.

The "Global Timelapse" images are also viewable through the Earth Engine site, which allows you to pan and zoom to any location on the planet and watch 30 years of change, thanks to 66 million streaming video tiles. The result is "an incontrovertible description of what's happened on our planet due to urban growth, climate change, et cetera," said Google Vice President of Research and Special Initiatives Alfred Spector.

But that's just the surface of what Google has created with Earth Engine. In an exclusive interview with Ars Technica, Spector and Google Visiting Scientist Randy Sargent drilled down on how Google is using software developed by Sargent's team at Carnegie Mellon University's CREATE Lab to generate what amounts to an animated 52-terapixel time-lapse portrait of the planet (51.6 terapixels in total across 29 yearly frames, or 1.78 terapixels per frame). Here's how the company did it.

The big picture

"We began to realize a few years back that Google Maps could be augmented to support all sorts of data," Spector said. "We had this idea that we could extend it to support multispectral (imagery) data. And given that we were getting feeds over time, we could store them as time-based sets, so you could go back and forth in time to look at changes. That idea became the Earth Engine project."

More than 40 years of NASA satellite data has been "ingested into Earth Engine," said Sargent. "That's been married to Google's compute infrastructure, so you can detect deforestation or find land use changes."

Sargent and the Earth Engine team used 909 terabytes of data from the Landsat 4, 5, and 7 satellites—with each of the million images weighing in at more than 100 megapixels.

Landsat's polar orbit allows each satellite to take a full set of images of the Earth's surface every 16 days. But not all of those images are keepers due to weather and other factors. "It's not as easy as just lining up the pixels," Sargent said. "Most of the challenges involved dealing with the atmosphere—if it's cloudy, you're not seeing anything. And if it's hazy, you have to look through it. So we had to build mosaics that excluded cloudy images and then correct for haze."

To do that, Google used 20 terabytes of data from MODIS (MODerate resolution Imaging Spectroradiometer) sensors on NASA's Earth Observing System Terra and Aqua satellites. "MODIS captures the entire planet daily," said Sargent. "It has enough different spectral bands (ultraviolet through infrared) that it helps us analyze what it sees in the atmosphere." Using MODIS' MCD43A4 data (which provides information on ground and atmospheric reflectance), the Earth Engine team built a cloud-free, low-resolution model of the Earth for each year for which data was available.

That data was used to create statistical estimates for the color of each pixel of Landsat coverage and to correct for seasonal variance in vegetation, haze, and cloud cover.
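The article does not spell out the statistics involved, but the general idea of building a mosaic that excludes cloudy observations can be sketched with a per-pixel median. This is a common compositing approach used here purely for illustration, not Google's confirmed method; the function names, the `cloud_score` field, and the 0.3 threshold are all assumptions.

```python
from statistics import median

def composite_pixel(observations, cloud_threshold=0.3):
    """Combine one pixel's observations from a year into a single value.

    observations: list of (reflectance, cloud_score) pairs, where
    cloud_score is a hypothetical 0..1 cloudiness estimate.
    Returns the median reflectance of the clear observations,
    or None if every observation was too cloudy.
    """
    clear = [value for value, cloud in observations if cloud < cloud_threshold]
    if not clear:
        return None
    return median(clear)

def composite_mosaic(stack, cloud_threshold=0.3):
    """Apply composite_pixel to every pixel location.

    stack: dict mapping (row, col) -> list of (reflectance, cloud_score).
    """
    return {
        location: composite_pixel(observations, cloud_threshold)
        for location, observations in stack.items()
    }

if __name__ == "__main__":
    stack = {
        (0, 0): [(0.20, 0.10), (0.90, 0.80), (0.22, 0.05), (0.24, 0.00)],
        (0, 1): [(0.95, 0.90), (0.88, 0.70)],  # cloudy all year
    }
    print(composite_mosaic(stack))
```

A pixel that comes back as `None` had no clear view that year, which is the kind of gap that has to be filled from other sources.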

Early years of the dataset had gaps due to the 1987 failure of Landsat 5's Ku-band transmitter—which prevented the downlink of imagery collected outside the range of US and cooperating international ground stations. This meant large chunks of Asia (particularly in China) were not covered by Landsat's archives until 1999. So to get a complete picture for each year, the data was interpolated between years where images were available.
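As a rough illustration of filling missing years, the sketch below linearly interpolates a single pixel's value between the nearest years that do have data. The article says only that the data was interpolated; the linear method, the function name, and the fallback of reusing the nearest known year at the ends of the range are assumptions.

```python
def fill_missing_years(values_by_year, first_year, last_year):
    """Fill gaps in a per-pixel time series by linear interpolation.

    values_by_year: dict mapping year -> pixel value for years with imagery.
    Years before the first or after the last known year reuse the
    nearest known value (an assumption, not from the article).
    """
    known = sorted(values_by_year)
    filled = {}
    for year in range(first_year, last_year + 1):
        if year in values_by_year:
            filled[year] = values_by_year[year]
            continue
        earlier = [y for y in known if y < year]
        later = [y for y in known if y > year]
        if earlier and later:
            y0, y1 = earlier[-1], later[0]
            t = (year - y0) / (y1 - y0)  # fraction of the way from y0 to y1
            v0, v1 = values_by_year[y0], values_by_year[y1]
            filled[year] = v0 + t * (v1 - v0)
        elif earlier:
            filled[year] = values_by_year[earlier[-1]]
        elif later:
            filled[year] = values_by_year[later[0]]
    return filled

if __name__ == "__main__":
    print(fill_missing_years({1986: 10.0, 1990: 30.0}, 1984, 1991))
```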

Processing all of the data to produce the final mosaics representing each of the 29 years covered—from 1984 to 2012—took under a day, using 260,000 core-hours of CPU time in Google's compute cloud.

Serving up the time machine

With 29 world-spanning mosaics mapped to Google's model of the Earth, the next step was to make the images explorable both in space and time. To achieve this, Sargent's Carnegie Mellon research team extended the open source GigaPan Time Machine software developed by Carnegie Mellon's CREATE Lab.

Time Machine ingests very high-resolution videos and converts them into overlapping, multi-resolution video tiles delivered as a stream. It manipulates HTML5's video tag in much the same way that Google Maps uses HTML image tags to pan and zoom.

Previous Time Machine projects had handled videos with billions of pixels of resolution. But Time-Lapse Earth pushed the envelope for Time Machine because of the size of the data. The 30-meter-per-pixel video was generated from 29 Mercator-projected mosaics created by Earth Engine, and each frame had 1.78 trillion pixels.
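The per-frame figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes the mosaic is a square Mercator map whose width spans the equator at 30 meters per pixel; both the square layout and the equatorial circumference value are assumptions, not details from Google.

```python
# Assumed equatorial circumference of the Earth, in meters.
EQUATOR_M = 40_075_017
METERS_PER_PIXEL = 30   # resolution quoted in the article
FRAMES = 29             # one mosaic per year, 1984 through 2012

# A square Mercator map is as tall as it is wide (assumption).
pixels_per_side = EQUATOR_M / METERS_PER_PIXEL
frame_pixels = pixels_per_side ** 2

# Total using the article's rounded per-frame figure of 1.78 trillion.
total_from_article = FRAMES * 1.78e12

print(f"pixels per side: {pixels_per_side:,.0f}")
print(f"pixels per frame: {frame_pixels / 1e12:.2f} trillion")
print(f"29 frames at 1.78 trillion each: {total_from_article / 1e12:.1f} trillion")
```

Under those assumptions the estimate lands at roughly 1.78 trillion pixels per frame, in line with the quoted figure, and 29 such frames give the 51.6-terapixel total.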

In order to generate the millions of overlapping videos required and integrate them into Earth Engine's geospatial search capabilities, CMU researchers had to connect Time Machine to Earth Engine and Google's computing and storage infrastructure. Just encoding the videos for Global Timelapse consumed 1.4 million core-hours of compute time. Creating the time-lapse application took three days of processing and 1.8 million core-hours in all, and at its peak it used 66,000 cores simultaneously in Google's cloud.
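To put those figures in perspective, the short calculation below derives the average number of cores in use. It assumes the three days are elapsed wall-clock time, which the article does not state explicitly; the function name is illustrative.

```python
def average_cores(core_hours, wall_clock_hours):
    """Average number of cores busy over a run of the given length."""
    if wall_clock_hours <= 0:
        raise ValueError("wall_clock_hours must be positive")
    return core_hours / wall_clock_hours

# Figures quoted in the article.
TOTAL_CORE_HOURS = 1_800_000
ENCODING_CORE_HOURS = 1_400_000
PEAK_CORES = 66_000

# Assumption: "three days" means 72 hours of elapsed time.
WALL_CLOCK_HOURS = 3 * 24

avg = average_cores(TOTAL_CORE_HOURS, WALL_CLOCK_HOURS)
encoding_share = ENCODING_CORE_HOURS / TOTAL_CORE_HOURS
peak_to_average = PEAK_CORES / avg

print(f"average cores in use: {avg:,.0f}")
print(f"share spent on video encoding: {encoding_share:.0%}")
print(f"peak-to-average ratio: {peak_to_average:.2f}")
```

On that reading, the job averaged about 25,000 cores, with video encoding accounting for a little over three-quarters of the total core-hours.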

In order to seamlessly present the final product through a Web browser as users zoom and pan through it, Time Machine created what Sargent called "a tree of tiles." Each individual video that represents a "viewport" in the Global Timelapse data is a video file, indexed in a treed table of contents by Earth Engine.
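One way to picture a "tree of tiles" is a quadtree pyramid, where each zoom level splits every tile into four. The sketch below uses that layout to find which tile covers a point at each level. Time Machine's real index format is not described in the article, so the quadtree layout, the (level, column, row) addressing, and the function names are all assumptions.

```python
def tile_at(x, y, level):
    """Return (level, col, row) of the tile covering a point.

    x, y: position normalized to 0..1 across the whole map.
    level: zoom level; level 0 is one tile, level n is 2**n tiles per side.
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("x and y must be within 0..1")
    tiles_per_side = 2 ** level
    # Clamp so that x == 1.0 or y == 1.0 falls in the last tile.
    col = min(int(x * tiles_per_side), tiles_per_side - 1)
    row = min(int(y * tiles_per_side), tiles_per_side - 1)
    return (level, col, row)

def tile_path(x, y, level):
    """Tiles from the root down to `level` that contain the point."""
    return [tile_at(x, y, current) for current in range(level + 1)]

if __name__ == "__main__":
    for tile in tile_path(0.6, 0.3, 2):
        print(tile)
```

Walking the path from the root is how a client could resolve a location and zoom level to a single video file in a tree-shaped table of contents.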

"The client makes a request for the fragment of the video for the location a user is looking at, which includes the table of contents," said Sargent. "As you're switching resolution levels or locations, you have one set of video information on screen, and on the back end we're cueing up other videos ahead of time to anticipate where you're going to look next."

The video frames extend beyond the boundary of the "viewport" given to the user so that the system has some slack in responding to a user panning around a location.
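The idea of cueing up videos ahead of time can be sketched as choosing which tiles to fetch next: the neighbors at the current zoom level, the parent tile for zooming out, and the four children for zooming in. This is an illustrative guess at a prefetch policy, not Time Machine's actual logic; the quadtree addressing, the east-west wraparound, and the function name are assumptions.

```python
def prefetch_candidates(level, col, row, max_level):
    """Tiles worth loading ahead of time for the tile at (level, col, row).

    Assumes a quadtree with 2**level tiles per side, and that the map
    wraps around horizontally (east-west) but not vertically.
    """
    tiles_per_side = 2 ** level
    candidates = set()

    # Neighbors at the same zoom level, for panning.
    for d_col in (-1, 0, 1):
        for d_row in (-1, 0, 1):
            neighbor_row = row + d_row
            if 0 <= neighbor_row < tiles_per_side:
                neighbor_col = (col + d_col) % tiles_per_side
                candidates.add((level, neighbor_col, neighbor_row))

    # Parent tile, for zooming out.
    if level > 0:
        candidates.add((level - 1, col // 2, row // 2))

    # Four child tiles, for zooming in.
    if level < max_level:
        for d_col in (0, 1):
            for d_row in (0, 1):
                candidates.add((level + 1, 2 * col + d_col, 2 * row + d_row))

    # The tile already on screen does not need prefetching.
    candidates.discard((level, col, row))
    return candidates

if __name__ == "__main__":
    print(sorted(prefetch_candidates(2, 1, 1, max_level=3)))
```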

While the Global Timelapse is a powerful educational tool showing the impact we have on our environment, Google's Earth Engine is also aimed at being a platform to provide researchers and policymakers worldwide with satellite data and other data sets they may not have had the resources to use before. "There are a number of partners who are currently using Earth Engine," Spector said, "primarily earth scientists." Google is also seeking other sources of geospatial data sets to add to Earth Engine to extend its usefulness.

Google is still exploring the potential applications of the underlying data. An Earth Engine API is also available in limited release, and Google is seeking researchers to make use of it.