This article is from the September–October 2018 issue, Volume 106, Number 5, page 317. DOI: 10.1511/2018.106.5.317

Recognizing an image used to be an overwhelming challenge for a computer. But since 2012, so-called deep learning algorithms have achieved impressive success rates at image recognition tasks, such as distinguishing images of cats and dogs. Such programs are also referred to as neural networks or artificial intelligence because of the methods used to achieve their results. Programmers do not feed the program the different characteristics of, say, cats and dogs, but rather show it lots of examples of each animal until the program can reliably tell the difference. What features the program uses to distinguish the creatures isn’t preprogrammed, and can even remain unclear to the programmer.
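The core idea can be illustrated with a minimal sketch. This is not the software described in the article; it uses synthetic stand-in "images" and the simplest possible learner, but it shows the key point: the weights that end up distinguishing the two classes are learned entirely from labeled examples, never written by hand.

```python
# Minimal sketch (not the authors' code): a classifier that learns to
# separate two classes purely from labeled examples, with no
# hand-written rules for what distinguishes them.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for "cat" and "dog" images: flattened 8x8 pixel
# grids whose average brightness differs slightly between the classes.
def make_images(n, brightness):
    return rng.normal(brightness, 1.0, size=(n, 64))

X = np.vstack([make_images(200, 0.0), make_images(200, 0.8)])
y = np.array([0] * 200 + [1] * 200)

# A logistic-regression "network" trained by gradient descent: the
# weights (the learned features) start at zero and are never set by hand.
w = np.zeros(64)
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
    grad_w = X.T @ (p - y) / len(y)         # cross-entropy gradient
    grad_b = np.mean(p - y)
    w -= 0.1 * grad_w
    b -= 0.1 * grad_b

pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
accuracy = np.mean(pred == y)
print(f"training accuracy: {accuracy:.2f}")
```

A real image-recognition network stacks many such layers and learns far subtler features, but the training loop follows the same pattern: compare predictions to labels, and nudge the weights to reduce the disagreement.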

Images are a central element in astronomy as well. Telescopes capture photons from sources in outer space, and these photons are transformed into images or spectra, which are then analyzed. One of the major providers of images over the past 25 years has been the Hubble Space Telescope, which has delivered the most distant and best resolved images of galaxies to date. Astronomers wish to decode the information available in these images to unveil the formation history of the observed galaxies, and deep learning techniques have recently been adapted as tools for this purpose.

Astronomers classify galaxies by their shapes, or morphologies. Some are egg-shaped; others, such as the Milky Way, are almost flat disks. In the early universe, galaxies seem to start out more “pickle shaped.” These shapes tell us relevant information about the formation history of the galaxies. Since the 1930s, morphologies have been determined by visual inspection, partly because no algorithm performed better than the human eye. But with the rapid growth of incoming astronomical data, this already time-consuming task will soon become impossible by visual inspection alone. In 2015, we were able to show that deep learning algorithms could achieve unprecedented accuracy in determining the morphologies of distant galaxies observed with the Hubble Space Telescope, reaching an agreement with human-based classifications of close to 95 percent. This approach solves a 100-year-old problem, allowing astronomers to better keep up with the flood of galaxy data.

There is a fascinating future, and uncountable things to learn, by transferring artificial intelligence techniques to astronomy.

But we felt that the ability of deep-learning algorithms to automatically extract features and find correlations among images could be used even more powerfully than just in classifying galaxies. Right now, when visually observing galaxies, astronomers look for features that give clues about the underlying physics. But the data are often full of “noise,” making subtle signs of physical processes difficult to measure. And the data are typically multidimensional, meaning that, for instance, galaxies have to be observed in multiple wavebands. So we wanted to find out whether deep learning algorithms are able to capture subtle correlations in complex data and link those to the physics of galaxy evolution.

To test it out, we used supercomputers to create simulations of galaxies that included all current knowledge of the physics of galaxy evolution. By using a simulation, we can ensure that we know the entire history of the galaxy being examined, whereas with observations of a real galaxy, we only have a snapshot of one moment in its lifetime. We provided the deep learning algorithm with single views of different stages of evolution of our simulated galaxies, and asked whether it could identify a given evolutionary stage.

We tested this idea with what we call “the blue nugget phase,” in which galaxies are particularly active in forming stars at their centers. We used 35 simulated galaxies and generated mock-observed images in the same format as those from the Hubble Space Telescope. We labeled every image according to its evolutionary stage from the simulation (before, during, or after the blue nugget phase).
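The shape of this training setup can be sketched in a few lines. The sketch below is an illustrative assumption, not the authors' pipeline: it replaces the mock Hubble images with tiny synthetic grids whose central brightness rises during a stand-in "blue nugget" starburst, and uses the simplest possible three-class learner in place of a deep network.

```python
# Hedged sketch (not the authors' pipeline): a three-class softmax
# classifier trained on labeled mock "images" of evolutionary stages.
# The stage labels and synthetic data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
STAGES = ["pre-blue-nugget", "blue-nugget", "post-blue-nugget"]

# Stand-ins for mock-observed galaxy images: 8x8 grids whose central
# pixels brighten during the starburst ("blue nugget") phase.
def mock_images(n, central_boost):
    imgs = rng.normal(0.0, 1.0, size=(n, 8, 8))
    imgs[:, 3:5, 3:5] += central_boost  # brighter core
    return imgs.reshape(n, 64)

# Before, during, and after the phase: core brightness 0 -> 4 -> 2.
X = np.vstack([mock_images(150, b) for b in (0.0, 4.0, 2.0)])
y = np.repeat([0, 1, 2], 150)

# One-layer softmax "network": weights learned from the labels alone.
W = np.zeros((64, 3))
for _ in range(400):
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    onehot = np.eye(3)[y]
    W -= 0.05 * X.T @ (p - onehot) / len(y)  # cross-entropy gradient

accuracy = np.mean((X @ W).argmax(axis=1) == y)
print(f"stage-classification accuracy: {accuracy:.2f}")
```

The deep networks used in the actual study learn much richer spatial features from realistic mock observations, but the principle is the same: each image comes with a stage label from the simulation, and the network adjusts its weights until its predicted stages agree with those labels.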

The deep neural networks were able to retrieve the galaxy phase with nearly 80 percent accuracy, even though it was very difficult for human observers to identify the phase just by looking at the images. This result implies that the neural networks were able to automatically find subtle traces of evolutionary phase in the data, not obvious to astronomers. Thus deep learning techniques are not only a faster way to do what we already knew how to do, but are also a powerful tool to help astronomers analyze the data and find hidden correlations.

Examples of galaxy evolution through three stages: "pickle-shaped" (green column), compaction phenomenon ("blue nugget" stage) with gas infall leading to a central starburst (blue column), and post-"blue nugget" stage, often with a star-forming disk. The top row comes from a high-resolution computer simulation; the middle row shows the same set of images as they would be observed by the Hubble Space Telescope; and the bottom row shows actual Hubble Space Telescope images of distant young galaxies classified into the three stages by a deep learning algorithm.

Simulations by Daniel Ceverino and Joel Primack; simulated images by Greg Snyder and Marc Huertas-Company; Hubble Space Telescope observations from the CANDELS survey.