On August 18, 2017, a new age in astronomy dawned, appropriately, with a tweet: “New LIGO. Source with optical counterpart. Blow your sox off!” One astronomer had jumped the gun, tweeting ahead of an official announcement by LIGO (the Laser Interferometer Gravitational-Wave Observatory). The observatory had detected an outburst of gravitational waves, or ripples in spacetime, and an orbiting gamma-ray telescope had simultaneously seen electromagnetic radiation emanating from the same region of space. The observations—which were traced back to a colliding pair of neutron stars 130 million light-years away—marked a pivotal moment for multimessenger astronomy, in which celestial events are studied using a wide range of wildly different telescopes and detectors.

The promise of multimessenger astronomy is immense: by observing not only in light but also in gravitational waves and elusive particles called neutrinos, all at once, researchers can gain unprecedented views of the inner workings of exploding stars, galactic nuclei and other exotic phenomena. But the challenges are great, too: as observatories get bigger and more sensitive and monitor ever larger volumes of space, multimessenger astronomy could drown in a deluge of data, making it harder for telescopes to respond in real time to unfolding astrophysical events. So astronomers are turning to machine learning—the type of technology that led to AlphaGo, the first machine to beat a professional human Go player.

Machine learning could boost multimessenger astronomy by automating crucial early phases of discovery, winnowing potential signals from torrents of noise-filled data so that astronomers can focus on the most tantalizing targets. But this technique promises more. Astrophysicists are also trying it out to weigh galaxy clusters and to create high-resolution simulations needed to study cosmic evolution. And despite concerns about just how machine-learning algorithms work, the stupendous improvements they offer for speed and efficiency are unquestionable.

“This is like a tsunami,” says Eliu Huerta, an astrophysicist and artificial intelligence researcher at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. “People realize that for the big data we have [coming] in the future, we can no longer rely on what we have been doing in the past.”

The Gravitational-Wave Hotline

The past, in the case of LIGO and its European counterpart Virgo, does not extend back very far. It was only in February 2016 that those observatories announced the first-ever detection of gravitational waves, produced by the merger of two black holes. Now in their third observing run, which began in April 2019, advanced versions of LIGO and Virgo have begun sending out public alerts about new potential gravitational-wave sources as they are detected, all to better support multimessenger observations.

This practice may seem routine, but it belies the enormous effort required for each and every detection. For example, the signals being collected by LIGO must be matched by supercomputers against hundreds of thousands of templates of possible gravitational-wave signatures. Promising signals trigger an internal alert; those that survive additional scrutiny trigger a public alert so that the global astronomy community can look for electromagnetic and neutrino counterparts.

Template matching is so computationally intensive that, for gravitational waves produced by mergers, astronomers use only four attributes of the colliding cosmic objects (the masses of both and the magnitudes of their spins) to make detections in real time. From there, LIGO scientists spend hours, days or even weeks performing more processing offline to further refine the understanding of a signal’s sources, a task called parameter estimation.
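The idea behind template matching can be sketched in a few lines of Python. This is a toy illustration, not LIGO's pipeline: the `chirp` waveform, the six-entry template bank and the noise level are all invented for the example, whereas real searches use physically modeled waveforms, far larger banks and filters weighted by the detector's noise spectrum. The sketch slides each unit-length template along simulated data and reports the template and time offset with the highest correlation.

```python
import math
import random

def chirp(f0, rate, n=64):
    """Toy chirp: a sinusoid whose frequency rises linearly with time."""
    return [math.sin(2 * math.pi * (f0 + rate * t) * t / n) for t in range(n)]

def unit(x):
    """Scale a sequence to unit length so scores are comparable across templates."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]

def matched_filter(data, template):
    """Slide the template along the data; return (best score, best offset)."""
    tpl = unit(template)
    m = len(tpl)
    scores = (
        (sum(d * t for d, t in zip(data[k:k + m], tpl)), k)
        for k in range(len(data) - m + 1)
    )
    return max(scores)

def search(data, bank):
    """Try every template in the bank; return (score, offset, params) of the best."""
    return max(matched_filter(data, tpl) + (params,) for params, tpl in bank.items())

# Hypothetical bank: 3 starting frequencies x 2 chirp rates = 6 templates.
bank = {(f0, rate): chirp(f0, rate) for f0 in (2, 4, 6) for rate in (0.0, 0.05)}

# Simulated data: Gaussian noise with one template injected at sample 100.
random.seed(0)
data = [random.gauss(0.0, 0.2) for _ in range(256)]
true_params, true_offset = (6, 0.05), 100
for i, v in enumerate(bank[true_params]):
    data[true_offset + i] += v

score, offset, params = search(data, bank)
print(f"best template {params} at offset {offset} (score {score:.2f})")
```

The cost grows with the number of templates times the number of offsets, which is one way to see why adding parameters to a real-time search quickly becomes expensive.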

In work published in 2018, Huerta and his research group at NCSA turned to machine learning to make that labyrinthine process faster and more computationally efficient. Specifically, Huerta and his then graduate student Daniel George pioneered the use of so-called convolutional neural networks (CNNs), a type of deep-learning algorithm, to detect and decipher gravitational-wave signals in real time. Deep-learning algorithms use networks made of layers, each composed of nodes modeled on the activity of neurons in the human brain. Roughly speaking, training or teaching a deep-learning system involves feeding it data that are already categorized (say, images of galaxies obscured by lots of noise) and getting the network to identify the patterns in the data correctly. The training data set can involve tens of thousands, if not millions, of instances of previously classified data. The network learns by tuning the connections between its neuronlike nodes such that it can eventually make sense of uncategorized data.

After their initial success with CNNs, Huerta and George, along with Huerta’s graduate student Hongyu Shen, scaled up this effort, designing deep-learning algorithms that were trained on supercomputers using millions of simulated signatures of gravitational waves mixed in with noise derived from previous observing runs of Advanced LIGO—an upgrade to LIGO completed in 2015. These neural networks learned to find signals embedded in Advanced LIGO noise.
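The training recipe described above (simulate signals, bury them in noise, label each example and let the network adjust its connections) can be shown at toy scale. The sketch below is not a CNN and is not the NCSA group's code: it trains a single artificial node, equivalent to logistic regression, and the sinusoidal `BURST`, the noise level and the learning rate are assumptions made up for the illustration.

```python
import math
import random

random.seed(1)
N = 32                                   # samples per example
BURST = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]  # toy "signal"

def make_example(has_signal, sigma=0.5):
    """Return (data, label): Gaussian noise, plus the burst when has_signal is True."""
    x = [random.gauss(0.0, sigma) for _ in range(N)]
    if has_signal:
        x = [xi + si for xi, si in zip(x, BURST)]
    return x, 1.0 if has_signal else 0.0

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, bias, x):
    """Node output: weighted sum of the inputs squashed into a probability."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train(examples, epochs=20, lr=0.05):
    """Stochastic gradient descent on the cross-entropy loss."""
    weights, bias = [0.0] * N, 0.0
    for _ in range(epochs):
        random.shuffle(examples)
        for x, y in examples:
            err = predict(weights, bias, x) - y   # loss gradient w.r.t. the weighted sum
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

# Labeled training data, then a separate set the node has never seen.
train_set = [make_example(i % 2 == 0) for i in range(400)]
test_set = [make_example(i % 2 == 0) for i in range(200)]
weights, bias = train(train_set)
correct = sum((predict(weights, bias, x) > 0.5) == (y == 1.0) for x, y in test_set)
accuracy = correct / len(test_set)
print(f"accuracy on unseen examples: {accuracy:.2f}")
```

Training loops over the data many times, but classifying a new example afterward is a single weighted sum. That asymmetry is the same one that lets a trained network run with a small computational footprint.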

There are crucial differences between this approach and LIGO’s standard methods. Most importantly, deep-learning algorithms can do both detection and parameter estimation in real time. Additionally, they can easily handle more parameters on the fly than the four that LIGO currently manages. For instance, Adam Rebei, a high school student in Huerta’s group, showed in a recent study that deep learning can identify the complex gravitational-wave signals produced by the merger of black holes in eccentric orbits—something LIGO’s traditional algorithms cannot do in real time. “For each black hole merger signal that LIGO has detected that has been reported in publications, we can reconstruct all these parameters in two milliseconds,” Huerta says. In contrast, the traditional algorithms can take days to accomplish the same task.

Because of its ability to search over a larger set of parameters, a deep-learning system can potentially spot signatures that LIGO might otherwise miss. And while training requires supercomputers, once trained, the neural network is slim and supple, with an extremely small computational footprint. “You can put it on the phone and process LIGO data in real time,” Huerta says.

Artificial Eyes on the Sky

Huerta is now working with Erik Katsavounidis, a member of the LIGO collaboration at the Massachusetts Institute of Technology, to test deep-learning algorithms in the real world. “The goal is to have some of these algorithms deployed throughout the third and fourth observing runs of the LIGO and Virgo detectors,” Huerta says. “It’ll be a good social experiment to see how we react to, for example, neural nets finding complex signals that are not observed by other algorithms.”

If successful, such a deep-learning system will be highly efficient at generating alerts for other telescopes. The most ambitious of these telescopes, still under construction atop Cerro Pachón in Chile, is the Large Synoptic Survey Telescope (LSST). When complete, the 8.4-meter LSST will be able to observe 10 square degrees of the sky at once (equivalent in size to 40 full moons), producing 15 to 20 terabytes of raw data each night—the same amount of data generated by the Sloan Digital Sky Survey over the course of a decade. Within that massive trove, LSST’s astronomers will seek out supernovae, colliding stars, and other transient or variable phenomena—sources that momentarily brighten in the electromagnetic spectrum and then fade away over hours, days or weeks. The scientific value of any given transient is typically proportional to how rapidly and thoroughly follow-up observations occur.

“We have to be able to sort through a million to 10 million alerts of places in the sky changing every night and decide in, effectively, real time what is worth using precious resources for follow-up,” says Joshua Bloom, an astrophysicist and machine-learning expert at the University of California, Berkeley. “Machine learning will have a massive role in that.” Such approaches are already paying dividends for precursors to LSST, including the Zwicky Transient Facility (ZTF), which uses a camera with a field of view of 47 square degrees, installed on a 1.2-meter telescope at the Palomar Observatory in California. In a preprint paper, Dmitry Duev of the California Institute of Technology and his colleagues recently reported that a system called DeepStreaks is already helping astronomers track asteroids and other fast-moving near-Earth objects. “We can improve the efficiency of detecting streaking asteroids by a couple orders of magnitude,” the researchers wrote.
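The triage problem Bloom describes, choosing a handful of follow-up targets from a nightly flood of alerts, can be sketched as a streaming top-k selection. In this illustration the scores are random numbers standing in for the output of some machine-learning classifier, and the function name `triage` and the follow-up budget of 20 are hypothetical. It is not drawn from LSST or ZTF software.

```python
import heapq
import random

def triage(alert_stream, budget):
    """Return the `budget` highest-scoring (score, alert_id) pairs, best first."""
    heap = []                        # min-heap: the weakest kept alert sits at heap[0]
    for alert_id, score in alert_stream:
        if len(heap) < budget:
            heapq.heappush(heap, (score, alert_id))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, alert_id))   # evict the weakest kept alert
    return sorted(heap, reverse=True)

# Simulated night: 100,000 alerts, each with a made-up classifier score in [0, 1).
random.seed(2)
alerts = [(alert_id, random.random()) for alert_id in range(100_000)]
follow_up = triage(iter(alerts), budget=20)
print(follow_up[:3])
```

Because the heap never holds more than `budget` entries, memory use stays constant however many alerts arrive, and each alert is handled as it streams in rather than after the night is over.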

Similar techniques can be used to search for other transient sources in ZTF data. “Machine learning is incredibly important for the success of the project,” Bloom says.

The other important component of multimessenger astronomy is the detection of neutrinos, which are emitted alongside electromagnetic radiation from astrophysical objects such as blazars. (Blazars are quasars, highly luminous objects powered by giant black holes at the centers of distant galaxies, whose jets of high-energy particles and radiation are pointed toward Earth.)

On September 22, 2017, IceCube, a neutrino detector comprising 5,160 sensors embedded within one cubic kilometer of ice below the South Pole, detected neutrinos from a blazar. The sensors look for streaks of light made by particles called muons, which are created when neutrinos hit the ice. But the handful of neutrino-generated muons can be outnumbered by the millions of muons created by cosmic rays encountering Earth’s atmosphere. IceCube has to essentially sift through this morass of muon tracks to identify those from neutrinos—a task tailor-made for machine learning.

In a preprint paper last September, Nicholas Choma of New York University and his colleagues reported the development of a special type of deep-learning algorithm called a graph neural network, whose connections and architecture take advantage of the spatial geometry of the sensors in the ice and the fact that only a few sensors see the light from any given muon track. Using simulated data, which had a mix of background noise and signals, the researchers showed that their network detected more than six times as many events as the non-machine-learning approach currently being used by IceCube.
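A rough sketch shows how a graph can encode detector geometry. It is not the architecture from the preprint: the Gaussian distance kernel, the `sigma` length scale, the sensor coordinates and the arrival times are all assumptions chosen for illustration, and a real graph neural network would stack several such steps with learned weights and nonlinearities. Only the sensors that registered light become nodes, and nearby nodes are connected more strongly than distant ones.

```python
import math

def build_graph(hits, sigma=50.0):
    """Row-normalized Gaussian-kernel adjacency over the sensors that saw light."""
    adjacency = []
    for a in hits:
        row = [
            math.exp(-math.dist(a["pos"], b["pos"]) ** 2 / (2 * sigma ** 2))
            for b in hits
        ]
        total = sum(row)             # never zero: each node has weight 1 to itself
        adjacency.append([w / total for w in row])
    return adjacency

def propagate(adjacency, features):
    """One message-passing step: each node takes a weighted average over the graph."""
    return [sum(w * f for w, f in zip(row, features)) for row in adjacency]

# Hypothetical event: only 5 sensors registered light (positions in meters).
hits = [
    {"pos": (0.0, 0.0, -100.0), "time_ns": 10.0},
    {"pos": (0.0, 0.0, -117.0), "time_ns": 65.0},
    {"pos": (0.0, 0.0, -134.0), "time_ns": 120.0},
    {"pos": (125.0, 0.0, -151.0), "time_ns": 540.0},
    {"pos": (125.0, 0.0, -168.0), "time_ns": 600.0},
]

adjacency = build_graph(hits)
times = [h["time_ns"] for h in hits]
smoothed = propagate(adjacency, times)
print([round(t, 1) for t in smoothed])
```

The size of the graph tracks the number of hit sensors rather than the size of the full detector, which is why this kind of representation suits events in which most sensors see nothing.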

Huerta is impressed by these achievements. “If we are developing or constructing these next-generation instruments to study the universe in high fidelity, we also better design better algorithms to process this data,” he says.

Einstein in a Box

As important as these advances are, deep-learning algorithms come with a major concern. They are essentially “black boxes,” with the specifics of their operations obscured by the interconnectivity of their layered components and the thousands to millions of tunable parameters required to make them function. In short, even experts looking in from the outside are hard-pressed to understand exactly how any given deep-learning algorithm arrives at a decision. “That’s almost antithetical to the way that physicists like to think about the world, which is that there are—and there ought to be—very simple mathematical functions that describe the way that the world works,” Bloom says.

To get a handle on interpreting what machine-learning algorithms are doing, Michelle Ntampaka, an astrophysicist and machine-learning researcher at Harvard University, and her colleagues developed a CNN to analyze x-ray images of galaxy clusters. They trained and tested the network using 7,896 simulated x-ray images of 329 massive clusters, designed to resemble those generated by the Chandra X-ray Observatory. The CNN became just as good as traditional techniques at inferring the mass of a cluster. But how was that neural net doing its job?

To find out, Ntampaka and her team used a technique pioneered by Google’s DeepDream project, which enables humans to visualize what a deep-learning network is seeing. Ntampaka’s team found that the CNN had learned to ignore photons coming from the core of the clusters and was paying more attention to photons from their periphery to make its predictions. Astronomers had empirically arrived at this exact solution about a decade earlier. “It’s exciting that it learned to excise the cores, because this is evidence that we can now use these neural networks to point back to the underlying physics,” Ntampaka says.

For Ntampaka, these results suggest that machine-learning systems are not entirely immune to interpretation. “It’s a misunderstanding within the community that they only can be black boxes,” she says. “I think interpretability is on the horizon. It’s coming. We are starting to be able to do it now.” But she also acknowledges that had her team not already known the underlying physics connecting the x-ray emissions from galaxy clusters to their mass, it might not have figured out that the neural network was excising the cores from its analysis.

The question of interpretability has come to the fore in work by astrophysicist Shirley Ho of the Flatiron Institute in New York City and her colleagues. The researchers built a deep-learning algorithm, which they call the Deep Density Displacement Model, or D3M (pronounced “dee cube em”), to efficiently create high-resolution simulations of our universe. When telescopes collect data about the large-scale structure of the universe, those data are compared against our best simulations, which are themselves based on theories such as general relativity and quantum mechanics. The best matches can help cosmologists understand the physics governing the evolution of the universe. High-resolution simulations are extremely expensive, however: they can take millions or tens of millions of hours of computing time to run. So cosmologists often resort to speedier low-resolution simulations, which make simplifying assumptions and are therefore less accurate.

Ho and her colleagues first generated 10,000 pairs of simulations, each pair consisting of a low-resolution, or low-res, simulation of the evolution of a volume of space containing about 32,000 galaxies and a high-resolution, or high-res, simulation of the same volume. They then trained D3M one pair at a time, giving it a low-res simulation—which takes only milliseconds to generate—as an input and making it output the high-res counterpart. Once D3M had learned to do so, it produced each high-res simulation for any given low-res simulation in about 30 milliseconds. These simulations were as accurate as those created using standard techniques, which require many orders of magnitude more time.

The staggering speedup aside, the neural net seemed to have gained a deeper understanding of the data than expected. The training data set was generated using only one set of values for six cosmological parameters (such as the amount of dark matter thought to exist in the universe). But when D3M was given a low-res simulation of a universe with an amount of dark matter that was significantly different from what physicists think is present in our universe, it correctly produced a high-res simulation with the new dark matter content, despite never being explicitly taught to do so.

Ho and her team are somewhat at a loss to explain exactly why D3M is successful at extrapolating high-res simulations from low-res ones despite the differing amounts of dark matter. Changing the amount of dark matter changes the forces influencing galaxies, and yet the algorithm works. Maybe, Ho says, it figured out the extrapolation for changes in only one parameter and could fail if multiple parameters are changed at once. Her team is currently testing this hypothesis.

The other “real grand possibility,” Ho adds, is that D3M has stumbled on a deeper understanding of the laws of physics. “Maybe the universe is really simple, and, like humans, the deep-learning algorithm has figured out the physical rules,” she says. “That’s like saying D3M has figured out general relativity without being Einstein. That could be one interpretation. I cannot tell you what is correct. At this point, it’s nearly philosophical until we have more proof.”

Meanwhile the team is working hard to get D3M to fail when extrapolating to different values of the cosmological parameters. “If D3M fails in certain ways in extrapolation, maybe it can give us hints about why it works in the first place,” Ho says.

Unfortunately, the use of such advanced techniques in astronomy, astrophysics and cosmology is fomenting a divide. “It’s creating a little bit of a have-and-have-nots [situation] in our community,” Bloom says. “There are those that are becoming more fluent and capable in the language of moving data and doing inference on data and those that are not.”

As the “haves” continue to develop better and better machine-learning systems, there is the tantalizing prospect that, in the future, these algorithms will learn directly from data produced by telescopes and then make inferences, without the need for training using simulated or precategorized data—somewhat like the successor to the human-conquering AlphaGo. While AlphaGo had to be taught using human-generated data, the newer version, AlphaGo Zero, taught itself how to play Go without any data from human games. If astrophysics goes the same route, the black box may become blacker.