Astronomers are enlisting the help of machines to sort through thousands of stars in our galaxy and learn their sizes, compositions and other basic traits.

The research is part of the growing field of machine learning, in which computers learn from large data sets, finding patterns that humans might not otherwise see. Machine learning is in everything from media-streaming services that predict what you want to watch, to the post office, where computers automatically read handwritten addresses and direct mail to the correct zip codes.

Now astronomers are turning to machines to help them identify basic properties of stars based on sky survey images. Normally, these kinds of details require a spectrum, which is a detailed sifting of the starlight into different wavelengths. But with machine learning, computer algorithms can quickly flip through available stacks of images, identifying patterns that reveal a star's properties. The technique has the potential to gather information on billions of stars in a relatively short time and with less expense.

"It's like video-streaming services not only predicting what you would like to watch in the future, but also your current age, based on your viewing preferences," said Adam Miller of NASA's Jet Propulsion Laboratory in Pasadena, California, lead author of a new report on the findings appearing in the Astrophysical Journal. "We are predicting fundamental properties of the stars."

Miller presented the results today at the annual American Astronomical Society meeting in Seattle.

Machine learning has been applied to the cosmos before; what makes this latest effort unique is that it is the first to predict specific traits of stars, such as size and metal content, using images of those stars taken over time. These traits are essential to learning about when a star was born, and how it has changed since that time.

"With more information about the different kinds of stars in our Milky Way galaxy, we can better map the galaxy's structure and history," said Miller.

Every night, telescopes around the world obtain thousands of images of the sky. The flood of new data is only expected to rise with upcoming wide-field surveys like the Large Synoptic Survey Telescope (LSST), a National Science Foundation and Department of Energy project that will be based in Chile. That survey will image the entire visible sky every few nights, gathering data on billions of stars and how some of those stars change in brightness over time. NASA's Kepler mission has already captured the same kind of time-varying data on hundreds of thousands of stars.

Humans alone can't easily make sense of all this data. That is where machines, or in this case, computers using specialized algorithms, can help out.

But before the machines can learn, they first need a "training period." Miller and his colleagues started with 9,000 stars as their training set. They obtained spectra for these stars, which revealed several of their basic properties: sizes, temperatures and the amount of heavy elements, such as iron. The varying brightness of the stars had also been recorded by the Sloan Digital Sky Survey, producing plots called light curves. Once both sets of data were fed into the computer, it could make associations between the star properties and the light curves.

Once the training phase was over, the computer was able to make predictions on its own about other stars by analyzing only their light curves.
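The train-then-predict workflow described above can be illustrated with a small, hedged sketch in Python. Everything in it is an assumption made for illustration: the article does not say which algorithm the team used, so a simple nearest-neighbor average stands in for it; the stars are synthetic; the link between temperature and variability is invented; and the names `light_curve_features`, `make_synthetic_star` and `predict_knn` are hypothetical.

```python
import math
import random


def light_curve_features(mags):
    """Reduce a light curve (a list of magnitudes) to two summary numbers."""
    mean = sum(mags) / len(mags)
    scatter = math.sqrt(sum((m - mean) ** 2 for m in mags) / len(mags))
    amplitude = max(mags) - min(mags)
    return (scatter, amplitude)


def make_synthetic_star(rng):
    """Invent a star whose variability grows with temperature.

    ASSUMPTION: this relation is made up for the demo; it is not real
    astrophysics and does not come from the study.
    """
    temperature = rng.uniform(4000.0, 8000.0)  # stands in for a spectrum-derived label
    amp = 0.1 + 0.9 * (temperature - 4000.0) / 4000.0
    mags = [
        15.0 + amp * math.sin(2 * math.pi * 0.1 * i) + rng.gauss(0.0, 0.01)
        for i in range(50)
    ]
    return mags, temperature


def predict_knn(train_features, train_labels, query, k=5):
    """Average the labels of the k training stars with the most similar features."""
    by_distance = sorted(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query),
    )
    nearest = by_distance[:k]
    return sum(train_labels[i] for i in nearest) / k


rng = random.Random(0)

# Training phase: stars with both a light curve and a known property.
training = [make_synthetic_star(rng) for _ in range(300)]
train_features = [light_curve_features(mags) for mags, _ in training]
train_labels = [temperature for _, temperature in training]

# Prediction phase: new stars, with only their light curves given to the model.
test_stars = [make_synthetic_star(rng) for _ in range(50)]
errors = [
    abs(predict_knn(train_features, train_labels, light_curve_features(mags)) - true_t)
    for mags, true_t in test_stars
]
mean_abs_error = sum(errors) / len(errors)
print(f"Mean absolute temperature error: {mean_abs_error:.0f} K")
```

In this toy setup the "spectra" are replaced by the known temperatures of the synthetic stars, and the prediction step never sees them for the test stars. Real light curves are unevenly sampled and far noisier, so a practical pipeline would need richer features and a more capable model.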

"We can discover and classify new types of stars without the need for spectra, which are expensive and time-consuming to obtain," said Miller.

The technique essentially works in the same way as email spam filters. The spam filters are programmed to identify key words associated with junk mail, and then remove the unwanted emails containing those words. With time, a user continues to "teach" the filtering program more key words, and the program becomes better at filtering spam. The machine learning program used by Miller and collaborators likewise becomes better at accurately predicting properties of the stars with additional training from the astronomers.

The team's next goal is to get their computers smart enough to handle the more than 50 million variable stars that the LSST project will observe.

"This is an exciting time to be applying advanced algorithms to astronomy," said Miller. "Machine learning allows us to mine for rare and obscure gems within the deep data sets that astronomers are only now beginning to acquire."

The California Institute of Technology manages JPL for NASA.