The Argonne Leadership Computing Facility (ALCF), a U.S. Department of Energy (DOE) Office of Science User Facility, has selected 10 data science and machine learning projects for its Aurora Early Science Program (ESP). Set to be the nation’s first exascale system upon its expected 2021 arrival, Aurora will be capable of performing a quintillion (10^18) calculations per second.

The Aurora ESP, which commenced with 10 simulation-based projects in 2017, is designed to prepare key applications, libraries, and infrastructure for the architecture and scale of the exascale supercomputer.

The 10 new data and learning projects were selected to support the ALCF’s new paradigm for scientific computing, which expands on traditional simulation-based research to include data science and machine learning approaches.

The ESP projects originate from universities and national laboratories across the country and span a wide range of disciplines that cover key scientific areas and numerical methods. The teams will receive hands-on assistance to port and optimize their applications for the new architecture, using systems available today and early Aurora hardware when it becomes available.

“The Aurora ESP continues the ALCF tradition of delivering science on day one while helping to lay the path for hundreds of future users,” says Tim Williams, Deputy Director of Argonne’s Computational Science Division. “It’s an exciting opportunity for researchers, who get to be among the first people in the world to run code on an exascale system.”

Previous Early Science programs helped usher in earlier ALCF supercomputers, including the Intel-Cray system Theta and the IBM Blue Gene/Q system Mira, both of which continue to serve the scientific research community today.

The ALCF will host numerous training events to help the Aurora ESP project teams prepare their codes for the coming system, with assistance from Intel and Cray. Each Early Science team is also paired with a dedicated postdoctoral researcher from the ALCF.

Aurora ESP Data and Learning Projects

Exascale Computational Catalysis

David Bross, Argonne National Laboratory

Chemical transformation technologies are present in virtually every sector, and their continued advancement requires a molecular-level understanding of underlying chemical processes. This project will facilitate and accelerate the quantitative description of crucial gas-phase and coupled heterogeneous catalyst/gas-phase chemical systems through the development of data-driven tools designed to revolutionize predictive catalysis and address DOE grand challenges.

Machine Learning for Lattice Quantum Chromodynamics

William Detmold, Massachusetts Institute of Technology

This project will determine possible interactions between nuclei and a large class of dark matter candidate particles. By coupling advanced machine learning and state-of-the-art physics simulations, it will provide critical input for experimental searches aiming to unravel the mysteries of dark matter while simultaneously giving insight into fundamental particle physics.

Enabling Connectomics at Exascale to Facilitate Discoveries in Neuroscience

Nicola Ferrier, Argonne National Laboratory

This project will develop a computational pipeline for neuroscience that will extract brain-image-derived mappings of neurons and their connections from electron microscope datasets too large for today’s most powerful systems. Ultimately the pipeline will be used to analyze an entire cubic centimeter of electron microscopy data.

Dark Sky Mining

Salman Habib, Argonne National Laboratory

This project will connect some of the world’s largest and most detailed extreme-scale cosmological simulations with large-scale data obtained from the Large Synoptic Survey Telescope, which constitute the most comprehensive observations of the visible sky. By implementing cutting-edge data-intensive and machine learning techniques, it will usher in a new era of cosmological inference targeted at scientific breakthroughs.

Data Analytics and Machine Learning for Exascale Computational Fluid Dynamics

Ken Jansen, University of Colorado Boulder

This project will develop data analytics and machine learning techniques to greatly enhance the value of flow simulations with the extraction of meaningful dynamics information. A hierarchy of turbulence models will be applied to a series of increasingly complex flows before culminating in the first flight-scale design optimization of active flow control on an aircraft’s vertical tail.

Many-Body Perturbation Theory Meets Machine Learning to Discover Singlet Fission Materials

Noa Marom, Carnegie Mellon University

Supercomputers have been guiding materials discovery for the creation of more efficient organic solar cells. By combining quantum-mechanical simulations with machine learning and data science, this project will harness exascale power to revolutionize the process of photovoltaic design and advance physical understanding of singlet fission, the phenomenon whereby one photogenerated singlet exciton is converted into two triplet excitons, increasing the electricity produced.

Simulating and Learning in the ATLAS Detector at the Exascale

James Proudfoot, Argonne National Laboratory

The ATLAS experiment at the Large Hadron Collider measures particles produced in proton-proton collisions as if it were an extraordinarily rapid camera. These measurements led to the discovery of the Higgs boson, but hundreds of petabytes of data remain unexamined, and the experiment’s computational needs will grow by an order of magnitude or more over the next decade. This project deploys necessary workflows and updates algorithms for exascale machines, preparing Aurora for effective use in the search for new physics.

Extreme-Scale In-Situ Visualization and Analysis of Fluid-Structure-Interaction Simulations

Amanda Randles, Duke University and Oak Ridge National Laboratory

This project advances the use of data science to drive analysis of extreme-scale fluid-structure-interaction simulations, with the goal of better understanding the role biological parameters play in determining tumor cell trajectory in the circulatory system. A cellular-level model of systemic-scale flow represents a critical step toward elucidating the mechanisms driving cancer metastasis.

Virtual Drug Response Prediction

Rick Stevens, Argonne National Laboratory

Using data frames too large for conventional systems and a deep learning workflow designed to provide new approaches to personalized cancer medicine, this project enables billions of virtual drugs to be screened singly and in numerous combinations, while predicting their effects on tumor cells. The workflow is built from the CANDLE (CANcer Distributed Learning Environment) framework to optimize model hyperparameters and perform billions of inferences to quantify model uncertainty and ultimately deliver results to be tested in pre-clinical experiments.

Accelerated Deep Learning Discovery in Fusion Energy Science

William Tang, Princeton Plasma Physics Laboratory

Machine learning and artificial intelligence can demonstrably accelerate scientific progress in predictive modeling for grand challenge areas such as the quest for clean energy via fusion power. This project seeks to expand modern convolutional and recurrent neural network software to carry out optimized hyperparameter tuning on exascale supercomputers, making strides toward validated prediction and associated mitigation of large-scale disruptions in burning plasmas in devices such as ITER.

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation's first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America's scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy's Office of Science.

The U.S. Department of Energy's Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit the Office of Science website.