As chemistry has gotten more advanced and the chemical reactions more complex, it's no longer always practical for researchers to sit down at a lab bench and start mixing chemicals to see what they can come up with.

Tom Miller, a professor of chemistry at Caltech; Matt Welborn, a postdoctoral scholar at the Resnick Sustainability Institute; and Lixue Cheng, a chemistry and chemical engineering graduate student, have developed a new tool that uses machine learning to predict chemical reactions long before reagents hit the test tube.

Theirs isn't the first computational tool developed to make chemistry predictions, but it does improve on what is already in use, and that matters because these sorts of predictions are having a big impact in the field.

"They allow us to connect underlying microscopic properties to the things we care about in the macroscopic world," Miller says. "These predictions allow us to know ahead of time if one catalyst will perform better than another one and to identify new drug candidates."

These predictions also require a lot of computational heavy lifting. Miller points out that a substantial fraction of all supercomputer time on Earth is dedicated to chemistry predictions, so increases in efficiency can save researchers a lot of time and expense.

The work of the Caltech researchers essentially provides a change of focus for prediction software. Previous tools were built around one of three computational modeling methods: density functional theory (DFT), coupled cluster theory (CC), and second-order Møller–Plesset perturbation theory (MP2). Those theories represent three different approaches to approximating a solution to the Schrödinger equation, which governs the behavior of systems in which quantum mechanics plays a big role.
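For readers who want to see the equation in question, its time-independent form is commonly written as shown here. The symbols are standard textbook notation rather than anything quoted from the researchers: the Hamiltonian operator $\hat{H}$ encodes the energies of the electrons and nuclei, $\Psi$ is the wavefunction, and $E$ is the total energy.

```latex
\hat{H}\,\Psi = E\,\Psi
```

Exact solutions are out of reach for all but the smallest systems, which is why approximate methods such as DFT, CC, and MP2 exist.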

Each of those theories has its own advantages and disadvantages. DFT is something of a quick-and-dirty approach that gives researchers answers more quickly but with less accuracy. CC and MP2 are much more accurate but take longer to calculate and use a lot more computing power.
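The trade-off between speed and accuracy can be made concrete with a rough back-of-the-envelope sketch. The exponents below are commonly quoted textbook formal scalings with system size, not figures from the researchers; CCSD and CCSD(T) stand in here for coupled cluster, and real software often does better or worse depending on implementation.

```python
# Commonly quoted formal scaling exponents with system size N.
# These are textbook assumptions, not numbers from the article;
# actual cost depends on the implementation and approximations used.
FORMAL_EXPONENT = {"DFT": 3, "MP2": 5, "CCSD": 6, "CCSD(T)": 7}


def relative_cost(method, size_factor):
    """Factor by which cost grows when the system grows by size_factor."""
    return size_factor ** FORMAL_EXPONENT[method]


for method in FORMAL_EXPONENT:
    growth = relative_cost(method, 2)
    print(f"{method:8s} cost grows roughly {growth}x when the system doubles")
```

Under those assumptions, doubling the size of a molecule multiplies a DFT calculation's cost by about 8 but a CCSD(T) calculation's cost by about 128.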

Miller, Cheng, and Welborn's tool threads the needle, giving them predictions that are more accurate than those created with DFT, in less time than CC and MP2 require. They do this by focusing their machine-learning algorithm on the properties of molecular orbitals, which describe the cloud of electrons around a molecule. Existing tools, in contrast, focus on the types of atoms in a molecule or the angles at which the atoms are bonded together.
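The general idea of learning from orbital properties rather than atom types can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the features are random numbers standing in for descriptors of orbital pairs, the "energies" are invented, and a simple ridge regression stands in for whatever model the researchers actually use. It is not their method or their code.

```python
import numpy as np

# Hypothetical setup: each row describes one pair of molecular orbitals
# with a few numeric features. All data here is synthetic.
rng = np.random.default_rng(0)
TRUE_WEIGHTS = np.array([-0.020, 0.010, -0.005, 0.003])  # invented for the toy


def synthetic_pairs(n_pairs):
    """Return made-up orbital-pair features and made-up pair energies."""
    features = rng.normal(size=(n_pairs, TRUE_WEIGHTS.size))
    pair_energies = features @ TRUE_WEIGHTS
    return features, pair_energies


def fit_ridge(features, targets, reg=1e-8):
    """Closed-form ridge regression: solve (X^T X + reg*I) w = X^T y."""
    n_features = features.shape[1]
    gram = features.T @ features + reg * np.eye(n_features)
    return np.linalg.solve(gram, features.T @ targets)


def predict_total(weights, features):
    """Sum the predicted per-pair contributions into one molecular value."""
    return float(np.sum(features @ weights))


# "Train" on orbital pairs from known systems, then predict a new one.
train_X, train_y = synthetic_pairs(200)
weights = fit_ridge(train_X, train_y)

new_X, new_y = synthetic_pairs(30)
predicted_total = predict_total(weights, new_X)
reference_total = float(new_y.sum())
print(f"predicted {predicted_total:.6f}  reference {reference_total:.6f}")
```

Because the model only ever sees orbital-level features, nothing in it refers to which elements are present, which is one way to picture why an orbital-based approach might carry over from one molecule to another.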

So far, their approach is showing a lot of promise, though it's only been used to make predictions about relatively simple systems. The true test, Miller says, is to see how it will perform on more complicated chemical problems. Still, he's optimistic on the basis of the preliminary results.

"If we can get this to work, it will be a big deal for the way in which computers are used to study chemical problems," he says. "We're very excited about it."

The work is described in a paper titled "Transferability in Machine Learning for Electronic Structure via the Molecular Orbital Basis" appearing in the Journal of Chemical Theory and Computation. Support was supplied by the Air Force Office of Scientific Research, a Resnick Sustainability Institute Postdoctoral Fellowship, and a Camille Dreyfus Teacher-Scholar Award.