A new breed of scientist, with brains of silicon

EMERYVILLE, CALIFORNIA—If this is the biology laboratory of the future, it doesn’t look so different from today’s. Scientists in white lab coats walk by with boxes of frozen tubes. The chemicals on the shelves—bottles of pure alcohol, bins of sugar, protein, and salts—are standard issue for growing microbes and manipulating their genes. You don’t even notice the robots until you hear them: They sound like crickets singing to each other amid the low roar of fans.

The robots work for Zymergen, a biotechnology company that moved into this former electronics factory on the eastern shore of California’s San Francisco Bay in 2014. They spend their days carrying out experiments on microbes, searching for ways to increase the production of useful chemicals. Here’s one called Echo. Nestled within a blocky jumble of equipment, a robotic arm grabs a plastic block dimpled with hundreds of tiny wells carrying liquid. A laser scans a barcode on the block’s side before Echo loads it into a tray. What happens next is too subtle for the human eye to perceive.

“This isn’t a replica of how I would do pipetting with my hand,” says one of the company’s co-founders, Jed Dean, a molecular biologist and vice president of operations and engineering. “It’s an entirely different way of doing it.” Instead of using a pipette to suck up and squirt microliters of liquid into each well—a tidal wave of volume on the cellular scale—the robot never touches it. Instead, 500 times per second, a pulse of sound waves causes the liquid itself to ripple and launch a droplet a thousand times smaller than one a human can transfer.

Yet none of that is the really futuristic part. Big bio labs have used robots and barcodes for years. Even the liquid-moving technology—called acoustic droplet ejection—has existed for decades. The real giveaway comes when I ask Dean what experiment this robot is working on right now. “I have no idea,” he says. He could easily find out, but he didn’t design the experiment. Instead, it was the output of a computer program.

“I want to be very clear,” says Zymergen CEO Joshua Hoffman, heading off a persistent misunderstanding. “There is a human scientist in the loop, looking at the results and reality checking them.” But for interpreting data, generating hypotheses, and planning experiments, he says, the ultimate goal is “to get rid of human intuition.”

Zymergen is one of several companies with the same goal: harnessing artificial intelligence (AI) to augment—or even replace—the role of humans in the scientific process. “AI-powered biotech” is how it has been described, but Zymergen’s co-founders cringe at the term. “‘AI’ sounds like a robot playing chess,” says Aaron Kimball, the company’s chief technical officer. “I’m comfortable with ‘ML,’” Hoffman says, referring to machine learning, the branch of computer science that accounts for nearly all recent progress in AI. “That gets at what we do.”

[Graphic: Automating discovery] Science is a sequence. Day in and day out, the work in a laboratory isn’t so different from that in a factory. Papers come in and inspire new experiments, which lead to new findings and result in new papers. Various companies and research organizations are building tools based on robotics and AI that enhance or even replace the role of humans at each step of the way.

What Zymergen actually does is tune up industrial microbes that produce ingredients for biofuels, plastics, or drugs. Seeking to boost production, companies send their workhorse strains to Zymergen. The robots then explore and tinker with each microbe’s genome in a bid to engineer a version that makes its product compound more efficiently.

The problem is that the microbes that arrive at Zymergen are already “highly optimized,” Hoffman says. After years of research and breeding, the cells are very good at what they do. So squeezing out more efficiency requires exploring the genome deeply, conducting experiments, and following the data wherever they lead—doing science, in other words.

Zymergen is trying to accelerate that science. In traditional biology, Hoffman says, “you’ve got a person standing at a bench testing a limited number of hypotheses. Call it 10 per month.” Robots can do that physical part of the process faster—the machines at Zymergen run as many as 1000 experiments per week. But robots only follow orders: Giving them the right orders is the real bottleneck.

When I ask how his algorithms design experiments, Kimball begins with a simple premise. “You’ve got the original microbe here with about 5000 genes. Let’s say there are 10 ways you could change a given gene. So that’s 50,000 things you could be doing.” The experimental “campaign” begins by creating 1000 strains, each with a single deliberate mutation, he says. “Each one lives in a droplet. You feed it sugar, let it cook for a while, and then measure how much product you get.” Maybe 25 strains will produce slightly more of the target chemical. Those strains become breeding stock for the next round of experiments, and the rest go into the freezer.
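Kimball's description of one round of the campaign can be sketched in a few lines of Python. This is a toy illustration only, not Zymergen's software: the gene count, edits per gene, strains per round, and number kept come from his quote, while `measure_yield`, its noise level, and the random choice of which mutations to try are invented stand-ins for the wet-lab assay and the company's experiment-design algorithms.

```python
import random

# Figures quoted in the article.
N_GENES = 5000            # "about 5000 genes"
EDITS_PER_GENE = 10       # "10 ways you could change a given gene"
STRAINS_PER_ROUND = 1000  # strains built in one round
KEEP = 25                 # strains kept as breeding stock


def measure_yield(gene, edit, rng):
    """Hypothetical assay: yield relative to the parent strain (1.0).

    The Gaussian noise is an assumption made purely for illustration.
    """
    return 1.0 + rng.gauss(0.0, 0.02)


def run_round(seed=0):
    """Build single-mutation strains, measure them, and keep the best."""
    rng = random.Random(seed)
    search_space = N_GENES * EDITS_PER_GENE  # 50,000 possible single edits

    # Assumption: the mutations to test are drawn at random, without repeats.
    picks = rng.sample(range(search_space), STRAINS_PER_ROUND)

    results = []
    for idx in picks:
        gene, edit = divmod(idx, EDITS_PER_GENE)
        results.append((measure_yield(gene, edit, rng), gene, edit))

    # Highest yield first; the top KEEP become breeding stock.
    results.sort(reverse=True)
    return search_space, results[:KEEP], results[KEEP:]


if __name__ == "__main__":
    space, breeding_stock, freezer = run_round()
    print(f"search space: {space} single edits")
    print(f"kept {len(breeding_stock)}, froze {len(freezer)}")
```

In a real campaign the interesting part is what replaces the random draw: choosing which of the 50,000 edits to test next is the step the article says machine learning is meant to handle.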

But the path to discovery is anything but straight. Finding just the right combination of mutations requires a long, tortuous exploration of the genetic “landscape,” Kimball says. And just blindly walking uphill toward peaks of efficiency almost never leads to a major summit. That’s because if you just combine all the mutations yielding small improvements into a single microbe, they don’t add up to a big gain. Instead, the microbe becomes “sick,” he says, far less fit than the original strain. So choosing the right path, including detours into promising valleys, requires a mental map showing all the effects of all the mutations at once: a map with not just three dimensions, but thousands. Machine learning is needed to stay oriented.
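A toy model makes the point about stacked mutations concrete. The numbers below are invented for illustration and do not come from Zymergen: each mutation is assumed to add 1% to yield on its own, and every pair of mutations carried together is assumed to cost 0.4%. Real interactions between genes are far less uniform than this.

```python
from itertools import combinations

# Illustrative assumptions, not measured values.
GAIN_ALONE = 0.01     # each mutation adds 1% yield by itself
PAIR_PENALTY = 0.004  # each pair of co-occurring mutations costs 0.4%


def additive_prediction(mutations):
    """Naive expectation: single-mutation gains simply add up."""
    return 1.0 + GAIN_ALONE * len(mutations)


def toy_yield(mutations):
    """Yield relative to the parent strain (1.0) with pairwise penalties."""
    n_pairs = sum(1 for _ in combinations(mutations, 2))
    return 1.0 + GAIN_ALONE * len(mutations) - PAIR_PENALTY * n_pairs


if __name__ == "__main__":
    for k in (1, 3, 10):
        strain = tuple(range(k))
        print(k, round(additive_prediction(strain), 3), round(toy_yield(strain), 3))
```

Under these assumptions, three mutations do slightly better than one, but a strain carrying all ten falls below the parent even though every mutation helps individually. That is the sense in which walking straight uphill fails.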

But here’s the key difference: When the robots do finally discover the genetic changes that boost chemical output, they don’t have a clue about the biochemistry behind their effects.

Is it really science, then, if the experiments don’t deepen our understanding of how biology works? To Kimball, that philosophical point may not matter. “We get paid because it works, not because we understand why.”

So far, Hoffman says, Zymergen’s robotic lab has boosted the efficiency of chemical-producing microbes by more than 10%. That increase may not sound like much, but in the $160-billion-per-year sector of the chemical industry that relies on microbial fermentation, a fractional improvement could translate to more money than the entire $7 billion annual budget of the National Science Foundation. And the advantageous genetic changes that the robots find represent real discoveries, ones that human scientists probably wouldn’t have identified. Most of the output-boosting genes are not directly related to synthesizing the desired chemical, for instance, and half have no known function. “I’ve seen this pattern now in several different microbes,” Dean says. Finding the right genetic combinations without machine learning would be like trying to crack a safe with thousands of numbers on its dial. “Our intuitions are easily overwhelmed by the complexity,” he says.
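The comparison with the National Science Foundation budget is simple arithmetic, sketched here under one loud assumption that the article does not make: that a given percentage gain in microbial efficiency translates into the same percentage of the sector's annual dollar value. The two dollar figures are the ones quoted above; the function names are illustrative.

```python
SECTOR_VALUE = 160e9  # fermentation-based chemical sector, dollars per year
NSF_BUDGET = 7e9      # National Science Foundation annual budget, dollars


def breakeven_fraction(sector=SECTOR_VALUE, budget=NSF_BUDGET):
    """Fractional gain at which the added value equals the NSF budget."""
    return budget / sector


def value_of_gain(fraction, sector=SECTOR_VALUE):
    """Dollar value of a fractional gain, assuming a proportional effect."""
    return fraction * sector


if __name__ == "__main__":
    print(f"break-even gain: {breakeven_fraction():.3%}")
    print(f"value of a 10% gain: ${value_of_gain(0.10) / 1e9:.0f} billion")
```

On that assumption, any improvement above about 4.4% across the sector would exceed the NSF budget, and a 10% gain would be worth roughly $16 billion a year.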

Just how much of science can be delegated to machine-learning systems depends on whom you ask. “A lot,” says Ilias Tagkopoulos, a computer scientist at the University of California, Davis, who researches genomics. “There is no reason that we cannot let the data dictate what experiment we should do next, to maximize the information gain and come closer to our goals.” His seemingly endless list of applications includes predicting how bacteria will evolve in a changing hospital environment and designing better snack food—essentially any complex optimization problem for which improvement is well-defined.

If machines really are poised to replace humans in some scientific tasks, many scientists will embrace them. Unlike factory workers or taxi drivers, most research scientists would love to automate parts of their jobs. That’s especially true for molecular and cellular biology, in which the manual labor—squirting liquids, plating cells, counting colonies—is tedious and expensive. A graduate student’s tiniest mistake or imprecision can waste weeks of work. Even worse is a sloppy decision by the postdoc who designed the experiments for that student, wasting months of effort.

Yet some biologists describe frustration after enlisting AI to interpret data and design experiments. “We’re finding that current machine-learning methods are still not quite up to the task,” says Rhiju Das, a computational biochemist at Stanford University in Palo Alto, California, who studies how molecules fold in order to design new drugs. “They fail horribly in RNA design problems compared to [humans] who have access to the same data.” Although he doesn’t know why, tasks that involve “design” seem to require human intuition. Maybe Zymergen has stumbled on the rare part of biology that is well-suited to computer-controlled experimentation.

Max Hodak, who co-founded Transcriptic in Menlo Park, California—another biotech company exploring automation—also sees limits to the approach. He is confident that robots will take much of the tedium out of lab work. Soon, he says, “if you’re still using your hands, you won’t be doing science.” But the brain of the biologist won’t be replaced anytime soon, simply because the natural world is so complex. Evolution, Hodak says, “is responsible for the richness of biology, and it’s also why it’s so hard to understand. It’s irreducible complexity.” AI could give biologists limited help in designing better experiments, Hodak says, but he worries that outsourcing any more of the scientific process will prove “much more complicated than we expect.”

And even if AI-controlled research works, will humans understand what the computer discovers? The calculations behind a result may remain a black box. “An intriguing possibility is that we’re closing the era of ‘comprehensible’ science,” says Adrien Treuille, a computer scientist at Carnegie Mellon University in Pittsburgh, Pennsylvania, who works with molecular biologists. Researchers may come to rely on computers not only to do the science, but also to explain it: Some evidence for biological theories may be so complex that accepting it requires faith in the computation.

In that case, should scientists include their computers as co-authors on their papers? “I wouldn’t do that,” says Michael Schmidt, CEO of Nutonian, a Boston-based company applying AI to scientific discovery. But then he hedges: “Well, when they can read and make sense of the paper, then they can be an author.”