One of the biggest obstacles is the lack of diagnostic testing, says Gallivan. “Ideally, we would have a test to detect the novel coronavirus immediately and be testing everyone at least once a day,” he says. We also don’t really know what behaviors people are adopting (who is working from home, who is self-quarantining, who is or isn’t washing hands) or what effect those behaviors might be having. If you want to predict what’s going to happen next, you need an accurate picture of what’s happening right now.

It’s not clear what’s going on inside hospitals, either. Ahmer Inam at Pactera Edge, a data and AI consultancy, says prediction tools would be a lot better if public health data wasn’t locked away within government agencies as it is in many countries, including the US. This means an AI must lean more heavily on readily available data like online news. “By the time the media picks up on a potentially new medical condition, it is already too late,” he says.

But if AI needs much more data from reliable sources to be useful in this area, strategies for getting it can be controversial. Several people I spoke to highlighted this uncomfortable trade-off: to get better predictions from machine learning, we need to share more of our personal data with companies and governments.

Darren Schulte, an MD and CEO of Apixio, which has built an AI to extract information from patients’ records, thinks that medical records from across the US should be opened up for data analysis. This could allow an AI to automatically identify individuals who are most at risk from Covid-19 because of an underlying condition. Resources could then be focused on those people who need them most. The technology to read patient records and extract life-saving information exists, says Schulte. The problem is that these records are split across multiple databases and managed by different health services, which makes them harder to analyze. “I’d like to drop my AI into this big ocean of data,” he says. “But our data sits in small lakes, not a big ocean.”

Health data should also be shared between countries, says Inam: “Viruses don’t operate within the confines of geopolitical boundaries.” He thinks countries should be forced by international agreement to release real-time data on diagnoses and hospital admissions, which could then be fed into global-scale machine-learning models of a pandemic.

Of course, this may be wishful thinking. Different parts of the world have different privacy regulations for medical data. And many of us already balk at making our data accessible to third parties. New data-processing techniques, such as differential privacy and training on synthetic data rather than real data, might offer a way through this debate. But this technology is still being finessed. Finding agreement on international standards will take even more time.
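
To make the first of those techniques concrete, here is a minimal sketch of the Laplace mechanism, one common building block of differential privacy, written in Python. It is an illustration under stated assumptions, not a description of any system mentioned in this article: the patient records, the `private_count` function, and the `epsilon` value are all hypothetical. The idea is to add calibrated random noise to an aggregate statistic, so that the published number reveals little about whether any one person is in the data.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution centered on zero with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records matching `predicate`.

    Adding or removing one person changes a count by at most 1 (sensitivity 1),
    so Laplace noise with scale 1/epsilon gives epsilon-differential privacy
    for this single query. Smaller epsilon means more noise and more privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical, made-up records: (age, has_underlying_condition)
patients = [(34, False), (71, True), (58, True), (45, False), (80, True)]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
noisy = private_count(patients, lambda p: p[1], epsilon=1.0, rng=rng)
print(f"noisy count of at-risk patients: {noisy:.2f} (true count is 3)")
```

Any single noisy answer can be off by a few people, but averaged over a large population the error becomes small relative to the total. That is the trade the technique offers: useful aggregate statistics without exposing individuals.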

For now, we must make the most of what data we have. Wang’s answer is to make sure humans are around to interpret what machine-learning models spit out and to discard predictions that don’t ring true. “If one is overly optimistic or reliant on a fully autonomous predictive model, it will prove problematic,” he says. AIs can find hidden signals in the data, but humans must connect the dots.

Early diagnosis

As well as predicting the course of an epidemic, many hope that AI will help identify people who have been infected. AI has a proven track record here. Machine-learning models for examining medical images can catch early signs of disease that human doctors miss, from eye disease to heart conditions to cancer. But these models typically require a lot of data to learn from.

A handful of preprint papers have been posted online in the last few weeks suggesting that machine learning can diagnose Covid-19 from CT scans of lung tissue if trained to spot telltale signs of the disease in the images. Alexander Selvikvåg Lundervold at the Western Norway University of Applied Sciences in Bergen, Norway, who is an expert on machine learning and medical imaging, says we should expect AI to be able to detect signs of Covid-19 in patients eventually. But it is unclear whether imaging is the way to go. For one thing, physical signs of the disease may not show up in scans until some time after infection, which would limit the usefulness of imaging as an early diagnostic.

Dr. Fan Zhongjie, a respiratory specialist in charge of critical COVID-19 patients in central China's Hubei province, reads a CT scan image. AP Images

What’s more, since so little training data is available so far, it’s hard to assess the accuracy of the approaches posted online. Most image recognition systems—including those trained on medical images—are adapted from models first trained on ImageNet, a widely used data set encompassing millions of everyday images. “To classify something simple that's close to ImageNet data, such as images of dogs and cats, can be done with very little data,” says Lundervold. “Subtle findings in medical images, not so much.”

That’s not to say it won’t happen—and AI tools could potentially be built to detect early stages of disease in future outbreaks. But we should be skeptical about many of the claims of AI doctors diagnosing Covid-19 today. Again, sharing more patient data will help, and so will machine-learning techniques that allow models to be trained even when little data is available. For example, few-shot learning, where an AI can learn patterns from only a handful of results, and transfer learning, where an AI already trained to do one thing can be quickly adapted to do something similar, are promising advances—but still works in progress.

Cure-all

Data is also essential if AI is to help develop treatments for the disease. One technique for identifying possible drug candidates is to use generative design algorithms, which produce a vast number of potential results and then sift through them to highlight those that are worth looking at more closely. This technique can be used to quickly search through millions of biological or molecular structures, for example.
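
The generate-then-sift loop described above can be sketched in a few lines of Python. This is a toy under loud assumptions: candidates are random strings over a made-up four-letter alphabet, and `toy_score` is a hypothetical stand-in for the learned models or lab assays a real drug-discovery pipeline would use to judge a molecule. Only the overall shape (produce many candidates, score them, keep a shortlist) reflects the technique.

```python
import random

ALPHABET = "ACGT"  # hypothetical building blocks, chosen only for illustration


def generate_candidates(n: int, length: int, rng: random.Random) -> list[str]:
    """Produce n random candidate structures of the given length."""
    return ["".join(rng.choice(ALPHABET) for _ in range(length)) for _ in range(n)]


def toy_score(candidate: str, reference: str) -> float:
    """Fraction of positions where the candidate matches a reference pattern.

    A real pipeline would replace this with a predictive model or simulation.
    """
    matches = sum(a == b for a, b in zip(candidate, reference))
    return matches / len(reference)


def shortlist(candidates: list[str], reference: str, top_k: int) -> list[str]:
    """Keep the top_k highest-scoring candidates for closer inspection."""
    ranked = sorted(candidates, key=lambda c: toy_score(c, reference), reverse=True)
    return ranked[:top_k]


rng = random.Random(42)  # fixed seed only so the sketch is repeatable
reference = "ACGTACGTAC"  # made-up target pattern
pool = generate_candidates(10_000, len(reference), rng)
best = shortlist(pool, reference, top_k=5)
for candidate in best:
    print(candidate, round(toy_score(candidate, reference), 2))
```

A real system would swap the random generator for a trained generative model and the string comparison for predictions of properties such as binding or toxicity, but the human role is the same: scientists look closely only at the shortlist.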

SRI International is collaborating on such an AI tool, which uses deep learning to generate many novel drug candidates that scientists can then assess for efficacy. Tools like this could greatly speed up the early search for candidates, but it can still take many months before a promising candidate becomes a viable treatment.

In theory, AIs could be used to predict the evolution of the coronavirus too. Inam imagines running unsupervised learning algorithms to simulate all possible evolution paths. You could then add potential vaccines to the mix and see if the viruses mutate to develop resistance. “This will allow virologists to be a few steps ahead of the viruses and create vaccines in case any of these doomsday mutations occur,” he says.

It’s an exciting possibility, but a far-off one. We don’t yet have enough information about how the virus mutates to be able to simulate it this time around.

In the meantime, the ultimate barrier may be the people in charge. “What I’d most like to change is the relationship between policymakers and AI,” says Wang. AI will not be able to predict disease outbreaks by itself, no matter how much data it gets. Getting leaders in government, businesses, and health care to trust these tools will fundamentally change how quickly we can react to disease outbreaks, he says. But that trust needs to come from a realistic view of what AI can and cannot do now—and what might make it better next time.

Making the most of AI will take a lot of data, time, and smart coordination between many different people. All of which are in short supply right now.