In eye care, artificial intelligence systems have shown they can match the accuracy of doctors in diagnosing specific diseases. But a new system designed by Google DeepMind and British doctors goes a crucial step further: It can show users how it reached its conclusions.

A study published Monday in Nature Medicine reports that the DeepMind system can identify dozens of diseases and point out the portions of optical coherence tomography scans that it relies upon to make its diagnoses. That’s a crucial factor in validating the safety and efficacy of AI technologies being developed for use in diagnosing or recommending treatments for a broad range of diseases, from cancer to neurological and vision problems.

The paper states that the system made the right referral recommendation in more than 94 percent of cases, based on a review of historic patient scans at Moorfields Eye Hospital in London, and performed as well as, or better than, top eye specialists who examined the same scans. Experts said that level of accuracy is impressive on such an open-ended query. But the bigger breakthrough is the system’s solution to the so-called “black box” problem of artificial intelligence, which refers to the inability of such systems to explain their reasoning.

The opacity of AI systems is an impediment to their adoption by health providers who want the ability to understand their rationale and training, and ensure recommendations are based on science instead of supposition. In cancer care, for example, IBM’s Watson for Oncology has faced criticism for a lack of clarity surrounding the system’s training and the basis of its treatment recommendations.

“Doctors and patients don’t want just a black box answer, they want to know why,” said Ramesh Raskar, an associate professor at the Massachusetts Institute of Technology who studies computational imaging. “There is a standard of care, and if the AI technique doesn’t follow that standard of care, people are going to be uncomfortable with it, even if it’s the smartest thing in the world.”

The new study, conducted by DeepMind researchers and British eye doctors, is the latest of many to describe an advancement in the use of artificial intelligence in medicine. Technology giants such as Apple, Amazon, IBM, and Microsoft are all developing products for use in health care. Drug makers are looking to use AI to speed the development of new treatments, while hospitals are deploying it to make scheduling and care coordination more efficient, and hope to use it to help doctors diagnose diseases and recommend treatments faster and more accurately.

Startups and individual doctors are also developing AI-based systems. Another paper in Nature Medicine on Monday describes a technology, tested by researchers at the Icahn School of Medicine at Mount Sinai in New York, that uses a deep neural network to identify evidence of acute neurological events such as strokes in CT scans, potentially improving outcomes by speeding up diagnosis and treatment.

Much of the recent research and development in AI has focused on eye care, an area where the heavy use of scans — and a shortage of specialists in some regions — makes it ripe for such technology. In April, the Food and Drug Administration approved the first software that can diagnose diabetic retinopathy, a common eye disease that afflicts patients with diabetes. That product, called IDx-DR, can identify the disease without a clinician’s involvement.

In addition to its transparency, the system involved in the DeepMind Health study stands out because it is not focused on a single condition, but can identify 50 different eye diseases based on its review of imaging data. It analyzes three-dimensional scans, making it capable of processing more data than prior AI systems that relied on two-dimensional images. The study authors also demonstrated their system can achieve a high level of accuracy on multiple types of scanning machines with a limited amount of additional training.

The system employs a novel architecture that uses two neural networks: the first translates raw optical coherence tomography scans into a tissue map, and the second analyzes that map to identify signs of eye disease. A user can watch a video showing the portions of the scan the system relies on to reach its conclusions, as well as the confidence levels it assigns to each possible diagnosis.
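The two-stage idea can be sketched in a few lines of Python. This is a hypothetical toy illustration, not the published model: the function names, thresholds, tissue labels, and diagnosis labels are all invented here, and the real system uses three-dimensional neural networks rather than simple rules. What the sketch preserves is the key structural point: segmentation (raw scan to tissue map) is decoupled from classification (tissue map to per-diagnosis confidence scores).

```python
# Hypothetical two-stage pipeline, loosely mirroring the architecture
# described in the article. All names and thresholds are invented.

def segment(scan):
    """Stage 1: map raw intensity values to crude tissue labels."""
    labels = []
    for v in scan:
        if v < 0.3:
            labels.append("fluid")
        elif v < 0.7:
            labels.append("retina")
        else:
            labels.append("drusen")
    return labels

def classify(tissue_map):
    """Stage 2: turn tissue-label frequencies into confidence scores."""
    total = len(tissue_map)
    counts = {}
    for t in tissue_map:
        counts[t] = counts.get(t, 0) + 1
    # Toy rule: each condition's confidence tracks its tissue fraction,
    # so the scores are interpretable and sum to 1.
    return {
        "macular edema": counts.get("fluid", 0) / total,
        "macular degeneration": counts.get("drusen", 0) / total,
        "healthy": counts.get("retina", 0) / total,
    }

scan = [0.1, 0.5, 0.8, 0.6, 0.2, 0.9]  # toy 1D stand-in for a 3D scan
confidences = classify(segment(scan))
```

Because the stages are decoupled, adapting to a new scanning machine would in principle only require retraining the segmentation stage, which is consistent with the study's report that the system reached high accuracy on multiple scanner types with limited additional training.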

While the performance of this system is promising, it is not going to be used any time soon in hospitals or eliminate the need for human specialists to review scans. The study authors said the system still needs years of refinement and testing, including a randomized controlled trial, before it could be used in clinical care. Even then, it would still require some degree of human oversight.

“The key thing is that we do prospective studies in multiple different locations before we actually let this loose on patients,” said Dr. Pearse Keane, an ophthalmologist at Moorfields Eye Hospital and a co-author of the study. “We all think this technology could be transformative, but we also acknowledge that it’s not magic and we have to apply the same level of rigor to it that we would apply to any intervention.”

The level of evidence needed to deploy AI-based diagnostic or treatment advisers remains a matter of debate and varies widely by the type of product and the company selling it.

The approach of DeepMind and Moorfields contrasts with that of IBM and its clinical partner, Memorial Sloan Kettering Cancer Center of New York. IBM has aggressively marketed and sold its Watson for Oncology system to hospitals across Asia without publishing prospective studies about its impact on the decision-making of doctors or outcomes of patients. It has done so despite criticism from its own oncologists about the lack of evidence and complaints from clients who have cited examples of biased and inaccurate recommendations.

While AI is still largely experimental in medicine, it offers huge potential for changing the delivery of care. Optical coherence tomography is one of the most common imaging procedures in the U.S., with more than 5.35 million scans performed on Medicare beneficiaries in 2014. In some areas, a shortage of specialists is making it more difficult to provide accurate diagnoses and timely care for patients facing vision loss.

In the United Kingdom, the shortage of specialists is more severe than it is in the United States. More than 2 million people are living with sight loss in the U.K., a number that is expected to double by 2050, according to the study. That means some patients with serious conditions end up waiting many weeks, or even months, to get treated.

Keane referenced an experience with a patient who was facing a total loss of sight due to macular degeneration, the leading cause of vision loss. She sought an urgent medical appointment when she began experiencing problems with her remaining good eye, but had to wait six weeks to see a specialist. “Clearly there are huge capacity issues that are faced all around the world,” Keane said. “If it was my mother or a family member of mine, I would want them seen within six days, not six weeks.”

The artificial intelligence system used in the study demonstrated potential for reducing the backlog through automation. In the study, it was tasked with diagnosing 50 different conditions and triaging patients using Moorfields’ referral system. Its accuracy was tested on historic optical coherence tomography scans of 997 patients whose scans and files were also examined by eight human experts.

On the most urgent referrals, the computer matched the accuracy of the two top retinal specialists and performed better than two other specialists and four optometrists. Across all cases, it had a slightly lower error rate (5.5 percent) than the top two human specialists (6.7 percent and 6.8 percent) when the human experts were limited to a review of the scan results. Several of the specialists nearly matched the computer’s performance when they were able to review patient notes and other supplemental materials, according to the study.

Dr. John Miller, an ophthalmologist at Massachusetts Eye and Ear hospital who was not involved in the study, said the biggest question facing the system in the study is who will be able to use it — primary care doctors, pharmacists, or specialists.

It could prove useful in all those settings, Miller said, noting that its use at the primary care level could be especially useful by streamlining the referral system to help patients get timely care. “If we can be confident in a system that can identify retinal-specific disease at an early stage, that can prompt an earlier appointment for the patient and potentially save sight,” he said.

Miller added that its use also might save time, money, and worry wasted on incorrect diagnosis — a situation that arises in 10 to 25 percent of the cases he sees at Massachusetts Eye and Ear. “I think it’s going to help us see more of the right types of patients, instead of screening some patients without the suggested disease,” he said. “I view it as augmenting my practice, not threatening it.”