Most smart software in use today specializes in one type of data, whether that means interpreting text or guessing at the content of photos. Software in development at IBM has to handle several types at once. It’s in training to become a radiologist’s assistant.

The software is code-named Avicenna, after an 11th-century philosopher who wrote an influential medical encyclopedia. It can identify anatomical features and abnormalities in medical images such as CT scans, and it also draws on text and other data in a patient’s medical record to suggest possible diagnoses and treatments.

Avicenna is intended to help cardiologists and radiologists speed up their work and reduce errors, and for now it is specialized to cardiology and breast radiology. It is being tested and tuned using anonymized medical images and records. But Tanveer Syeda-Mahmood, a researcher at IBM’s Almaden research lab near San Jose, California, and chief scientist on the project, says that her team and others in the company are already preparing to test the software outside the lab on large volumes of real patient data. “We’re getting into preparations for commercialization,” she says.

Avicenna “looks” at medical images using a suite of image-processing algorithms, each with its own specialty. Some have been trained to judge how far down a patient’s chest a given CT scan slice was taken, for example. Others can identify organs or label abnormalities such as blood clots. Some of these image components use a technique called deep learning, which has recently produced major leaps in the accuracy of image recognition software (see “AI Advances Make It Possible to Search, Shop with Images”).

The image-processing algorithms work alongside others that have been trained to interpret text and test results in medical records. Avicenna has a “reasoning” system that draws on the outputs of all those components to suggest possible diagnoses for a patient. It shows a summary of that reasoning to the person working with the software.
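To make the idea of combining signals concrete, here is a toy sketch in Python. It is not IBM’s method, which the article does not describe in detail. It assumes each component reports a finding as a source, a candidate diagnosis, and a weight, and it simply adds up the weights. The function name `fuse_findings`, the example findings, and all the numbers are invented for illustration.

```python
from collections import defaultdict


def fuse_findings(findings):
    """Rank candidate diagnoses by the summed weight of supporting findings.

    `findings` is a list of (source, diagnosis, weight) tuples. This additive
    scheme is an assumption made for illustration only.
    """
    scores = defaultdict(float)
    sources = defaultdict(list)
    for source, diagnosis, weight in findings:
        scores[diagnosis] += weight
        sources[diagnosis].append(source)  # kept so the reasoning can be summarized
    # Highest total weight first.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [(d, scores[d], sources[d]) for d in ranked]


# Hypothetical component outputs; the weights are made up.
findings = [
    ("image", "pulmonary embolism", 0.5),        # e.g. a possible clot flagged on a scan
    ("record text", "pulmonary embolism", 0.25),  # e.g. a relevant note in the record
    ("record text", "asthma", 0.25),
]

for diagnosis, score, supporting in fuse_findings(findings):
    print(f"{diagnosis}: {score:.2f} (from {', '.join(supporting)})")
```

Keeping the list of contributing sources next to each score is what would let such a system show a summary of its reasoning, rather than only a final answer.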

IBM’s Avicenna software highlighted possible embolisms on this CT scan in green, finding mostly the same problems as a human radiologist who marked up the image in red.

In a demo of the system, Syeda-Mahmood showed Avicenna taking on the case of a 28-year-old woman complaining of shortness of breath. The patient’s medical record included pulmonary angiogram images of the blood vessels around her lungs, some blood tests, and text noting that her mother had experienced multiple miscarriages.

Avicenna knew that such a family history can be associated with a tendency to form blood clots, a condition that can also lead to miscarriages, and that knowledge changed how it analyzed the angiogram images. The software suggested pulmonary embolism as the most likely diagnosis, and highlighted several possible embolisms in the patient’s left and right pulmonary arteries. When a radiologist independently reviewed the same case, the radiologist made the same diagnosis and highlighted largely the same embolisms.

IBM’s researchers are not the only ones trying to build software that combines text and other medical record data to work like a radiologist. But Kenji Suzuki, an associate professor at the medical imaging research center at the Illinois Institute of Technology, says that IBM’s commercial ambitions for Avicenna are unique. “No other company is attempting or envisioning that total integration of text, structured data, and medical imaging,” he says.

However, based on what he has seen from the project so far, Suzuki says that Avicenna’s image processing and diagnostic powers still need to become more accurate and flexible. Even after making those improvements, to make significant sales, IBM will need to have its automated helper integrate with existing hospital IT systems and prove it has economic benefits—“such as reducing the total hospital cost, insurance reimbursement, or the risks of lawsuits,” says Suzuki.

Syeda-Mahmood says that making Avicenna more accurate is one of her team’s top priorities, although the goal is to help radiologists, not replace them. And she believes IBM has an advantage over others trying to build this kind of software.

That’s because making a machine-learning system more accurate requires feeding it lots of example data to tune its abilities. IBM has already amassed a very large collection of medical images and records and is in the process of making it much larger.

Last year IBM acquired a collection of billions of medical images when it purchased Merge Healthcare. Those images are not yet available to Avicenna, but when they are, they could help make the software more accurate, says Syeda-Mahmood. The project may also get a boost from 50 million anonymized electronic health records that IBM received when it acquired a startup called Explorys, also last year.