LOOKING for needles in haystacks is boring. But computers do not get bored. Contracting out to machines the tedious business of assessing the dangerousness of cancer cells in histological microscope slides ought thus to be an obvious thing to do. Cervical-cancer smear tests aside, however, such electronic intrusions into the pathology laboratory are limited. Grading cancer cells into “indolent” and “aggressive”, and hazarding an opinion about whether they spell a treatable condition or an untreatable one, has remained the realm of the human expert. But not for much longer, if Daphne Koller, a computer scientist at Stanford University, and her colleagues have their way. They report in this week's Science Translational Medicine that they have written a program which can distinguish between grades of breast-cancer cell—and can do so in a way that provides a more accurate prognosis than a human pathologist can manage.

Previous attempts to build a computerised pathologist of this sort involved the designers carefully specifying which characteristics of the samples being examined were most important. For example, they would tell the computer to measure the three traits human pathologists use to determine a tumour's grade: the percentage of its cells that are tubelike; the diversity of appearance of the cell nuclei; and the proportion of cancer cells undergoing division. However, people are excellent at pattern recognition, and skilled pathologists rely not just on these relatively easy-to-describe traits, but also on less well-defined characters that years of experience have taught them are significant too. Restricting computerised pathologists to the well-characterised bits of the process therefore inevitably leaves them performing worse than their human counterparts.

Dr Koller's Computational Pathologist (C-Path), by contrast, lets the system work out for itself what the most important features of a tumour are. She and her colleagues started by setting down 6,642 characters the program might choose from when it assessed images of biopsies from breast-cancer patients, but did not tell it which to prefer. Some of the characters they offered were inherent to the cancer cells. Others were features of the surrounding “stromal” cells, which are not, themselves, malignant, but act to support a tumour. And some were not features of individual cells at all but, rather, measured relations between cells (for example, the average distance between cancer-cell nuclei) and the context cells found themselves in (for example, whether they occurred in large clusters or were frequently interspersed with stroma).
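To make the idea of a "relational" or "contextual" character concrete, here is a minimal sketch in Python. It is not C-Path's code: the function names, the coordinates and the exact definitions are assumptions made for illustration. It reads "average distance between cancer-cell nuclei" as the mean distance from each nucleus to its nearest neighbour, and measures interspersion with stroma as the share of cancer nuclei whose closest neighbouring cell is stromal.

```python
import math


def _nearest(point, others):
    """Distance from point to the closest of the other points."""
    return min(math.hypot(point[0] - o[0], point[1] - o[1]) for o in others)


def mean_nearest_neighbour_distance(nuclei):
    """Average distance from each nucleus to its closest fellow nucleus."""
    if len(nuclei) < 2:
        raise ValueError("need at least two nuclei")
    total = 0.0
    for i, nucleus in enumerate(nuclei):
        others = nuclei[:i] + nuclei[i + 1:]
        total += _nearest(nucleus, others)
    return total / len(nuclei)


def stromal_neighbour_fraction(cancer, stroma):
    """Share of cancer nuclei that sit closer to a stromal cell than to
    any other cancer nucleus (a crude measure of interspersion)."""
    if not cancer or not stroma:
        raise ValueError("need at least one cancer and one stromal nucleus")
    count = 0
    for i, nucleus in enumerate(cancer):
        other_cancer = cancer[:i] + cancer[i + 1:]
        to_stroma = _nearest(nucleus, stroma)
        to_cancer = _nearest(nucleus, other_cancer) if other_cancer else math.inf
        if to_stroma < to_cancer:
            count += 1
    return count / len(cancer)


# Hypothetical nucleus centres, in arbitrary image units.
cancer_nuclei = [(0, 0), (3, 4), (6, 8)]
stromal_nuclei = [(0, 1)]

print(mean_nearest_neighbour_distance(cancer_nuclei))
print(stromal_neighbour_fraction(cancer_nuclei, stromal_nuclei))
```

Each such number becomes one column in a long table of characters, one row per biopsy image, from which a program can then pick and choose.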

The team initially trained and tested the program on 248 breast-cancer samples from the Netherlands Cancer Institute. It was fed images of slides from these patients, along with information on how long each patient had survived after the sample was taken. That done, they tested it on a second set of samples, this time from 286 breast-cancer patients at Vancouver General Hospital. They found it was able both to grade the slides and to predict, in a way human pathologists could not, whether a patient would survive for five years after treatment.
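The train-on-one-cohort, test-on-another procedure can be sketched as follows. The article does not say which learning method C-Path uses, so this example substitutes a generic one, logistic regression with an L1 penalty, which tends to set the weights of unhelpful features to exactly zero and so "chooses" features for itself. The data are synthetic, and every name and number apart from the cohort sizes of 248 and 286 is an assumption.

```python
import math
import random


def soft_threshold(value, amount):
    """Shrink value toward zero by amount; small values become exactly 0."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(row, weights, bias):
    """Estimated probability of the positive outcome for one sample."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


def fit_l1_logistic(rows, labels, penalty, step=0.1, epochs=200):
    """Proximal gradient descent: a gradient step on the logistic loss,
    then soft-thresholding of each weight (the bias is not penalised)."""
    n, d = len(rows), len(rows[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        errors = [predict(r, weights, bias) - y for r, y in zip(rows, labels)]
        bias -= step * sum(errors) / n
        for j in range(d):
            grad = sum(e * r[j] for e, r in zip(errors, rows)) / n
            weights[j] = soft_threshold(weights[j] - step * grad, step * penalty)
    return weights, bias


def accuracy(rows, labels, weights, bias):
    """Fraction of samples whose outcome is predicted correctly."""
    hits = sum((predict(r, weights, bias) >= 0.5) == (y == 1)
               for r, y in zip(rows, labels))
    return hits / len(rows)


def make_cohort(n, rng, n_features=20):
    """Synthetic cohort: only the first two features carry any signal."""
    rows, labels = [], []
    for _ in range(n):
        row = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        p_survive = sigmoid(3.0 * row[0] - 3.0 * row[1])
        rows.append(row)
        labels.append(1 if rng.random() < p_survive else 0)
    return rows, labels


rng = random.Random(0)
train_rows, train_labels = make_cohort(248, rng)   # stands in for the first cohort
valid_rows, valid_labels = make_cohort(286, rng)   # stands in for the second cohort

weights, bias = fit_l1_logistic(train_rows, train_labels, penalty=0.05)
kept = [j for j, w in enumerate(weights) if w != 0.0]
print("features kept:", kept)
print("accuracy on the independent cohort:",
      accuracy(valid_rows, valid_labels, weights, bias))
```

The point of scoring the model on the second cohort is that it never saw those samples during training, so a good result there cannot come from memorising the first set.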

When Dr Koller looked at which 11 features were the most robust predictors of survival, she discovered that only eight were characteristic of the tumour cells themselves. The other three were stromal characters. The fact that three stromal features were on the list suggests that the stroma influences whether or not a cancer progresses and kills the patient. That is important information because, hitherto, pathologists have focused on the cancer cells themselves and ignored the stroma. Thus C-Path seems not only to outperform human pathologists, but also to have something to teach them about cancer biology.