Often a cancer’s grade and stage can be used to predict how the patient will fare. They also can help clinicians decide how, and how aggressively, to treat the disease. This classification system doesn’t always work well for lung cancer, however. In particular, the lung cancer subtypes of adenocarcinoma and squamous cell carcinoma can be difficult to tell apart when examining tissue culture slides. Furthermore, the stage and grade of a patient’s cancer don’t always correlate with prognosis, which can vary widely. Fifty percent of stage-1 adenocarcinoma patients, for example, die within five years of their diagnosis, while about 15 percent survive more than 10 years.

The researchers used 2,186 images, obtained from patients with either adenocarcinoma or squamous cell carcinoma, from a national database called The Cancer Genome Atlas. The database also contained information about the grade and stage assigned to each cancer and how long each patient lived after diagnosis.

The researchers then used the images to “train” a computer software program to identify many more cancer-specific characteristics than can be detected by the human eye — nearly 10,000 individual traits, versus the several hundred usually assessed by pathologists. These characteristics included not just cell size and shape, but also the shape and texture of the cells’ nuclei and the spatial relations among neighboring tumor cells.
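To make the idea of quantitative image traits concrete, here is a minimal sketch of how a few such measurements could be computed. It is not the software used in the study. The function names, the toy mask and the centroid coordinates are all hypothetical, and it assumes each cell or nucleus has already been outlined as a binary mask.

```python
import math

def shape_features(mask):
    """Return area, perimeter and circularity for a binary mask (rows of 0/1)."""
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            # Count each pixel edge that borders background or the image boundary.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if rr < 0 or rr >= rows or cc < 0 or cc >= cols or not mask[rr][cc]:
                    perimeter += 1
    circularity = 4 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    return {"area": area, "perimeter": perimeter, "circularity": circularity}

def mean_nearest_neighbor_distance(centroids):
    """Average distance from each cell centroid to its closest neighbor (needs 2+ cells)."""
    total = 0.0
    for i, p in enumerate(centroids):
        total += min(math.dist(p, q) for j, q in enumerate(centroids) if j != i)
    return total / len(centroids)

# Hypothetical toy input: a 4x4 square "nucleus" inside a 6x6 image.
square = [[1 if 1 <= r <= 4 and 1 <= c <= 4 else 0 for c in range(6)] for r in range(6)]
features = shape_features(square)

# Hypothetical centroids of three neighboring cells, in pixel coordinates.
spacing = mean_nearest_neighbor_distance([(0, 0), (3, 4), (6, 8)])
```

Size, shape and spacing are only three of the kinds of traits the article mentions. Texture measurements would need the pixel intensities as well as the outline.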

“We began the study without any preconceived ideas, and we let the software determine which characteristics are important,” said Snyder, who is the Stanford W. Ascherman, MD, FACS, Professor in Genetics. “In hindsight, everything makes sense. And the computers can assess even tiny differences across thousands of samples many times more accurately and rapidly than a human.”

Bringing pathology into the 21st century

The researchers homed in on a subset of cellular characteristics identified by the software that could best be used to differentiate tumor cells from the surrounding noncancerous tissue, identify the cancer subtype, and predict how long each patient would survive after diagnosis. They then validated the ability of the software to accurately distinguish short-term survivors from those who lived significantly longer on another dataset of 294 lung cancer patients from the Stanford Tissue Microarray Database.
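The two steps described above, narrowing the traits down to the most informative ones and then checking the result on a separate group of patients, can be sketched roughly as follows. This is an illustration under stated assumptions, not the method used in the study: the data are synthetic, the feature ranking is a simple gap between group averages, and the classifier is a basic nearest-centroid rule chosen for brevity. All names and numbers in the code are hypothetical.

```python
import random
import statistics

def make_cohort(n, n_features, informative, rng):
    """Synthetic cohort: label 1 = longer-term survivor, 0 = short-term survivor."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for f in informative:
            row[f] += 3.0 * label  # only these features carry any signal
        X.append(row)
        y.append(label)
    return X, y

def rank_features(X, y):
    """Order features by the standardized gap between the two group means."""
    scores = []
    for f in range(len(X[0])):
        a = [row[f] for row, lab in zip(X, y) if lab == 0]
        b = [row[f] for row, lab in zip(X, y) if lab == 1]
        spread = statistics.pstdev(a + b) or 1.0
        scores.append((abs(statistics.mean(a) - statistics.mean(b)) / spread, f))
    return [f for _, f in sorted(scores, reverse=True)]

def fit_centroids(X, y, keep):
    """Average the selected features within each group."""
    out = {}
    for lab in (0, 1):
        rows = [row for row, l in zip(X, y) if l == lab]
        out[lab] = [statistics.mean(r[f] for r in rows) for f in keep]
    return out

def predict(row, cents, keep):
    """Assign the group whose centroid is closest in the selected features."""
    point = [row[f] for f in keep]
    return min(cents, key=lambda lab: sum((p - c) ** 2 for p, c in zip(point, cents[lab])))

rng = random.Random(0)
informative = [2, 7, 11]  # hypothetical "true" predictive traits
X_train, y_train = make_cohort(200, 20, informative, rng)   # training cohort
X_valid, y_valid = make_cohort(100, 20, informative, rng)   # separate validation cohort

selected = rank_features(X_train, y_train)[:3]
cents = fit_centroids(X_train, y_train, selected)
accuracy = sum(predict(r, cents, selected) == lab
               for r, lab in zip(X_valid, y_valid)) / len(y_valid)
```

The point of the second cohort is that the selected traits and the fitted rule never see those patients, so the accuracy measured there is a fairer estimate of how the approach would do on new cases.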

Identifying previously unknown physical characteristics that can predict cancer severity and survival times is also likely to lead to greater understanding of the molecular processes of cancer initiation and progression. In particular, Snyder anticipates that the machine-learning system described in this study will be able to complement the emerging fields of cancer genomics, transcriptomics and proteomics. Cancer researchers in these fields study the DNA mutations and the gene and protein expression patterns that lead to disease.

“We launched this study because we wanted to begin marrying imaging to our ‘omics’ studies to better understand cancer processes at a molecular level,” Snyder said. “This brings cancer pathology into the 21st century and has the potential to be an awesome thing for patients and their clinicians.”

The work is an example of Stanford Medicine’s focus on precision health, the goal of which is to anticipate and prevent disease in the healthy and precisely diagnose and treat disease in the ill.

Stanford co-authors of the study are former postdoctoral scholar Ce Zhang, PhD; professor of pathology Gerald Berry, MD; professor of bioengineering, of genetics and of medicine Russ Altman, MD, PhD; and assistant professor of computer science Christopher Re, PhD.

The study was supported by the National Cancer Institute and the National Institutes of Health (grants U01CA142555 and 5U24CA160036-05).

Stanford’s Department of Genetics also supported the work.