As a biomedical informatics researcher, Nigam Shah spends his days using math to try to make sense of giant, unwieldy data sets. He’s used data mining to identify off-label drug use and to create widgets to add to electronic health records that may help doctors learn about best practices for the treatment of a disease in the absence of a randomized controlled trial. In one of his current projects, he’s using machine learning, a field of study that has helped develop self-driving cars and speech recognition software, to try to identify people with an under-diagnosed and potentially deadly genetic disorder.

Shah and his Stanford University colleague Joshua Knowles, a cardiologist, are searching for people with familial hypercholesterolemia, a genetic disorder that causes high levels of LDL cholesterol, the so-called “bad cholesterol,” in the blood starting in utero. Men with the disorder have a 50 percent chance of having a heart attack by age 50; women have a 30 percent chance by age 60. With early treatment, however, those risks can be dramatically reduced. The problem is that most people who have FH don’t know it: studies estimate that only around 10 percent of people in the U.S. with the disorder are aware they have it. Even among those who are aware, the average age of diagnosis is 47, often after a patient has had a heart attack or other major cardiac event.

Could an algorithm succeed where doctors and guidelines have failed? That’s what the Stanford researchers are hoping to find out. Their work is part of a program run by the FH Foundation, a nonprofit group funding and promoting research on the disease. Cardiologists and genetics experts are excited about the research’s potential, though once the researchers successfully identify people with undiagnosed FH, patients could get caught up in a national controversy over drug prices.

Shah’s and Knowles’s branch of the project uses electronic health records. They started by identifying about 120 people known to have FH (true positives) from Stanford’s network of hospitals and doctors’ offices, and some people with high LDL who don’t have the genetic disorder (true negatives). Shah then began to train a computer to spot people with FH by letting it look through those patients’ files and identify patterns in things like cholesterol levels, age, and the medicine patients were prescribed. The researchers then deployed this algorithm to look for undiagnosed FH within Stanford’s health records.
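
To make the idea of learning from labeled examples concrete, here is a deliberately tiny Python sketch. The article does not say which learning method the Stanford team used; a one-feature cutoff on LDL stands in here purely for illustration, and every number in it is made up rather than drawn from patient records. The function name learn_threshold is likewise hypothetical.

```python
def learn_threshold(records):
    """Pick the LDL cutoff that best separates labeled examples.

    records: list of (ldl_mg_dl, has_fh) pairs with known labels.
    Returns (threshold, accuracy) for the rule "flag if LDL >= threshold".
    """
    values = sorted({ldl for ldl, _ in records})
    # Candidate cutoffs sit halfway between neighboring observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def accuracy(threshold):
        # Count records where the rule's prediction matches the known label.
        correct = sum((ldl >= threshold) == has_fh for ldl, has_fh in records)
        return correct / len(records)

    best = max(candidates, key=accuracy)
    return best, accuracy(best)


# Made-up training data (NOT real patient values): known FH cases and
# people with high LDL who do not have the disorder.
training = [
    (240, True), (245, True), (260, True), (285, True), (310, True),
    (170, False), (195, False), (210, False), (255, False),
]

threshold, acc = learn_threshold(training)

# Apply the learned rule to new, unlabeled (also made-up) LDL readings.
new_patients = [180, 230, 300]
flagged = [ldl for ldl in new_patients if ldl >= threshold]
print(threshold, round(acc, 2), flagged)
```

Even on this toy data the learned rule misclassifies one known non-FH patient, a reminder that a real system would need many more features (age, prescriptions and so on) and far more examples.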

For another branch of the program, the FH Foundation acquired the medical billing and lab data of 89 million Americans with cardiovascular disease. Like the Stanford effort, it is trying to design an algorithm that can identify people who should be screened, but this time on a national scale. Researchers recently completed an early version of the algorithm, generating a map of where they think the incidence of FH is highest.

Both sets of researchers, however, are up against the false positive paradox. Even with a very good screening test, or in this case an algorithm, false positives can be much higher than true positives for a rare disease in a big population.
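
A back-of-the-envelope calculation shows how the paradox arises. The Python sketch below uses illustrative assumptions, not figures from either research team: a population of 1 million, a prevalence of 1 in 250, and a hypothetical screen that catches 90 percent of true cases while wrongly flagging 5 percent of people who don’t have the disorder.

```python
def screening_counts(population, prevalence, sensitivity, false_positive_rate):
    """Expected true and false positives when screening a population."""
    affected = population * prevalence
    unaffected = population - affected
    true_positives = affected * sensitivity
    false_positives = unaffected * false_positive_rate
    return true_positives, false_positives


# All four inputs are assumptions chosen for illustration.
tp, fp = screening_counts(
    population=1_000_000,
    prevalence=1 / 250,
    sensitivity=0.90,
    false_positive_rate=0.05,
)

# Share of flagged people who actually have the disorder.
ppv = tp / (tp + fp)
print(f"true positives:  {tp:,.0f}")
print(f"false positives: {fp:,.0f}")
print(f"share of flags that are real cases: {ppv:.1%}")
```

Under these assumptions the screen would flag about 3,600 real cases alongside roughly 49,800 false alarms, so only about 7 percent of the people it flags would actually have the disorder.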

Kelly Myers, chief technology officer for the foundation, says they have decided to let the billing-data algorithm be more sensitive and less precise, which means allowing more false positives. He estimates that the algorithm’s current F1 score, a measure that balances precision against sensitivity, is about 0.6 (on a balanced data set, guessing at the flip of a coin would score 0.5). The rationale is that the people flagged by the algorithm will mostly have very high, uncontrolled LDL, even if they don’t turn out to have FH, so there isn’t a huge drawback (and there may even be a benefit) to flagging these patients for doctors.
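
For readers who want to see how such a score is built, the F1 score is conventionally defined as the harmonic mean of precision (the share of flagged people who are real cases) and recall, also called sensitivity (the share of real cases that get flagged). The Python sketch below uses hypothetical counts chosen to land near 0.6; they are not the foundation’s data.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw counts.

    tp: true positives, fp: false positives, fn: false negatives (missed cases).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for a screen tuned to favor sensitivity over precision.
precision, recall, f1 = precision_recall_f1(tp=80, fp=100, fn=10)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this made-up case the screen finds about 89 percent of real cases, but fewer than half of its flags are correct, and the F1 score comes out at about 0.59. That is the kind of trade-off Myers describes: more false positives in exchange for fewer missed patients.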

Seth Martin, a cardiologist at Johns Hopkins who isn’t involved in the project, says he would be happy to receive that kind of information. (Martin isn’t directly associated with the FH Foundation, though he is part of a group at Hopkins that recently joined the foundation’s efforts to build an FH registry.) He said that he went all the way through medical school without learning much about the genetic disorder; it wasn’t until his cardiology fellowship at Hopkins that he’d really heard of it. “There’s a lot of people out there, clinicians and patients, who this is going to be really staggering for,” Martin said, “to realize how common this is and how much of an impact it’s having on the health of our nation.”

There are all sorts of screening standards in the U.S. for both kids and adults that are supposed to catch people with FH, but they are rarely followed by doctors. Because FH is autosomal dominant — meaning that a parent with the disease has a 50 percent chance of passing it on to her children — screening family members of anyone diagnosed with the disorder has been one of the most useful and important ways of catching cases in other countries. (“We never find an individual with FH; we find a family with FH,” Knowles told me.) The problem is that it doesn’t regularly happen.

Even when the guidelines are followed, screening comes with its own set of complications. In the U.S., diagnosis is usually made through testing for abnormal lipid levels, since many parts of the country aren’t set up for appropriate genetic testing, which involves counseling and other ancillary services. Even if we could do genetic tests, we probably don’t know all of the genetic abnormalities that can lead to FH, said Muin Khoury, director of the Office of Public Health Genomics at the Centers for Disease Control and Prevention. “There are so many mutations and several genes, so epidemiologically, it’s really a challenge,” Khoury said. He added that adherence to screening guidelines has been so poor that we don’t even know how many people have the disease. Estimates put the number somewhere between 1 in 100 Americans and 1 in 500. Until we get better at screening, determining the actual number will be difficult.

Myers is careful to make clear that the algorithm they’re using is still in development, and feedback from initial results will go a long way to improving it. “We’re proud of our work, but we also know that several months from now we’ll be embarrassed by [this map] because the algorithm will be so much better then,” Myers said.

Still, it did accurately identify several known regions with high incidence of FH, including Amish communities in central and southern Pennsylvania and French Canadians in Louisiana. If the initiative works in the long term to identify people with FH, it could have big implications, helping them receive earlier and better treatment.

The foundation’s plan is to flag doctors who have a large number of patients who make good screening candidates and contact them. The patient data the foundation received doesn’t include personal information, so the researchers can’t see who the people are, but they can alert doctors that they have a large number of patients who should be screened for FH.

The CDC’s Khoury says the foundation’s efforts are one of the best examples of using genomics research to improve health at a population level. Genomics is most often thought of in the realm of precision medicine, which looks at a person’s genes, environment and lifestyle to try to understand disease in the context of her unique risk and experiences. In his work at the CDC, Khoury spends a lot of time thinking about how individual genetic markers can be used to improve health on a broader scale. At the moment, this often involves using data and registries to target resources at screening people who are at high risk for genetic diseases, similar to what the FH Foundation is doing.

There has been some criticism of the FH Foundation’s work. Because much of its funding comes from pharmaceutical companies with expensive and controversial new cholesterol drugs on the market, questions have been raised about whether pharma’s investment in the research to find FH patients is really a hunt for new customers.

Two drugs, Praluent from Sanofi and Repatha from Amgen, were recently approved by the U.S. Food and Drug Administration for use by people with FH or a history of strokes or heart attacks who can’t get their LDL levels low enough with previously existing drugs. People involved in early trials of these preventive drugs, including some patients with FH, saw their LDL levels drop by previously unheard-of amounts, about 50 percent to 70 percent. That compares with 25 percent to 55 percent for statins, the drugs most commonly used to treat high cholesterol.

Enormous price tags have tempered excitement, however, with list prices for the drugs topping $14,000 per year, compared with hundreds of dollars for statins. The cost means new customers are a big win for the pharmaceutical industry, but it complicates the question of who will pay for the drugs. When using pharmaceuticals for prevention, a lot of people take a drug while only a small number see a benefit. But with prices so high, insurance companies won’t be able to cover the cost of these drugs without raising the cost of insurance for everyone. (Another criticism: the FDA approved Praluent and Repatha based solely on data that shows they reduce LDL levels, a proxy for what doctors and patients really care about, which is reduced incidence of heart disease. We won’t know whether these new drugs reduce heart disease until preliminary studies become available later this year at the earliest.)

The FH Foundation’s founder, Katherine Wilemon, who has FH herself, said that pharmaceutical companies have been the only ones willing to fund the organization’s work to date but that the foundation is working on writing grant proposals to other funders. The organization’s reason for being, she said, is to help people with FH notice the disease early, partly in the hope that they won’t need these new medicines at all.

Knowles, chief medical officer for the FH Foundation, agreed that early diagnosis could keep the cost of care down. “The vast majority of patients will be able to get control of their disease with existing medicines,” in coordination with diet and exercise, he said. He acknowledged that finding more people with FH will be a benefit for doctors and drug companies, but said the people who will benefit most are patients and their families.

CORRECTION (Jan. 14, 5:41 p.m.): An earlier version of this story gave an incorrect date for the release of preliminary studies on the effects of Repatha on reducing heart disease. Amgen, the maker of Repatha, says it will release the information later this year, not in 2017.