Here’s the ‘open-source’ genealogy DNA website that helped crack the Golden State Killer case

While frustrated California detectives looked fruitlessly for a man accused of brutally raping 45 women and killing a dozen more people, one of his relatives was likely busy seeking information to unite a big family tree.

Their colliding searches led to the momentous arrest Tuesday of Joseph James DeAngelo, 72, the suspected Golden State Killer who eluded law enforcement for four decades.

The bulk of the information detectives used to help capture DeAngelo came from a no-frills “open-source” genealogy website called GEDmatch that allows users to voluntarily share their genetic profiles for free.

The Florida-based website, which pools raw genetic profiles that people share publicly to find long-lost relatives, was the team’s biggest tool, said lead investigator Paul Holes, a cold case expert and retired Contra Costa County District Attorney inspector who has spent about seven years using such websites to identify the California killer.

The website revealed a “distant relative” early this year that helped lead authorities to DeAngelo, who was arrested at his Citrus Heights home on suspicion of conducting a reign of terror up and down California, including 12 homicides, 45 rapes and more than 100 residential burglaries between 1976 and 1986, investigators said.

That information made it possible for police to focus their search, supplemented by other clues, like ethnicity, height and where he lived at the time. Then they conducted surveillance and retrieved new evidence (such as saliva on a restaurant dish or a discarded beer can, cigarette or Kleenex) for a more direct DNA match, leading to his arrest.

The case sheds light on a little-known fact: Even if we’ve never spit into a test tube, some of our genetic information may be public and accessible to law enforcement. That’s because whenever one of our relatives (even distant, distant kin) submits their DNA to a public site hoping to find far-flung relations, some of our data is shared as well.

“When you put your information into a database voluntarily, and law enforcement has access to it, you may be unwittingly exposing your relatives — some you know, some you don’t know — to scrutiny by law enforcement. Even though they may have done nothing wrong,” said Andrea Roth, assistant professor of law at UC Berkeley Boalt School of Law and an expert on the use of forensic science in criminal trials.

And even companies that promise privacy may be required to surrender your genetic data if confronted with a search warrant.

“As more and more genetic information becomes available, its possible use in criminal investigations becomes more and more feasible, both for direct checking and familial checking,” said Stanford University School of Law professor and bioethicist Hank Greely, director of Stanford’s Center for Law and the Biosciences. “That’s true whether it’s a genealogy database, electronic health record or research.”

Holes began his search six years ago on Ysearch.org, a website similar to GEDmatch. Last March, Holes narrowed it down to an elderly man in Oregon, but he knew it was a “weak match,” he said.

Court records obtained by the Associated Press show that in March 2017 investigators in Clackamas County, Oregon, persuaded a judge to order the 73-year-old man to provide a DNA sample. The documents said they used a genetic profile based on DNA from crime scenes linked to the serial killer and compared it to information from genealogical websites. The man turned out to be innocent.

“He was cooperative,” Holes said. “We generated the legal documentation more as a matter of routine, due diligence … But he was willing all along to provide his DNA.

“I knew it was a weak match. After six years it was our only match,” he said, adding that the Golden State Killer and the Oregon man shared an unusual genetic trait.

After that failure, Holes said he “recalibrated” and used his own family member’s DNA profile from Ancestry to start learning how to use GEDmatch. He was confident the new technology would provide answers, and he visited the agencies touched by the serial killer’s crimes to convince them to provide some DNA for his new tactic. Ventura County offered some DNA, but to submit DeAngelo’s data to GEDmatch, investigators would have to put it in a format that would be recognized by the website and upload it to the site for analysis, said CeCe Moore, a genetic genealogist who has used the site in thousands of adoption cases.

The site returns a list of relatives — with names and emails — by degree of shared single nucleotide polymorphisms, or SNPs, which offer a genetic fingerprint. On average, you share 50 percent of your SNPs with a sibling, 25 percent with a half sibling, 12.5 percent with a first cousin, and 3.1 percent with a second cousin. The numbers diminish after that.
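The percentages above follow from simple pedigree arithmetic: each generation between two relatives and a shared ancestor halves the expected amount of DNA in common. The short Python sketch below reproduces the figures quoted above and extends them to more distant cousins. It is an illustration under textbook assumptions only (unrelated ancestors, no inbreeding); the function names are hypothetical, and it does not describe how GEDmatch actually computes matches.

```python
def expected_shared_fraction(meioses: int, common_ancestors: int = 2) -> float:
    """Expected fraction of DNA two relatives share.

    meioses: total parent-to-child steps on the path linking the two
             people through one shared ancestor (2 for siblings).
    common_ancestors: 2 for full relationships (a shared couple),
                      1 for half relationships (one shared parent).
    """
    return common_ancestors * 0.5 ** meioses


def cousin_fraction(degree: int) -> float:
    """Expected sharing for full cousins of a given degree (1 = first cousin)."""
    # An nth cousin is n + 1 generations below the shared couple on each side.
    return expected_shared_fraction(2 * (degree + 1), common_ancestors=2)


if __name__ == "__main__":
    print(f"full sibling:  {expected_shared_fraction(2, 2):.1%}")
    print(f"half sibling:  {expected_shared_fraction(2, 1):.1%}")
    for degree in range(1, 5):
        print(f"cousin (degree {degree}): {cousin_fraction(degree):.2%}")
```

By the same arithmetic, third and fourth cousins would be expected to share well under 1 percent of their DNA. These are averages, and actual sharing between any two relatives varies.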

“They had luck, and they had skill,” said Moore, who owns a company called The DNA Detectives.

The hits on DeAngelo came from third and fourth cousins, Holes said. They were family members directly related to DeAngelo’s great-great-great-great-grandfather, dating back to the 1800s, when families would often have 15 kids. Holes and his team built out more than 25 different family trees. The tree that eventually linked to the Golden State Killer was so large, it contained about 1,000 people, he said.

“I’m not familiar with any other case used to capture someone like this,” Holes said.

California law limits such “familial searching” using the state’s criminal DNA database. And major companies, such as 23andMe and Ancestry.com, require a court order for law enforcement to access DNA files stored in their databases.

But there are no legal restrictions against accessing the 900,000 DNA files on the public GEDmatch database, created when consumers take information from commercial testing companies, such as 23andMe, then upload it to the volunteer-run database.

That openness is what makes the site so useful, said Moore. Because so many people contribute data, vast family trees can be built.

“The majority of companies do a really good job of protecting the genetic privacy of customers,” said Moore. “But if a company allows the uploading of files from other sources, that opens the door up for other types of uses.”

On Friday, GEDmatch operator Curtis Rogers released a statement saying his company was unaware of the search for the East Area Rapist using his site.

“We understand that the GEDmatch database was used to help identify the Golden State Killer,” he said. “Although we were not approached by law enforcement or anyone else about this case or about the DNA, it has always been GEDmatch’s policy to inform users that the database could be used for other uses, as set forth in the Site Policy … While the database was created for genealogical research, it is important that GEDmatch participants understand the possible uses of their DNA, including identification of relatives that have committed crimes or were victims of crimes.”

Rogers went on to say that anyone concerned that their profile might be used for “non-genealogical uses” should remove it from the site or not upload it in the first place. He provided an email address for requesting that profiles be deleted.

The site does not specifically address law enforcement searches, but it stresses that it “cannot guarantee that your information will never be accessed by unintended means.”

One of the earliest uses of this approach helped detectives reopen the killing of a 20-year-old woman in Wales. In 2002, data from a 14-year-old boy in Britain’s database of offenders led authorities to his uncle. In 2010, a DNA sample from the son of California’s “Grim Sleeper” serial killer Lonnie David Franklin Jr. led to his arrest and conviction. The son had been jailed on weapons charges.

The East Area Rapist/Golden State Killer case shows the promise and potential peril of DNA as a law enforcement tool, said UC Berkeley’s Roth.

“The promise is that it’s a wonderful thing that police were able to find this person,” she said. “And the potential concern is that the government is looking at a lot of genetic information that most of us probably want to keep private.

“And even though it is easy to think of this technology as something that is used just to track down serial killers,” she said, “if we allow the government to use it with no accountability or no further safeguards, then all of our genetic information might be at risk for being used for things we don’t want it to be used for.”

Staff writer Julia Prodis Sulek contributed to this report.
