
The science involves a search for third cousins. To identify a person through a DNA sample, an investigator uploads a previously analyzed genetic sequence to a database. The goal is to find someone who shares enough DNA to place them in the third cousin or closer range. Most of us have at least 800 people out there, somewhere in the world, who fall into this category. So long as one of these people is in a database, a skilled sleuth may be able to use other publicly available information to start building a family tree and figure out the person’s actual identity.

That technique has been used in recent months to identify more than 15 suspects in murder and sexual assault cases. The breakthroughs began in April with an arrest in the case of the Golden State Killer, who terrorized California with rapes and murders in the ’70s and ’80s. Other successes soon followed. A truck driver in Washington State was charged with the 1987 murder of a Canadian couple; a DJ in Pennsylvania was charged with the 1992 murder of a teacher.

Watching these developments, Dr. Erlich wondered about the odds of identifying any given person through cousins’ DNA in one of these databases.
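A rough sense of those odds can be had from a back-of-the-envelope calculation. The sketch below is not Dr. Erlich's method. It assumes, purely for illustration, that each of a person's roughly 800 third-cousin-or-closer relatives has the same independent chance of being in a database. The function name `chance_of_match` and the coverage values are hypothetical.

```python
def chance_of_match(relatives: int, coverage: float) -> float:
    """Chance that at least one relative appears in a database.

    Illustrative assumption: each relative is in the database
    independently with probability `coverage` (the share of the
    relevant population that has a profile on file).
    """
    if relatives < 0:
        raise ValueError("relatives must be non-negative")
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    # P(at least one match) = 1 - P(no relative is in the database)
    return 1.0 - (1.0 - coverage) ** relatives


if __name__ == "__main__":
    # 800 is the article's figure; the coverage values are made up.
    for coverage in (0.001, 0.005, 0.02):
        odds = chance_of_match(800, coverage)
        print(f"coverage {coverage:.1%}: chance of a match {odds:.0%}")
```

Under these simplified assumptions, even a database covering a small share of the population gives substantial odds of turning up at least one relative, because the chance of missing all 800 shrinks quickly as coverage grows.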

His analysis is based not on the big genealogy databases such as 23andMe and Ancestry, but on two of the smaller ones: GEDmatch, which has around one million profiles, and MyHeritage, which had around 1.5 million at the time of the study. That’s because, for legal and logistical reasons, the larger sites cannot be easily used to identify anyone other than customers who mail in saliva.

But the smaller sites, set up to help genealogists maximize the odds of finding relatives, are more flexible. GEDmatch allows law-enforcement officials to search for genetic matches in its database in murder and sexual assault cases. MyHeritage does not, but it permits uploads from external labs. With both, it’s hard to be sure what’s being uploaded: grandma’s saliva, crime scene blood, a sample from a medical study or something else entirely.