Pharmaceutical giant AstraZeneca today unveiled a ten-year project to sequence the genomes of two million people, backed by top-tier medical and tech partnerships and a nine-figure price tag. The resulting database would be the largest of its kind, and would be used to hunt for rare genes that may contribute to diseases.

AstraZeneca is dipping into its own stores for the first big source of data: 500,000 patients have participated in its clinical trials, and their biological materials will furnish the first quarter of the two-million goal.

The actual sequencing will be done by Human Longevity Inc, a genomics company started by Craig Venter that has raised $300 million since 2014. HLI already has about 26,000 genomes, and has stated that it plans to get to a million by 2020.

The University of Helsinki will contribute genomes as well as medical records from Finland’s geographically isolated population. The idea is that interesting or rare genes shuffled or lost over centuries of humanity’s travel and intermarriage may still be present there.

Scientific teams that will sift through the reams of data for correlations and drug targets will be established at AstraZeneca itself as well as at the Wellcome Trust’s Sanger Institute in Hinxton, U.K.

Although the full cost of the endeavor was not disclosed, AstraZeneca’s EVP of innovative medicines told Nature that it was in the “hundreds of millions,” and Science reported Venter’s comment that the deal had taken more than a year of negotiation to settle.