By Toni Clarke and Sharon Begley

WASHINGTON (Reuters) - The United States has proposed analyzing genetic information from more than 1 million American volunteers as part of a new initiative to understand human disease and develop medicines targeted to an individual's genetic make-up.

At the heart of the initiative, to be announced on Friday by President Barack Obama, is the creation of a pool of people - healthy and ill, men and women, old and young - who would be studied to learn how genetic variants affect health and disease.

Officials hope to use genetic data from several hundred thousand participants in ongoing genetic studies and to recruit other volunteers to reach the 1 million total.

The near-term goal is to create more and better treatments for cancer, Dr. Francis Collins, director of the National Institutes of Health (NIH), told reporters on a conference call on Thursday. Longer term, he said, the project would provide information on how to individualize treatment for a range of diseases.

The initial focus on cancer, he said, is due partly to the lethality of the disease and partly to the significant advances that targeted medicine, also known as precision medicine, has made in cancer, although much more work is needed.

The president has proposed $215 million in his 2016 budget for the initiative. Of that, $130 million would go to the NIH to fund the research cohort and $70 million to NIH's National Cancer Institute to intensify efforts to identify molecular drivers of cancer and apply that knowledge to drug development.

A further $10 million would go to the Food and Drug Administration to develop databases on which to build an appropriate regulatory structure; $5 million would go to the Office of the National Coordinator for Health Information Technology to develop privacy standards and ensure the secure exchange of data.

The effort may raise alarm bells for privacy rights advocates who in the past have questioned the government's ability to guarantee that DNA information is kept anonymous. They have expressed fear that participants may become identifiable or face discrimination.

SEQUENCING 1 MILLION GENOMES

The funding is not nearly enough to sequence 1 million genomes from scratch. Whole-genome sequencing, though plummeting in price, still costs about $1,000 per genome, Collins said, meaning this component alone would cost $1 billion.

Instead, he said, the national cohort would be assembled from both new volunteers interested in "an opportunity to take part in something historic" and existing cohorts that are already linking genomic data to medical outcomes.

The most ambitious of these is the Million Veteran Program, launched in 2011 by the Department of Veterans Affairs. Aimed at making genomic discoveries and bringing personalized medicine to veterans, it has enrolled more than 300,000 veterans and determined the DNA sequences of about 200,000.

The VA was a pioneer in electronic health records, which it will use to link the genotypes to vets' medical histories.

Academic centers have, with NIH funding, also amassed thousands of genomes and linked them to the risk of disease and other health outcomes. The Electronic Medical Records and Genomics Network, announced by NIH in 2007, aims to combine DNA information on more than 300,000 people and look for connections to diseases as varied as autism, appendicitis, cataracts, diabetes and dementia.

In 2014, Regeneron Pharmaceuticals Inc launched a collaboration with Pennsylvania-based Geisinger Health System to sequence the DNA of 100,000 Geisinger patients and, using their anonymous medical records, look for correlations between genes and disease. The company has finished 50,000 samples, spokeswoman Hala Mirza said.

Perhaps the most audacious effort is by the non-profit Human Longevity Inc, headed by Craig Venter. In 2013 it launched a project to sequence 1 million genomes by 2020. Data from the privately funded project will be made available to pharmaceutical companies such as Roche Holding AG, with which the institute has a research partnership.

"We're happy to work with them to help move the science," Venter said in an interview, referring to the administration's initiative.

But because of the many regulations surrounding medical privacy and human volunteers, he said, "we can't just mingle databases. It sounds like a naive assumption" if the White House expects existing cohorts to merge into its 1-million-genomes project.

Venter raced the government-funded Human Genome Project to a draw in 2000, sequencing the entire human genome using private funding in less time than it took the public effort.

ALTERING THE REGULATORY LANDSCAPE

Collins conceded that mingling the databases would be a challenge but insisted it is doable.

"It is something that can be achieved but obviously there is a lot that needs to be done," he said.

Collating, analyzing and applying all this data to the development of new drugs will require changes to how products are reviewed and approved by health regulators.

Dr. Margaret Hamburg, the FDA's commissioner, said on the conference call that the emerging field of precision medicine "presents a set of new issues for us at FDA." The agency is discussing new ways to approach the review process for personalized medicines and tests, she added.

(Reporting by Toni Clarke in Washington; Editing by Cynthia Osterman)