An international team of scientists from eleven institutions in the U.S. has produced a first draft of the ‘tree of life’ for 2.34 million species of animals, plants, fungi and microbes.

Thousands of trees that describe the evolutionary history of animals, plants, and microbes have been published over the years. The new study, published in the Proceedings of the National Academy of Sciences, is the first to apply an efficient and automated process for assembling small trees into a complete ‘tree of life.’

The tree depicts the relationships among living things as they diverged from one another over time, tracing back to the beginning of life on Earth more than 3.5 billion years ago.

“This is the first real attempt to connect the dots and put it all together. Think of it as version 1.0,” said senior author Dr Karen Cranston of Duke University.

Rather than build the tree from scratch, Dr Cranston and co-authors pieced it together by compiling thousands of smaller chunks that had already been published online.

The initial draft is based on 484 smaller trees from previously published studies.

“Many participants on the project contributed hundreds of hours tracking down and cleaning up thousands of trees from the literature, then selecting 484 of them that were used to generate the draft tree of life,” said lead author Dr Cody Hinchliff of the University of Michigan.

In mapping trees from different sources to the branches and twigs of a single supertree, one of the biggest challenges was simply accounting for the name changes, alternate names, common misspellings and abbreviations for each species.

“A survey of more than 7,500 phylogenetic studies published between 2000 and 2012 in more than 100 journals found that only one out of six studies had deposited their data in a digital, downloadable format that the researchers could use,” the scientists said.

“The vast majority of evolutionary trees are published as PDFs and other image files that are impossible to enter into a database or merge with other trees.”

Dr Cranston said: “There’s a pretty big gap between the sum of what scientists know about how living things are related, and what’s actually available digitally.”

As a result, the relationships depicted in some parts of the tree, such as the branches representing the pea and sunflower families, don’t always agree with expert opinion. Other parts of the tree, particularly insects and microbes, remain elusive.

That’s because even the most popular online archive of raw genetic sequences contains DNA data for less than 5% of the tens of millions of species estimated to exist on Earth.

To help fill in the gaps, Dr Cranston and co-authors are also developing software that will enable scientists to update and revise the tree as new data come in for the millions of species still being named or discovered.

“It’s by no means finished. It’s critically important to share data for already-published and newly-published work if we want to improve the tree,” Dr Cranston said.

“Twenty-five years ago people said this goal of huge trees was impossible. The Open Tree of Life is an important starting point that other investigators can now refine and improve for decades to come,” said co-author Dr Douglas Soltis of the University of Florida.

“This is just the beginning. While the tree of life is interesting in its own right, our database of thousands of curated trees is an even more useful resource,” added co-author Dr Stephen Smith of the University of Michigan.

“We hope that this publication will encourage other researchers to contribute their own studies or to enter information from previously published sources.”

The current version of the Open Tree of Life is available to browse and download at https://tree.opentreeoflife.org.

_____

Cody E. Hinchliff et al. Synthesis of phylogeny and taxonomy into a comprehensive tree of life. PNAS, published online September 18, 2015; doi: 10.1073/pnas.1423041112