A tree's growth is dependent on nutrients from the soil and water, as well as the microbes in, on, and around the roots. Similarly, a human's health is shaped both by environmental factors and the body's interactions with the microbiome, particularly in the gut. Genome sequences are critical for characterizing individual microbes and understanding their functional roles. However, previous studies have estimated that only 50 percent of species in the gut microbiome have a sequenced genome, in part because many species have not yet been cultivated for study.

In a study published this week in Nature, researchers from the Department of Energy's (DOE) Lawrence Berkeley National Laboratory (Berkeley Lab), the Gladstone Institutes, and the Chan-Zuckerberg Biohub presented nearly 61,000 microbial genomes that were computationally reconstructed from 3,810 publicly available human gut metagenomes, which are datasets of all the genetic material present in a microbiome sample. The metagenome-assembled genomes (MAGs) included 2,058 previously unknown species, bringing the number of known human gut species to 4,558 and increasing the phylogenetic diversity of sequenced gut bacteria by 50 percent.

A model community for large-scale culturing efforts

This work helps answer the question of why certain microbes have not been cultivated in the lab. Scientists have previously used metagenomics and single-cell genomics to discern the specific metabolic capabilities of uncultured microbes present in environmental samples. "However, many environmental communities are poorly studied, so it's not clear whether or not uncultivated organisms are really uncultivable," said Stephen Nayfach, a scientist in Berkeley Lab's Environmental Genomics and Systems Biology (EGSB) division and the study's first author. "The human gut, in contrast, is intensely studied with many large-scale culturing efforts, which suggests that many of the 'wild,' uncultivated species in the human gut are difficult to culture using current approaches."

By comparing the reconstructed genomes of uncultivated species with those of cultivated species, the team found that uncultivated species' genomes are roughly 20 percent smaller, on average, and are missing numerous pathways for biosynthesis of fatty acids, amino acids, and vitamins. "Genes that are commonly missing from uncultivated gut bacteria may point towards important growth factors that have been overlooked in previous culture-based studies," Nayfach said.

Improving genomic resources for global populations

With the help of a new tool called IGGsearch, the team compared the microbiomes of people with 10 different diseases to those of healthy individuals and found that nearly 40 percent of microbe-disease associations involve a species that did not previously have a genome. "These disease links used to be invisible or hard to detect," said Katie Pollard, a senior investigator at the Gladstone Institutes and Biohub, and contributing author on the study.

One new species in the Negativicutes class, for example, was strongly depleted in people with the spinal inflammatory condition ankylosing spondylitis (AS). "As an AS patient, I am thrilled that we are finally gaining a more complete picture of how the microbiome changes in this disease," she added. Additionally, the team used IGGsearch microbiome profiles to build predictive models for disease and found that prediction accuracy was "significantly improved" compared to existing tools that primarily quantified the abundance of cultivated species.

Pollard, who is also a professor at UC San Francisco, added that, until now, microbiome genomic resources have been particularly sparse for individuals living outside North America, Europe, or China. "By assembling genomes from metagenomes of diverse people, we have helped to close this gap," she said.

Extending technologies across microbiome areas

EGSB senior scientist and team lead Nikos Kyrpides said that several of the computational methods and analyses Nayfach developed for this research are currently being used to enable one of the Joint Genome Institute's (JGI) groundbreaking projects: analyzing a massive collection of other JGI-sequenced MAGs from diverse environments. He added that this type of analysis hinges on several critical factors: the availability of the microbiome data; the availability of the sequence data in public archives; and the lack of any data utilization restrictions from the community, as referenced in a recent Science policy paper on which he and Katie Pollard are co-authors.

For Kyrpides, the collaboration with Gladstone and CZ Biohub allowed his team to demonstrate the far reach of technologies developed across all microbiome areas. "This project is another excellent showcase that aggregated data from multiple studies can enable us to address questions with far-reaching implications that cannot be answered using any individual study alone," he said.