Generally when cleaning out one’s freezer, it is advisable to get rid of any unidentifiable objects. Not so if one is a microbiologist. In that case, it is advisable to carefully label the specimen with as much information as possible about the environment from which it was collected.

It’s a good thing that the world's microbiologists had freezers full of such specimens, because in 2010 the Earth Microbiome Project sent out a call (I think they shined a micrograph of a Staphylococcus aureus on the clouds or something). The call was for everyone to send in said specimens for a global analysis. And microbiologists from all seven continents, spanning forty-three countries and seventeen different environments, did just that.

The composition of microbial communities from environments ranging from the Sargasso Sea to our guts has already been studied. Trouble is, each sample type and region has been studied in isolation, making it difficult to extrapolate general rules or patterns as to what may dictate the composition of each community. Findings have been reported on the effects of temperature, pH, salinity, oxygen levels, and even day length on microbial community composition, but they cannot be globally applied because the samples were analyzed by different people at different times in different places in different ways.

The Earth Microbiome Project, launched in 2010, tried to impose order on the chaos of different studies by subjecting all samples to the same protocol. It gathered 27,751 samples of bacteria and archaea from ninety-seven studies and just published a meta-analysis of its archive in Nature. The team standardized the way the samples were collected and the way the DNA was extracted and transported, and they sequenced all of the DNA and performed the data analysis in the same lab. The data trove they assembled can be used as a resource, and their standardized protocols should help ensure that newly generated data can be easily incorporated.

Studies delineating the composition of microbial communities have traditionally done so by sequencing a gene that contributes to the production of proteins (it codes for the 16S rRNA). The gene is present in all bacterial and archaeal species but has some regions that tolerate variation. The sequence of those regions has been used as a synecdoche, indicating the presence of a particular species in a sample.

The Earth Microbiome team used these 16S rRNA sequences, about a third of which had never been seen before, to determine that microbes do indeed cluster by environment. Samples collected in each type of environment, no matter where that environment is on the globe, are more similar to each other than to those collected from a different environment. That held true regardless of who collected the sample and how.
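To make "more similar to each other" concrete, here is a minimal sketch of one way to compare samples. It assumes each sample can be reduced to the set of marker sequences found in it and scores overlap with the Jaccard index. The sample names, sequence labels, and the choice of Jaccard are all illustrative assumptions, not the project's actual data or metric.

```python
from itertools import combinations

# Hypothetical toy data: sample name -> (environment, set of sequence labels).
samples = {
    "soil_A": ("soil", {"s1", "s2", "s3", "s4"}),
    "soil_B": ("soil", {"s2", "s3", "s4", "s5"}),
    "gut_A": ("gut", {"g1", "g2", "g3", "s1"}),
    "gut_B": ("gut", {"g1", "g2", "g4"}),
}


def jaccard(a, b):
    """Shared sequences divided by all sequences seen in either sample."""
    return len(a & b) / len(a | b)


def mean_similarity(samples, same_environment):
    """Average Jaccard score over sample pairs that do (or do not) share an environment."""
    scores = [
        jaccard(seqs_a, seqs_b)
        for (env_a, seqs_a), (env_b, seqs_b) in combinations(samples.values(), 2)
        if (env_a == env_b) == same_environment
    ]
    return sum(scores) / len(scores)


within = mean_similarity(samples, same_environment=True)
between = mean_similarity(samples, same_environment=False)
print(f"within environments: {within:.2f}, between environments: {between:.2f}")
```

In this invented example the within-environment average comes out higher than the between-environment one, which is the pattern the article describes. Real analyses would work with sequence abundances across thousands of samples rather than four small sets.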

The team divided the samples roughly into host-associated or not (free-living); host-associated microbes were then split into those that thrived on animals or on plants, and free-living microbes were split into those that preferred saline environments or non-saline. Further gradations were made from there: animal gut vs. skin, non-saline water vs. soil, etc.
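The nested splits above can be pictured as a small tree. The sketch below is only an illustration built from the categories named in this article; the dictionary layout and the `leaf_paths` helper are hypothetical, and branches the article does not subdivide are left empty rather than guessed at.

```python
# Hypothetical nested dictionary mirroring the splits described in the article.
# Empty dicts mark branches whose finer categories the article does not list.
ENVIRONMENT_TREE = {
    "host-associated": {
        "animal": {"gut": {}, "skin": {}},
        "plant": {},
    },
    "free-living": {
        "saline": {},
        "non-saline": {"water": {}, "soil": {}},
    },
}


def leaf_paths(tree, prefix=()):
    """Yield every root-to-leaf path in the tree as a tuple of labels."""
    for label, subtree in tree.items():
        path = prefix + (label,)
        if subtree:
            yield from leaf_paths(subtree, path)
        else:
            yield path


for path in leaf_paths(ENVIRONMENT_TREE):
    print(" > ".join(path))
```

Each printed line is one route from the top-level split down to the finest category mentioned here, such as host-associated, then animal, then gut.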

This database will allow researchers to make a good guess as to where a given microbe might live given its sequence, which can be useful in forensics. It can also be used to test ecological principles; it already debunked the idea that microbial richness (the number of different species in a given environment) increases with temperature, showing instead that richness peaks within a fairly narrow temperature range and then drops.
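As a rough illustration of what a richness-versus-temperature check involves, the sketch below counts the distinct sequence labels in each sample and reports the temperature of the richest one. Every number and label is invented for the example and says nothing about where the study actually found the peak.

```python
# Made-up samples: (temperature in degrees C, sequence labels observed).
# Repeated labels stand for the same type being seen more than once.
samples = [
    (2, ["a", "b", "a"]),
    (10, ["a", "b", "c", "d", "b"]),
    (25, ["a", "c"]),
    (40, ["e"]),
]


def richness(observed):
    """Richness is the number of distinct types, ignoring how often each appears."""
    return len(set(observed))


by_temperature = [(temp, richness(observed)) for temp, observed in samples]
peak_temperature, peak_richness = max(by_temperature, key=lambda pair: pair[1])

for temp, count in by_temperature:
    print(f"{temp:>3} C: richness {count}")
print(f"richest toy sample is at {peak_temperature} C")
```

A steady increase with temperature would show richness climbing from the first row to the last; a peak followed by a drop, as in this toy table, is the shape the article says the real data showed.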

There is a downside to treating all the samples the same way: not all microbes tolerate it, and those that don't end up left out of the survey. Results are thus skewed toward including only those species that can tolerate being handled in this manner. But the researchers decided that consistency is valuable enough that the sacrifice is worth it. And yes, they did make trading cards. But since you can download them all, there's no point in actually trading them.

Nature, 2017. DOI: 10.1038/nature24621.