In freshwater lakes, microbes regulate the flow of carbon and determine whether the bodies of water serve as carbon sinks or carbon sources. Algae and cyanobacteria in particular can trap and use carbon, but their capacity to do so may be affected by viruses. Viruses exist alongside all bacteria, usually in a 10-fold excess, and range in size from giant viruses to much smaller viruses known as virophages (which live in giant viruses and use their machinery to replicate and spread). Virophages can change the way a giant virus interacts with its host eukaryotic cell. For example, if algae are co-infected by a virophage and a giant virus, the virophage limits the giant virus's ability to replicate efficiently. This reduces the giant virus's impact on the diversion of nutrients, allowing the host algae to multiply, which could lead to more frequent algal blooms.

Using metagenome data sets collected over several years in northern freshwater lakes, a team led by researchers at The Ohio State University and the U.S. Department of Energy Joint Genome Institute (DOE JGI), a DOE Office of Science User Facility, uncovered 25 novel sequences of virophages. Reported October 11, 2017, in Nature Communications, the identification of these novel sequences effectively doubles the number of virophages known since their discovery a decade ago.

"Usually metagenome data sets are one-offs," said DOE JGI scientist and first author Simon Roux. "People had started to see virophages in metagenomes, but no one had a long time-series until now. Was it here once? Always? We never really knew this, but it's a critical piece of information to understand their importance."

The work stemmed from a Community Science Program (CSP) proposal involving northern freshwater lakes by KT (Trina) McMahon of the University of Wisconsin-Madison. Samples of microbial communities in Lake Mendota and Trout Bog Lake were collected regularly over several years as part of the North Temperate Lakes Long Term Ecological Research (NTL-LTER) project, funded by the National Science Foundation (NSF). Sequencing and analyzing the metagenomes from the 3-year and 5-year time series is allowing researchers to identify the community members, determine their metabolic pathways, and follow changes in the communities over several years.

Beyond looking at the microbial communities, McMahon and Rex Malmstrom, head of the DOE JGI Micro-Scale Applications group, asked collaborator Matt Sullivan at The Ohio State University if he'd be interested in using the same metagenomic data sets to look at the lakes' viral ecology. Roux started mining the data sets while still a postdoctoral fellow with the Sullivan lab. "I knew there were lots of viruses in the sequence data, but not that some of the viruses were themselves hosts to other viruses," said Malmstrom. "With time series data we could do more than assemble genomes and build phylogenetic trees; the data allowed us to examine genetic variation within populations and look for co-occurrence and abundance patterns between virophages and their giant virus hosts. With so many time points in the data set, you can find strong connections."

Trina McMahon, whose CSP data sets were the basis of this work, says having the viral ecology information helps form a more complete picture of the ecosystem. "We are thrilled to have one more piece of the puzzle. Viruses are clearly playing a major role in shaping community composition, and therefore function, of the whole lake ecosystem. My own lab lacks the expertise to tackle viruses alone, hence the collaboration with Simon and Matt Sullivan is so important. Our long-term goal is to learn enough about the forces controlling community assembly and dynamics, as well as the ecological traits of each lineage, in order to create more predictive models about how freshwater lakes will respond to climate and land-use change, at an ecosystem scale."

Aside from doubling the number of virophages in public databases, the time series allowed Roux and his colleagues to see the viruses' ecological profiles, that is, whether factors such as the seasons or the abundance of particular microbes influenced their presence. Through co-occurrence analysis, the researchers associated the virophages with sequences of known lineages of giant viruses, and proposed the existence of three new groups of candidate giant viruses infected by virophages. These co-occurrence analyses also allowed them to find putative associations between the giant virus sequences and specific eukaryotic hosts.
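To make the idea of co-occurrence analysis concrete, here is a minimal Python sketch. It is not the team's actual pipeline, which is not described in this article. It assumes that abundance has been measured for each virus at the same series of time points, and it uses a plain Pearson correlation with an arbitrary cutoff of 0.8 to flag candidate virophage and giant virus pairs. All names and numbers are hypothetical toy values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long abundance series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no co-occurrence signal
    return cov / sqrt(var_x * var_y)


def co_occurring_pairs(virophages, giant_viruses, threshold=0.8):
    """Return (virophage, giant virus, r) for pairs with r >= threshold.

    The threshold is an illustrative assumption, not a published value.
    """
    pairs = []
    for v_name, v_series in virophages.items():
        for g_name, g_series in giant_viruses.items():
            r = pearson(v_series, g_series)
            if r >= threshold:
                pairs.append((v_name, g_name, round(r, 3)))
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical abundances at six sampling dates (toy numbers, not real data).
virophages = {
    "virophage_A": [0, 2, 5, 1, 0, 3],
    "virophage_B": [4, 4, 0, 0, 4, 4],
}
giant_viruses = {
    "giant_virus_X": [1, 4, 10, 2, 0, 6],
    "giant_virus_Y": [9, 1, 0, 8, 2, 1],
}

for v_name, g_name, r in co_occurring_pairs(virophages, giant_viruses):
    print(f"{v_name} tracks {g_name} (r = {r})")
```

In this toy example only virophage_A and giant_virus_X rise and fall together, so they are the only pair reported. A strong correlation of this kind suggests an association worth testing; it does not by itself show that one virus infects the other.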

"These findings are correlation-based," noted Roux, "but it's a good example of a metagenomics use case. Metagenomes helped us not only discover new viral diversity and determine what it should do in the ecosystem, but it helps us design hypothesis and follow-up experiments about virus-host interactions so we're not just throwing out a wide net blindly."