Found in microbial communities around the world, Aspergillus fungi are pathogens, decomposers, and sources of biotechnologically important enzymes. Each Aspergillus species is known to contain more than 250 carbohydrate-active enzymes (CAZymes), which break down plant cell walls and are of interest to Department of Energy (DOE) researchers working on the industrial production of sustainable alternative fuels from candidate bioenergy feedstock crops. Additionally, each fungal species is thought to contain more than 40 secondary metabolites, small molecules with the potential to act as biofuel and chemical intermediates.

In a study published the week of January 8, 2018 in the Proceedings of the National Academy of Sciences, a team led by researchers at the Technical University of Denmark (DTU); the DOE Joint Genome Institute (JGI), a DOE Office of Science User Facility; and the DOE's Joint BioEnergy Institute (JBEI), which is led by Lawrence Berkeley National Laboratory (Berkeley Lab), reports the first results of a long-term plan to sequence, annotate and analyze the genomes of 300 Aspergillus fungi. These findings are a proof of concept for novel methods of functionally annotating genomes in order to identify genes of interest more quickly.

"This is the first outcome from the large-scale sequencing of 300+ Aspergillus species," said study co-author Igor Grigoriev, head of the JGI Fungal Genomics Program. "With the JGI's strategic shift towards functional genomics, this study illustrates several new approaches for functional annotation of genes. Many approaches rely on experiments and go gene by gene through individual genomes. Using Aspergillus, we're sequencing a lot of closely-related genomes to highlight and compare the differences between genomes. A comparative analysis of closely related species with distinct metabolic profiles may result in a relatively small number of species-specific secondary metabolism genes clusters to be mapped to a relatively small number of unique metabolites."

In the study, the team sequenced and annotated six Aspergillus species. Four were sequenced using the Pacific Biosciences platform, producing very high quality genome assemblies that can serve as reference strains for future comparative genomics analyses. The team then compared these genomes with other Aspergillus genomes, several of which were sequenced by the JGI, and used the comparison to identify biosynthetic gene clusters for secondary metabolites of interest.

"One of the things we found to be interesting here was the diversity of the species we looked at -- we picked four that were distantly related," said study senior author Mikael R. Andersen, Professor at DTU. "With that diversity comes also chemical diversity, so we were able to find candidate genes for some very diverse types of compounds. This was based on a new analysis method that first author Inge Kjaerboelling developed. Moreover, we also showed how to solidify said predictions for a given compound by sequencing additional genomes of species known to produce the compound. By looking for genes found in all producer species, we can elegantly pinpoint the genes."

Study co-author Scott Baker, a fungal researcher at the Environmental Molecular Sciences Laboratory, a DOE Office of Science User Facility located at the Pacific Northwest National Laboratory, and a member of JBEI's Deconstruction Division, explained why finding candidate genes for diverse compounds matters. "The secondary metabolites are important because they represent such interesting and novel chemistry with regard to the biosynthesis of molecules that could be biofuels, biofuel precursors or bioproducts," he said. "While it is a significant effort to determine the structures of purified secondary metabolites, it is often relatively straightforward. However, connecting these molecules to their biosynthetic pathways can be quite challenging. We show that using comparative genomics can efficiently lead to reasonable predictions of gene clusters involved in biosynthetic pathways."

Grigoriev added that to date, about 30 Aspergillus genomes have been published, an additional 25 genomes are publicly available from the JGI fungal genomes portal MycoCosm, and over 100 genomes are being sequenced and analyzed.