International Vertebrate Genomes Project releases new genomes

Max Planck Society supports projects for high-quality reference genomes of animals

The International Vertebrate Genomes Project (VGP) has officially launched and is releasing 15 new reference genomes representing all five vertebrate classes: mammals, birds, reptiles, amphibians, and fish. These 15 genomes are the most complete assemblies available for their species to date. The mission of the VGP is to sequence and assemble high-quality, nearly error-free, and complete genomes of all 66,000 vertebrate species on Earth. The VGP data are currently being produced primarily by teams at three sequencing hubs: the Rockefeller University, USA; the Wellcome Sanger Institute, UK; and the Max Planck Institute of Molecular Cell Biology and Genetics in Dresden, Germany. Two of the 15 released genomes, a bat and a fish, were sequenced and assembled in Dresden.

The common bent-wing bat (Miniopterus schreibersii) is a species of subtropical origin distributed throughout the southern Palearctic, Ethiopic, Oriental, and Australian regions. © Dietmar Nill

Spix's disk-winged bat (Thyroptera tricolor), one of the vertebrate species whose genome scientists plan to analyze. © S. Puechmaille

With its ambitious mission, the VGP aims to address fundamental questions in biology, conservation, and disease, including identifying the species most genetically at risk of extinction and preserving their genetic information for future generations. The high-quality VGP genomes will become the main references for their species and will be stored in the Genome Ark, a digital open-access library of genomes.

The current Phase 1 of the VGP, the VGP orders project, aims to create reference assemblies of selected species representing all 260 vertebrate orders that diverged from each other shortly after the last mass extinction 66 million years ago. Studying these ordinal-level species will help scientists determine what types of species survived the previous extinction event, which wiped out the dinosaurs. Those studies can also give insights into how other species could survive the current sixth mass extinction event, help identify genetic variants that might protect these species from total extinction, and preserve biodiversity. Among the 15 new genomes are species of conservation concern such as the platypus and the critically endangered kakapo parrot. Other species include the zebra finch songbird and Anna’s hummingbird, which, like parrots, belong to the only three orders of vocal learning birds among more than 40 bird orders. Two vocal learning bat species are also part of this first data release.

To conduct the VGP, the umbrella G10K organization, from which the project arose, has convened over 150 experts from academia, industry, and government in 12 countries to develop high-resolution sequencing methods that both reduce costs and eliminate the errors that plague current reference genomes. Many current reference genomes are riddled with errors: parts of genes are missing, some genes are incorrectly assembled, and others are missing entirely. Consequently, researchers are potentially working with incorrect gene sequences and structures, which hampers their genomic studies. The new VGP genomes eliminate most of these errors.

Genome analysis of bats and fish

The Max Planck Institute of Molecular Cell Biology and Genetics, and in particular its bioinformatics researchers at the Center for Systems Biology Dresden (CSBD), is involved in the sequencing, assembly, and annotation of the initial Phase 1 genomes of the VGP, with a focus on bats and fish. The Dresden scientists are part of the DRESDEN-concept Genome Center (DCGC) and have special expertise in various long-read sequencing and long-range scaffolding technologies. The Dresden hub, led by Eugene Myers, has contributed two of the 15 released genomes: the greater horseshoe bat (Rhinolophus ferrumequinum) and the flier cichlid (Archocentrus centrarchus). In the future, about 10 to 20 percent of the VGP species are expected to be sequenced in Dresden. Eugene Myers, director at the Dresden Max Planck Institute and founder of the CSBD, says, “The advances in long-read sequencing are revolutionizing DNA sequencing. After a ten-year hiatus, this trend inspired me to return to genome assembly as I believe it implies that we will ultimately be able to produce near-perfect genome reconstructions. I think this capability is going to dramatically alter the landscape of genomics.”


In addition to the VGP, the Max Planck Institute of Molecular Cell Biology and Genetics and the CSBD are actively engaged in synergistic international sequencing projects. The Bat1K project has the goal of sequencing all 1,300 bat species, many of which live unusually long or have near-perfect immune systems. Six bat genomes will be released in the near future, and another 25 species are being prepared to study aging, immunity, and vocal learning in collaboration with the Bat1K consortium, which includes partners Sonja Vernes from the Max Planck Institute for Psycholinguistics in the Netherlands and Emma Teeling of University College Dublin, Ireland. Another project is the Euro-Fish project, which aims to sequence almost all 600 species of fish living in European fresh waters. One of the main collaborators on this project is Axel Meyer of the University of Konstanz. The Max Planck Society is funding the initial genomes from these synergistic projects. All the genomes will be sequenced to the high quality standard set by the VGP and will be placed in the Genome Ark repository, where one day all 66,000 vertebrates will be recorded.

The 15 new genomes

1. Mammals (4 species)

Two bat species, the greater horseshoe bat (Rhinolophus ferrumequinum) and the pale spear-nosed bat (Phyllostomus discolor), used as models for longevity and vocal learning

The Canada lynx (Lynx canadensis), once nearly extinct in the United States and now recovering

The duck-billed platypus (Ornithorhynchus anatinus), an egg-laying mammal with reptilian traits

2. Reptiles (1 species)

A newly discovered tortoise species from Mexico, Goode's thornscrub tortoise (Gopherus evgoodei)

3. Amphibians (1 species)

Two-lined caecilian (Rhinatrema bivittatum), a limbless amphibian that resembles a snake

4. Birds (3 species, 4 genomes)

In addition to the kakapo (Strigops habroptilus), the VGP re-sequenced species from two other bird orders to represent the only three orders of vocal learning birds among more than 40 avian orders

A male and female zebra finch (Taeniopygia guttata), the most commonly studied vocal learner

Anna’s hummingbird (Calypte anna), belonging to the smallest group of birds

5. Fish (5 species that represent a large diversity of traits and are used to study species evolution and adaptation):