
As the Zika virus raced across the Western Hemisphere in 2015 and 2016, geneticists eager to battle the outbreak felt crippled. The genome sequence of the Aedes aegypti mosquito that spreads Zika was incomplete and consisted of thousands of short DNA fragments, hampering research efforts.

With help from a new technique for stitching together genome sequences, scientists have finally ‘assembled’ the genome of A. aegypti as well as that of Culex quinquefasciatus, a mosquito that transmits West Nile virus. Their method, which was also used to construct a human genome with 99% accuracy, is described in Science on 23 March¹. Each genome cost less than US$10,000.

Another team published a draft A. aegypti genome in 2007² but struggled with its assembly. The new study places 94% of the genomes of the two mosquitoes on to three large chromosomes. “It would not have been possible to get a mosquito genome of this quality without this breakthrough,” says Leslie Vosshall, a mosquito researcher at Rockefeller University in New York City.

Some assembly required

Current sequencing technologies require DNA to be diced into short snippets. Because these snippets overlap, computers can piece them together to form continuous strings of letters. But in regions of the genome with a great deal of variation across a species, or long stretches of repetitive DNA, this trick doesn’t work. “It’s like a puzzle that’s missing a few pieces from the box,” says Daniel Neafsey, a population geneticist at the Broad Institute in Cambridge, Massachusetts.
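The overlap idea can be illustrated with a toy example. The sketch below is only an illustration, not the software used in the study: the function names, the greedy merging strategy and the made-up reads are all assumptions chosen for brevity. It repeatedly merges the two snippets that share the longest overlap, and leaves separate pieces wherever no overlap exists.

```python
from __future__ import annotations


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the shared stretch is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Toy greedy assembler (hypothetical helper, for illustration only).

    Merges the pair of sequences with the longest overlap until no pair
    overlaps by at least `min_len` letters, then returns what is left.
    """
    contigs = list(reads)
    while True:
        best = (0, None, None)  # (overlap length, index of left, index of right)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (i, j)]
        contigs.append(merged)


# Three overlapping snippets collapse into one continuous string.
print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
# With the middle snippet missing, two unconnected fragments remain.
print(greedy_assemble(["ATGGCGT", "TACCTGA"]))
```

In the second call the two snippets share no overlap, so they stay apart: a small-scale version of the fragmented assemblies that the missing puzzle pieces produce.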

This challenge can be overcome with time and money, as evidenced by the $2.7-billion Human Genome Project. But Erez Lieberman Aiden, a geneticist at Baylor College of Medicine in Houston, Texas, who led the new research, wanted a workaround. By looking at how chromosomes fold, Aiden and his colleagues created maps that show how frequently different stretches of the genome come into contact with one another, a method called ‘Hi-C’. Using these Hi-C maps as guides, researchers can infer the proximity of different genome fragments.
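A rough sense of how contact frequencies can guide ordering comes from a toy calculation. The sketch below is not the team’s algorithm: the function name, the greedy chaining strategy and the contact counts are invented for illustration. It assumes only that fragments lying close together on a chromosome touch each other more often than distant ones.

```python
from __future__ import annotations


def order_fragments(contacts: dict[frozenset, int], names: list[str]) -> list[str]:
    """Toy ordering of fragments from pairwise contact counts.

    Hypothetical helper for illustration only: seeds a chain with the
    most frequently touching pair, then extends whichever end has the
    strongest contact with a fragment that has not been placed yet.
    Expects at least two fragment names.
    """
    def count(a: str, b: str) -> int:
        return contacts.get(frozenset((a, b)), 0)

    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    path = list(max(pairs, key=lambda p: count(*p)))
    remaining = [n for n in names if n not in path]

    while remaining:
        left = max(remaining, key=lambda n: count(n, path[0]))
        right = max(remaining, key=lambda n: count(n, path[-1]))
        if count(left, path[0]) > count(right, path[-1]):
            path.insert(0, left)
            remaining.remove(left)
        else:
            path.append(right)
            remaining.remove(right)
    return path


# Made-up contact counts for four fragments whose true order is A-B-C-D:
# neighbours touch often, distant fragments rarely.
toy_contacts = {
    frozenset("AB"): 90, frozenset("BC"): 85, frozenset("CD"): 80,
    frozenset("AC"): 30, frozenset("BD"): 25, frozenset("AD"): 5,
}
print(order_fragments(toy_contacts, ["C", "A", "D", "B"]))
```

Contact counts alone cannot tell a chain from its mirror image, so a sketch like this recovers the order only up to reversal.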

Scientists first showed that Hi-C can be used to guide genome assembly in 2013³ and have since used the technique to assemble the genomes of several animal species⁴. These efforts relied on longer strings of letters, whereas the new method works on short sequences, lowering costs.

Ghost genes

David Severson, a mosquito researcher at the University of Notre Dame in Indiana who coordinated the initial A. aegypti genome project, calls the team’s effort “phenomenal”. He has long been frustrated by the lack of an assembled genome. “I’ve been waiting to work with something like this for probably twenty years,” he says. Knowing the location of individual genes, and their positions relative to one another, will help scientists to formulate new questions about how genes combine to influence traits.

But even this improved genome isn’t perfect: it omits millions of DNA letters, and some small stretches are likely in the wrong orientation. The Aedes Genome Working Group, which formed last year in the wake of the Zika outbreak, is working hard to construct an even more complete and accurate genome. The group, led by Vosshall, is coordinating with Aiden’s team.

These improved genomes will help mosquito researchers to study genes that were previously missing from the assembly because they lie in difficult-to-assemble regions of the genome. These genes were “like a ghost”, Neafsey says. “Even if you know functionally the gene should be there, you don’t have the means to study it.”