Previous studies based on microsatellite analyses and karyotyping have shown that the marbled crayfish is a triploid organism with 276 chromosomes, which corresponds to the exact triplicate number of the haploid set of chromosomes in P. fallax24. Furthermore, marbled crayfish represent an evolutionarily very young species8,24, which contrasts with other known parthenogenetic animals, such as bdelloid rotifers and asexually reproducing nematodes, and suggests that the three genome copies are still highly similar. We therefore assumed that the marbled crayfish genome represents a triplicate version of the original 1N genotype from P. fallax. To quantitatively determine the marbled crayfish genome size, we analysed the DNA content of haemocytes by flow cytometry (Fig. 1b). Haploid genome size estimates using human and mouse blood cells as internal references suggested genome sizes of 3.9 and 3.5 gigabase pairs (Gbp), respectively (Fig. 1c and Supplementary Fig. 1). An in silico genome size estimate based on k-mer frequencies provided a slightly lower, but overall consistent value (3.3 Gbp; Supplementary Fig. 1). Taken together, these findings suggest that the 1N-equivalent genome size of the marbled crayfish is approximately 3.5 Gbp.
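
As an illustration of how a k-mer-based size estimate can work in principle, the following minimal sketch divides the total number of counted k-mers by the depth of the main coverage peak. The function name, the error cutoff and the toy histogram are assumptions made for this example; they are not taken from the study, whose actual estimation procedure is described in its Supplementary Information.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers seen at that depth.
    min_depth is a hypothetical cutoff for discarding error k-mers.
    """
    # Low-depth k-mers mostly stem from sequencing errors, so drop them.
    solid = {d: n for d, n in histogram.items() if d >= min_depth}
    # The most populated depth approximates the k-mer coverage of the genome.
    peak_depth = max(solid, key=solid.get)
    # Total k-mer occurrences divided by coverage gives the genome size.
    total_kmers = sum(d * n for d, n in solid.items())
    return total_kmers / peak_depth, peak_depth


# Toy histogram (hypothetical values, not data from the study).
toy_histogram = {1: 900, 2: 300, 18: 40, 19: 80, 20: 100, 21: 80, 22: 40}
size, peak = estimate_genome_size(toy_histogram)
print(size, peak)
```

In a heterozygous polyploid genome the histogram can show several peaks, so choosing the peak that corresponds to the 1N-equivalent coverage would need more care than this sketch takes.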

To establish the complete genome sequence of the marbled crayfish, we used genomic DNA from a single animal of the ‘Petshop’ laboratory strain8 to prepare various libraries for Illumina sequencing and obtained 350 Gbp of DNA sequence (Supplementary Table 1). After contig assembly, scaffolding and gap filling, we generated a draft genome sequence with a total length of 3.3 Gbp and a weighted median sequence length (N50) of 39.4 kilobases (kb). Benchmarking with universal single-copy orthologues25 suggested that the quality of the marbled crayfish genome assembly was comparable to other, recently published arthropod genomes26,27,28 (Supplementary Fig. 2). Phylogenetic placement among various published arthropod genomes confirmed that the marbled crayfish is most closely related to D. pulex and P. hawaiensis (Fig. 1d).
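
For readers unfamiliar with the N50 statistic: it is the length L such that sequences of length L or longer contain at least half of all assembled bases. A minimal sketch of the calculation follows; the function name and the toy scaffold lengths are illustrative assumptions, not values from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no sequence lengths given")


# Toy scaffold lengths (hypothetical, arbitrary units).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Because N50 weights each sequence by its length, a few long scaffolds can raise it considerably even when most scaffolds are short.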

The gene length in marbled crayfish averaged 6.7 kb. Average exon and intron sizes were 0.3 kb and 2 kb, respectively, thus placing marbled crayfish gene lengths between those of P. hawaiensis and D. pulex. Gene annotation was performed using the MAKER genome annotation pipeline29, which provided important starting points for further analysis. For example, we detected multiple genomic locations for cellulase genes of the GH9 family (Fig. 2a). Genomically encoded cellulase genes are relatively rare in higher animals, but are generally assumed to play a key role in the omnivory of freshwater crayfish30. Furthermore, repeat annotation detected 520,403 repeats that were subclassified into 7 major categories and covered 8.8% of the annotated genome assembly (Fig. 2b). Repeat coverage is likely to increase substantially in future versions of the marbled crayfish genome assembly, as the fragmentation of the genome assembly currently represents a major bottleneck for algorithmic repeat detection. The most recent version of the genome assembly and annotation can be accessed through a dedicated internet portal (http://marmorkrebs.dkfz.de).

Fig. 2: Annotation of the marbled crayfish genome. a, Gene structure of the marbled crayfish GH9 cellulase gene and coding sequence (CDS). b, Automatically annotated repeats (n = 520,403) distributed into seven major categories. LINEs/SINEs, long/short interspersed nuclear elements; LTRs, long terminal repeats. c, Numbers of transcripts found within one or multiple different functional annotation databases (db). d, Transcripts classified by lineage based on sequence homology.

In parallel, we also established the marbled crayfish transcriptome from a normalized sequencing library that was generated from several distinct tissues. Benchmarking again confirmed that the quality was comparable to other, recently published arthropod transcriptomes (Supplementary Fig. 2). The transcriptome consists of 22,338 transcripts (Fig. 2c), which corresponds closely to the numbers of predicted genes (21,772) and messenger RNAs (22,205) in the genome assembly. Comparisons with other publicly available transcriptomes revealed homologues for the majority (81%) of predicted proteins, while 19% (n = 4,306) of the predicted proteins were classified as unique (Fig. 2d).

The analysis of two other parthenogenetic genomes had revealed the presence of homologous but diverged blocks that reflect genome rearrangements typically associated with asexual reproduction19,23. However, the average copy number of the 1,066 universal single-copy orthologues was 1.01, which argues against the existence of divergent homologues. Similarly, the coverage distribution of genes showed only a single peak (Fig. 3a). Finally, we could only detect a very low number (n = 66) of collinear genes on different scaffolds, all of which had rather high E-values (Supplementary Table 2), indicating that they probably represent artefacts. Together, these findings suggest that the genome rearrangements described in longstanding parthenogens are not detectable in the marbled crayfish genome, which is consistent with its very young evolutionary age.

Fig. 3: Characterization of the marbled crayfish genome. a, Read coverage depth in genes extracted from the marbled crayfish genome sequence. Only a single peak can be observed, indicating the absence of divergent homologues. b, Genome-wide heterozygosity in marbled crayfish compared with other organisms. c, Violin plots illustrating the combined frequency of alternative nucleotides in all types of biallelic variants among the three different Procambarus species. d, Combined distribution of biallelic (orange) and triallelic (purple) variants in marbled crayfish, P. fallax and P. alleni.

Additional features of the marbled crayfish genome were revealed by the analysis of heterozygous sequence variants. The global rate of heterozygosity was 0.53%, which is relatively high compared with other sequenced genomes, including P. fallax from the pet trade (0.03%; Fig. 3b). Furthermore, the allelic frequency of marbled crayfish sequence variants peaked at 0.33 (Fig. 3c), in agreement with heterozygous positions in a triploid genome. In contrast, the frequencies of P. fallax and P. alleni sequence variants and polymorphisms peaked at 0.5 and 1.0 (Fig. 3c), reflecting their diploid nature and polymorphisms towards the marbled crayfish genome, respectively. Finally, the triploid marbled crayfish genome showed a negligible fraction (0.15%) of triallelic sequence polymorphisms (Fig. 3d). Together, these findings provide strong support for an AA’B genotype that may have originated from an autopolyploid gamete8.
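
The link between ploidy and the observed allele-frequency peaks can be made explicit with a small calculation. At a heterozygous biallelic site, the alternative allele sits on some but not all genome copies, so the expected read fraction is k/ploidy for an integer k. The sketch below is purely illustrative: the function names and the read counts are assumptions, not part of the study's variant-calling workflow.

```python
from fractions import Fraction


def expected_alt_fractions(ploidy):
    """Possible alternative-allele fractions at a heterozygous biallelic site."""
    return [Fraction(k, ploidy) for k in range(1, ploidy)]


def nearest_dosage(alt_reads, total_reads, ploidy):
    """Number of genome copies (0..ploidy) that best explains the read counts."""
    observed = Fraction(alt_reads, total_reads)
    return min(range(ploidy + 1),
               key=lambda k: abs(observed - Fraction(k, ploidy)))


print(expected_alt_fractions(3))  # triploid: one or two of three copies
print(expected_alt_fractions(2))  # diploid: one of two copies
# Hypothetical site with 11 alternative reads out of 30 in a triploid genome.
print(nearest_dosage(11, 30, 3))
```

Under this simple model, a peak near 0.33 is what a triploid genome would show when the alternative allele is present on a single copy, whereas a diploid heterozygote is expected at 0.5.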

Previous studies have suggested that marbled crayfish reproduce by apomictic parthenogenesis31,32, which should result in the establishment of a genetically homogeneous population. We therefore sequenced the genomes of four additional marbled crayfish from diverse sources: (1) an animal from the longest-known stock (‘Heidelberg’, founded in 1995); (2) an animal from a German wild population caught in 2013 (‘Moosweiher’); (3) an animal from a market purchase in Madagascar (‘Madagascar 1’); and (4) an animal from an American laboratory stock, which originated from another pet shop purchase in Germany17 (‘Petshop 2’). In addition, we generated genome sequences from two closely related species, P. fallax (four animals from an aquarium supplier) and P. alleni (one animal from an aquarium supplier). Sequencing and mapping to the marbled crayfish reference genome resulted in genome coverages ranging from 16–72× (Supplementary Table 3). We then used single-nucleotide polymorphisms to analyse phylogenetic relationships among the nine sequenced animals. The results confirmed the clonality of the four analysed marbled crayfish genomes and their separation from P. fallax and P. alleni (Fig. 4). Taken together, our findings illustrate a unique path of animal genome evolution that involves genome duplication, triploidy and clonal expansion.

Fig. 4: Phylogenetic relationships among marbled crayfish and closely related freshwater crayfish species. Phylogenetic tree based on the SNVs of five marbled crayfish, four P. fallax and one P. alleni genome sequences. The five marbled crayfish genomes are too similar to become resolved into visible individual branches and are therefore shown as a single branch.

Despite their clonality and very recent emergence, it has been suggested that marbled crayfish are successful invaders of new territories and environments9,11,18. For example, in 2007, a novel crayfish was found in the capital of Madagascar. The animals were characterized as marbled crayfish based on their morphology and DNA sequencing of a 16S mitochondrial DNA fragment9,10. However, there were no further reports on their genetic characteristics and potential spread. The availability of genome sequences for the closely related and morphologically similar marbled crayfish, P. fallax and P. alleni allowed us to identify genomic sites with a high degree of sequence diversity among the three species, to confirm the identity and track the spread of the animals on Madagascar. Fieldwork was conducted in two phases, with a first series of collections in the central highland, followed by a more comprehensive study covering large parts of the country (Supplementary Table 4). In a pilot analysis, we sequenced polymerase chain reaction amplicons for a mitochondrial (cytochrome b; 214 bp) and nuclear (Dnmt1; 220 bp) locus from 24 independent animals that were collected in four regions from the central highland (Fig. 5a). The results showed 100% sequence identity with the marbled crayfish reference sequence for all analysed samples and substantial sequence differences towards P. fallax and P. alleni (Fig. 5a). These findings unambiguously classify the collected animals as marbled crayfish. Additionally, our systematic field collections and morphological analyses detected large populations of marbled crayfish (Fig. 5b) in diverse freshwater habitats, such as lakes and rice fields on the central highland, as well as in swamps close to the coastline (Supplementary Table 4 and Fig. 5c). We therefore analysed an additional 25 animals from 8 diverse regions by DNA sequencing of cytochrome b and Dnmt1 and identified only 6 mismatches among >20,000 bases analysed in total (Supplementary Table 5), which is commensurate with the normal level of polymerase chain reaction and/or sequencing errors. We estimate that between 2007 and 2017, the size of the marbled crayfish distribution area increased about 100-fold from 10³ km² to more than 10⁵ km² (Fig. 5c) and that the current population on Madagascar comprises millions of animals.
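
The mismatch figure can be put into perspective with a short back-of-the-envelope calculation. The sketch below assumes that the total of >20,000 bases refers to all 49 genotyped animals (24 pilot plus 25 additional) and that both amplicons were analysed over their full stated lengths; these are assumptions made for illustration and are not stated explicitly in the text.

```python
CYTB_BP = 214        # cytochrome b amplicon length given in the text
DNMT1_BP = 220       # Dnmt1 amplicon length given in the text
PILOT_ANIMALS = 24
ADDITIONAL_ANIMALS = 25
MISMATCHES = 6


def total_bases(n_animals, amplicon_lengths):
    """Bases analysed if every animal yields every amplicon in full."""
    return n_animals * sum(amplicon_lengths)


bases = total_bases(PILOT_ANIMALS + ADDITIONAL_ANIMALS, [CYTB_BP, DNMT1_BP])
mismatch_rate = MISMATCHES / bases
print(bases)                     # 21266
print(f"{mismatch_rate:.2e}")    # mismatches per analysed base
```

Under these assumptions the total comes to 21,266 bases, in line with the stated >20,000, and the six mismatches correspond to roughly 0.03% of the analysed positions.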

Fig. 5: Invasive spread of marbled crayfish on Madagascar. a, Representative genotyping results of 24 animals collected on four distinct collection sites. The heatmap indicates sequence similarities with the marbled crayfish reference sequence. P. fallax and P. alleni were included as controls. Cyt b, cytochrome b; SNPs, single-nucleotide polymorphisms. b, Marbled crayfish by-catch in a traditional fishing tool or 'tandroho'. c, Distribution of marbled crayfish on Madagascar (as of March 2017) in major biogeographical zones55, as indicated. Red dots indicate discovery sites where the presence of marbled crayfish was confirmed by DNA sequencing. White dots indicate sites where no marbled crayfish were found. The small central yellow circle indicates the distribution range reported for the year 2007 (ref. 9).

To further characterize the marbled crayfish population on Madagascar, we used whole-genome sequencing. The sequencing of 5 animals from diverse collection sites and mapping to the marbled crayfish reference genome resulted in genome coverages ranging from 17× to 36× (Supplementary Table 3). Sequence comparisons revealed extremely low numbers of polymorphisms in the analysed marbled crayfish genomes (Fig. 6a). In marked contrast, the P. fallax genome showed a substantial number of polymorphisms towards the marbled crayfish reference genome sequence (Fig. 6a). These results provide additional, strong support for the clonality of the marbled crayfish population.

Fig. 6: Clonality of the marbled crayfish population. a, Schematic overview of sequence polymorphisms of marbled crayfish from Madagascar. The plot consists of eight segments representing eight arbitrarily chosen genomic scaffolds (scaffold lengths are indicated) in concentric rings, representing different animals. Vertical lines represent polymorphic positions to the reference genome. The genome sequences of P. fallax (green) and the marbled crayfish reference genome sequence (dark red) are shown for comparison. kbp, kilobase pairs. b, Phylogenetic tree of 11 marbled crayfish from diverse sources, as determined by the distribution of the 416 SNVs detected in the population. Names shown in red indicate animals from Madagascar, while mustard yellow indicates animals from Germany. Animals originating from a German pet shop chain are shown in purple.

To further explore the relationship between the animals found on Madagascar and the German stocks of marbled crayfish, we obtained two additional whole-genome sequences of animals from Germany (Supplementary Table 3). Our final dataset thus consisted of 11 genome sequences from diverse sources (Supplementary Table 6). Genetic variants were extracted and filtered for single base substitutions and mapping artefacts were eliminated by remapping of sequencing reads from the genome reference individual (see Methods for details). This identified a strikingly low number of only 416 single-nucleotide variants (SNVs) in a highly diverse group of animals (Supplementary Table 7). The maximum number of non-synonymous SNVs per animal was four (Supplementary Table 7), which further illustrates the extremely low genetic complexity of the marbled crayfish population. Finally, the comparison of SNVs also provided interesting insight into the relationships of the sequenced animals. The results showed an overlapping distribution of animals from Germany and Madagascar (Fig. 6b), indicating that the Malagasy population originates from a German stock. In addition, a separate cluster was formed by two aquarium stocks that were independently founded by animals from different stores of the same German pet shop chain more than ten years ago (Fig. 6b).

In summary, our findings establish the marbled crayfish as a potent invader of freshwater ecosystems and demonstrate a unique genetic structure of the invasive population.