We assembled 2,350,959,615 base pairs (bp) of the narwhal nuclear genome (2,156,711,577 bp excluding missing data [N's]) with a scaffold N50 of 1,483,363 bp and contig N50 of 10,481 bp, in a total of 21,007 scaffolds. We used ∼100× coverage of multiple short insert and ∼32× of mate paired Illumina libraries and in silico mate paired libraries () constructed using the beluga reference genome () (Genbank: GCA_002288925.2) ( Table S1 ). Investigations into the completeness of the assembly using Benchmarking Universal Single-Copy Orthologs (BUSCO) analyses and the mammalian BUSCO gene set showed a high level of complete BUSCO scores (93%) ( Table S2 ), indicating a fairly complete and high-quality genome. Repeat profiling results showed that our assembled genome consists of 37.9% repetitive elements ( Table S3 ). We identified a total of 21,785 protein-coding genes through genome annotations with MAKER2 (). These protein-coding genes spanned a total of ∼612.8 Mb of the assembled narwhal genome, with exons accounting for ∼31.5 Mb.

To investigate whether the narwhal exhibited higher diversity in regions of putatively higher selective pressure, as opposed to putatively neutral regions, we compared heterozygosity in protein-coding regions (exons) versus gene regions (introns plus exons) and non-coding flanking regions 10, 20, and 50 kb up- and downstream of protein-coding genes ( Figure 3 Table S5 ). To elucidate whether this pattern is unique to the narwhal lineage, we ran the same suite of analyses on the beluga, its sister species ( Figure 3 Table S6 ). The significance of the differences detected in these regions was tested using an unpaired two-sample t test. Significant differences in average heterozygosity between the exons and all other regions tested were found in both the narwhal ( Table S7 ) and the beluga ( Table S8 ). This finding is as expected, as coding regions are more conserved than non-coding regions, due to higher selective pressures. However, even though the p scores suggested significant differences in coding regions compared with non-coding regions in both the narwhal and the beluga, we note that the t scores uncovered in the narwhal are much lower than those found while comparing the same genomic regions in the beluga. This may imply a level of retention of diversity in coding regions relative to the non-coding regions. Such retained diversity could help to maintain some adaptive potential in the narwhal, enabling the species to adapt to changes in its environment, despite low genome-wide diversity levels. The overall low levels of heterozygosity and lack of variation in diversity levels between coding and non-coding regions across the narwhal genome, relative to the beluga, may suggest that heterozygosity levels are at a diversity stasis across the genome. Hence, in the narwhal, any further decrease in genomic diversity could lead to survivability problems.

Inbreeding has been shown to be a major driver in the loss of genetic diversity following population bottlenecks (), and is a well-known factor in the reduction of survivability and reproductive success in naturally outbreeding species (). We investigated the 500-kb sliding windows for any indications of inbreeding. We find that only 0.03% of the 500-kb windows lack heterozygosity, indicating that inbreeding is most likely not the cause of low heterozygosity in the narwhal. To further validate this, we manually investigated for runs of homozygosity across the five largest scaffolds and found no contiguous stretches of homozygosity ( Figure S2 ). Thus, it appears that, even though the genetic diversity of the narwhal is low, it may have arisen in a manner that allowed the species to slowly adapt over an extended time period, rather than rapidly through inbreeding. This could have implications for the long-term survivability of the species. Inbreeding can rapidly purge alleles from a gene pool that has adaptive potential. However, the slower decrease in diversity in the narwhal could have given the species time to adapt to the decreasing diversity by selectively removing alleles that had little to no influence on the adaptive potential of the species. Our findings appear to reflect the current large abundance of the narwhal, rather than the low levels of genomic diversity. Moreover, similar findings of no inbreeding despite low diversity have previously been reported in other species (), indicating that this may be an important method of adaptation in multiple species.

To investigate whether adaptive potential (i.e., genetic diversity) may be present in certain regions of the narwhal genome, we estimated heterozygosity levels across the genome using 500-kb non-overlapping windows. A relatively high number of regions of high heterozygosity among a general genome-wide pattern of low diversity could suggest that these regions are tied to the adaptive potential of the species and therefore need to retain these higher levels of diversity for the species to persist. We find that heterozygosity follows a normal distribution with very little variation (SD = 0.0000643), indicating an even distribution of diversity across the genome ( Figure 2 ). To uncover whether the evenly distributed genome-wide low diversity is unique to the narwhal, or whether it is somehow associated with its Arctic existence, we compared this distribution to that of four other endemic Arctic marine mammal species (beluga, bowhead whale, walrus, and polar bear [Ursus maritimus]) ( Figure 2 ). We find relatively less variability in the distribution of heterozygosity across the narwhal genome ( Table S4 ). This pattern is unique to the narwhal, as the other species have more similar variations in heterozygosity across their genomes. This finding is consistent regardless of whether our narwhal or the beluga genome () was used as the mapping reference ( Figure S1 ).

x axis represents the percentage of the window that is heterozygous, and y axis represents the percentage of the total windows.

To investigate genome-wide levels of genetic diversity in the narwhal, we estimated autosomal heterozygosity across the genome using site allele frequency likelihoods with the software, analyzing next generation sequencing data (ANGSD) (). The estimated heterozygosity value of 0.000138 was very low, when compared with available heterozygosity estimates from 11 other mammalian genomes () and three additional endemic Arctic marine mammals: beluga (Delphinapterus leucas), bowhead whale (Balaena mysticetus), and walrus (Odobenus rosmarus) ( Figure 1 B). This result is unexpected, owing to the large census size of narwhals (). The genome-wide low heterozygosity level is, however, in agreement with previous findings based on mtDNA and microsatellites (). Low genetic diversity despite wide distribution ranges and large population sizes has been reported in other large mammals (e.g., the brown hyena,; orca,), and may become a more common finding as more genomic datasets become available.

To investigate whether the demographic trajectory is unique to the narwhal, or is associated with an Arctic lifestyle, we investigated the demographic history of the published genomes of beluga, bowhead whale, walrus, and polar bear. We find each species had its own unique demographic trajectory ( Figure 4 ). This result is not unexpected, as each species has its own particular ecology, which would influence its demographic history. However, this result does suggest that simply living in the Arctic environment is in itself not the main driver behind the low genetic diversity seen in the narwhal.

A recent bottleneck could offer an explanation for the current low genetic diversity in the narwhal. We therefore estimated the demographic history of the narwhal genome using the pairwise sequentially Markovian coalescent (PSMC) model () ( Figure 4 ). We find no drastic recent bottlenecks, but rather find an ancient, gradual but constant decline in Ne beginning at ∼2.0 million years ago (Ma). The trajectory culminated in a persistently low Ne of ∼5,000 individuals starting ∼600 thousand years ago (kya), followed by an increase in Ne ∼100 kya and a subsequent rapid expansion ∼30–40 kya. The initial expansion coincides with the onset of the last glacial period ∼115 kya (), suggesting an environmental driver, possibly linked to an increase in Arctic sea ice. Similar patterns of expansion ∼30–40 kya have been documented in Arctic terrestrial mammals, e.g., mammoth, horse, and reindeer (), suggesting a significant ecological shift across the Arctic region at this time. The rapid recent increase in narwhal Ne is in agreement with previous hypotheses based on population genetic analysis of mtDNA control region sequences. Inferences suggested a rapid recent expansion from a small founding population at the end of the last glaciation (∼25 kya) (). The pattern of a long steady decline followed by constant low population size could explain the low genetic diversity in current populations. However, differentiating true changes in population size from changes in population structure and gene flow is difficult based on a single genome (). Therefore caution should be exercised when interpreting these demographic results from a single individual. Further work involving multiple individuals from several populations will provide essential additional information about the findings of the current study.

Pairwise Sequentially Markovian Coalescent model of the narwhal and four other endemic Arctic marine mammal species and δO levels through time based on data found in. x axis represents thousands of years before present (kya), upper y axis represents effective population size (×10,000), lower y axis represents δO (‰), and faded lines represent bootstrap values. Shaded gray area represents the last glacial period.

A stratigraphic framework for abrupt climatic changes during the Last Glacial period based on three synchronized Greenland ice-core records: refining and extending the INTIMATE event stratigraphy.

Comparative History of the Narwhal and Beluga

To gain further insight into the demographic history of the narwhal and its evolutionary relationship to the beluga, we estimated the timing of when viable admixture between the two species may have ceased. We used an F1 hybrid PSMC (hPSMC) model. Results from the hPSMC model and simulation analysis suggest that gene flow between narwhal and beluga, at least in the populations that these genomes come from, ceased between 1.25 and 1.65 Ma ( Figure S3 ). This result suggests that viable gene flow between the two species continued for a long time after divergence (∼4 Ma), but the time frame also suggests that any contemporary hybrids are likely unable to reproduce, or have such high selective pressures against them that they fail to add anything to the parental species' gene pools. The narwhal in this study was sampled in West Greenland, and the beluga individual was sampled in Hudson Bay, Canada, where the species ranges do not overlap. Investigating narwhal and beluga individuals from populations where their ranges do overlap (e.g., West Greenland), may uncover more recent dates.

Westbury et al., 2018 Westbury M.V.

Hartmann S.

Barlow A.

Wiesel I.

Leo V.

Welch R.

Parker D.M.

Sicks F.

Ludwig A.

Dalén L.

et al. Extended and continuous decline in effective population size results in low genomic diversity in the world’s rarest Hyena species, the brown Hyena. Xue et al., 2015 Xue Y.

Prado-Martinez J.

Sudmant P.H.

Narasimhan V.

Ayub Q.

Szpak M.

Frandsen P.

Chen Y.

Yngvadottir B.

Cooper D.N.

et al. Mountain gorilla genomes reveal the impact of long-term population decline and inbreeding. Robinson et al., 2016 Robinson J.A.

Ortega-Del Vecchyo D.

Fan Z.

Kim B.Y.

vonHoldt B.M.

Marsden C.D.

Lohmueller K.E.

Wayne R.K. Genomic flatlining in the endangered Island fox. Garde et al., 2015 Garde E.

Hansen S.H.

Ditlevsen S.

Tvermosegaard K.B.

Hansen J.

Harding K.C.

Heide-Jørgensen M.P. Life history parameters of narwhals (Monodon monoceros) from Greenland. Although life history does not offer a definite explanation as to the low genetic diversity in narwhal, demographic history could offer a viable explanation. Previous studies have suggested that long-term, slow declines in population size () or long-term low population size () can allow a species to persist despite low diversity, by reducing the strain of deleterious recessive alleles. The narwhal has exhibited both of these demographic characteristics. Furthermore, our findings of an increase in Ne starting ∼100 kya offers an explanation for the current abundance, despite the low genetic diversity in the species today. The low contemporary genetic diversity may in part reflect the longevity and long generation time in narwhals (), which could slow the increase of genomic diversity.

The sequencing and assembly of a narwhal nuclear genome provides insights into genome-wide genetic diversity patterns and the demographic history of the species. In a comparison with 14 other mammalian species, including four endemic Arctic marine mammals, we find the narwhal to have low genome-wide diversity, and a unique demographic history not shared with any other endemic Arctic marine mammals. Our analyses reveal that demographic history has been an important factor influencing patterns of genetic diversity in the narwhal and offers an explanation for the low diversity in this individual despite the large global abundance of the species. We do not find evidence that the low diversity was caused by rapid bottlenecks or inbreeding, both of which are common mechanisms used to explain this pattern. Rather, we propose that low diversity has been present in narwhals for an extended period of time. The species may have adapted to cope with this at a genome-wide level over time and has potentially reached a stasis as to how low its genetic diversity can go before influencing the long-term survival of the species. Our study sets the groundwork for future studies into the evolutionary history of narwhals, which will hopefully uncover more details as to the causes of low diversity and how it influences the long-term population survival of the species.