Recent studies have shown that humans have adapted to many different environments around the world. However, few studies have centered on Indigenous groups in the Americas. We present a comparative analysis of genetic adaptations in humans across North America using genome-wide scans for signals of natural selection in three populations inhabiting vastly different environments. We find evidence for adaptation to cold and high latitudes in an Alaskan population, whereas infectious disease was a strong selective pressure in the southeastern United States and central Mexico. Because there are few shared signals of selection between populations, these sweeps likely occurred after population differentiation in the Americas. This study fills an important gap in our knowledge of genetic adaptations in humans.

While many studies have highlighted human adaptations to diverse environments worldwide, genomic studies of natural selection in Indigenous populations in the Americas have been absent from this literature until very recently. Since humans first entered the Americas some 20,000 years ago, they have settled in many new environments across the continent. This diversity of environments has placed variable selective pressures on the populations living in each region, but the effects of these pressures have not been extensively studied to date. To help fill this gap, we collected genome-wide data from three Indigenous North American populations from different geographic regions of the continent (Alaska, southeastern United States, and central Mexico). We identified signals of natural selection in each population and compared signals across populations to explore the differences in selective pressures among the three regions sampled. We find evidence of adaptation to cold and high-latitude environments in Alaska, while in the southeastern United States and central Mexico, pathogenic environments seem to have created important selective pressures. This study lays the foundation for additional functional and phenotypic work on possible adaptations to varied environments during the history of population diversification in the Americas.

Since first leaving their ancestral environments in Africa more than 100,000 y ago, humans have spread to nearly every region of the planet. In doing so, different populations have been exposed to many new environments and selective pressures, and they have developed a diversity of adaptations as a result (1). The declining cost of array and sequencing technologies and the improvement of methods for detecting signals of natural selection have allowed researchers to answer questions about selective pressures across a growing number of populations worldwide (2–4).

However, very little is known about the recent evolutionary history of Indigenous populations in North America and the selective pressures that they have experienced. The Indigenous peoples of North America are underrepresented in the population genetics literature as a whole (5) and in studies of selection in particular. Only a handful of genomic studies of natural selection have been conducted in the Americas, and the majority of these have focused on populations in South America (6–9). To our knowledge, only two genomic studies of selection in North American populations have been published. Lindo et al. (10) found evidence of a complex history of selective pressures on the immune gene HLA-DQA1 using exome data from ancient and modern populations in the Canadian Pacific Northwest. Another study with the Greenlandic Inuit found evidence of selection in the FADS genes, which code for fatty acid desaturases that are associated with polyunsaturated fatty acid levels in the blood (11), as well as in the genomic region encompassing TBX15, which plays a role in the differentiation of brown and white adipocytes. The authors suggested that these signals of selection are likely related to adaptation to cold environments.

Here, we present genome-wide scans for natural selection across three populations from different regions of North America. We find evidence of adaptation to cold and high latitudes in an Alaskan Native population from the Arctic and evidence of selection at several genes related to inflammation and immune function in Indigenous populations from the southeastern United States and central Mexico. We find little overlap between putatively selected genes in these three populations, suggesting that local selective pressures in each geographic region have shaped these Indigenous North American populations differently since they settled in distinct regions of the continent.

We next looked for shared signals of selection among the three study populations by comparing statistically significant results from each analysis. We found no shared signals of selection among all three populations but did identify some putatively selected genes shared between pairs of populations. We found one shared signal of selection at a single SNP within the gene SLIT2 in the Alaska and southeastern US populations. We also found that the southeastern US and central Mexico populations share signals of selection at two SNPs each in the genes MUC19 and CNTN1 (Table 2). We found no shared signals of selection between the Alaska and central Mexico populations. The greater percentage of putatively selected genes shared between the southeastern United States and central Mexico may be due to either similar selective pressures on both populations or the more recent divergence of the southeastern US and central Mexico populations if the selective pressures primarily occurred before they diverged.

We found 153 putatively selected SNPs for the Alaska population, 104 such SNPs for the southeastern US population, and 190 such SNPs for the central Mexico population. Genes with the strongest signals of selection (i.e., those with the largest number of SNPs putatively under selection) for each population are listed in Table 1 along with the average population branch statistics and iHSs for the significant SNPs. A complete list of the putatively selected SNPs for each population is available in Dataset S1.
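The ranking criterion used here (the number of putatively selected SNPs per gene) can be sketched in a few lines of Python. The SNP identifiers, gene labels, and the function name `rank_genes_by_hits` are hypothetical placeholders chosen for illustration; the sketch assumes that each SNP has already been annotated with at most one gene and is not the annotation procedure used in the study.

```python
from collections import Counter

def rank_genes_by_hits(snp_to_gene, selected_snps):
    """Rank genes by how many putatively selected SNPs fall within them.

    snp_to_gene: dict mapping SNP id -> gene label (hypothetical annotation).
    selected_snps: iterable of SNP ids flagged as putatively selected.
    SNPs without a gene annotation are ignored.
    """
    counts = Counter(snp_to_gene[s] for s in selected_snps if s in snp_to_gene)
    # Sort by descending hit count, breaking ties alphabetically by gene label.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy example with placeholder identifiers (not real data).
example_annotation = {"snp1": "GENE_A", "snp2": "GENE_A", "snp3": "GENE_B"}
example_ranking = rank_genes_by_hits(example_annotation, ["snp1", "snp2", "snp3", "snp4"])
```

In this toy example, `snp4` has no annotation and is dropped, so `GENE_A` ranks first with two SNPs.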

We computed two statistics to identify potential signatures of natural selection in the three study populations. We calculated the population branch statistic for each autosomal SNP in each population using individuals from the 1000 Genomes Peruvian population (PEL) without recent European or African ancestry as an ingroup and the 1000 Genomes Han Chinese population (CHB) as an outgroup. The population branch statistic computes the amount of genetic differentiation at a given locus along a branch leading to a population of interest by comparing transformed pairwise FST values between each pair of three populations (4). A population’s population branch statistic value at a given locus corresponds to the magnitude of allele frequency change relative to its divergence from the other two populations. This approach has proven to be powerful at detecting recent signals of selection (4, 10). We also calculated the integrated haplotype score (iHS), a widely used haplotype-based method of detecting signals of selection, for each autosomal SNP in each of the three study populations. The iHS is a measure of extended homozygosity in the haplotype surrounding a given SNP. Extended stretches of homozygosity relative to the background are a signal of a selective sweep that has not yet reached fixation. P values were calculated for both population branch statistics and iHS using a distribution of each statistic simulated under a demographic model specific to each study population. We then identified the top 1% P values for population branch statistics and iHS from each population (Fig. 3 and SI Appendix, Fig. S6) and cross-referenced them to find SNPs that were significant outliers in both statistics.
This approach, which has been used previously (15), should reduce our chances of reporting false positives, as the iHS has been shown to be robust to demographic history that is often a confounding factor in FST-based approaches, such as population branch statistics (16).
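As a rough illustration of the two steps described above, the following Python sketch computes a population branch statistic from three pairwise FST values using the commonly used log transform T = -ln(1 - FST), and then intersects the SNPs falling in the top 1% of P values for both statistics. The function names, the clamping of FST, and the dictionary-based inputs are assumptions made for illustration; the sketch does not reproduce the simulation-based P value calculation or the software used in the study.

```python
import math

def branch_length(fst):
    """Transform a pairwise FST value into T = -ln(1 - FST)."""
    # Clamp to [0, 1) so the logarithm is defined; negative estimates become 0.
    fst = min(max(fst, 0.0), 0.999999)
    return -math.log(1.0 - fst)

def pbs(fst_target_ingroup, fst_target_outgroup, fst_ingroup_outgroup):
    """Population branch statistic for the target population at one SNP.

    PBS = (T_target,ingroup + T_target,outgroup - T_ingroup,outgroup) / 2
    """
    t_ti = branch_length(fst_target_ingroup)
    t_to = branch_length(fst_target_outgroup)
    t_io = branch_length(fst_ingroup_outgroup)
    return (t_ti + t_to - t_io) / 2.0

def top_fraction(p_values, fraction=0.01):
    """Return the SNP ids with the smallest P values (lowest `fraction`)."""
    n_keep = max(1, int(round(len(p_values) * fraction)))
    ranked = sorted(p_values, key=p_values.get)
    return set(ranked[:n_keep])

def joint_outliers(pbs_p, ihs_p, fraction=0.01):
    """SNPs that are top-fraction outliers for both statistics."""
    return top_fraction(pbs_p, fraction) & top_fraction(ihs_p, fraction)
```

For example, `pbs(0.5, 0.5, 0.0)` places all of the differentiation on the target branch and returns ln 2. Requiring a SNP to be an outlier in both dictionaries mirrors the cross-referencing step, which trades some sensitivity for fewer false positives.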

Because many previously genotyped Indigenous populations in the Americas trace a large percentage of genetic ancestry to recent European and African ancestors, which can influence results of genome-wide scans for selection, we first conducted a nonhierarchical clustering analysis of the SNP data implemented in the program ADMIXTURE (13). Fig. 2A shows that many of the Alaska and southeastern US individuals have a larger proportion of European ancestry than the individuals sampled in central Mexico. Local ancestry assignment was then done using RFMIX (14) to assign each chromosomal segment to its most likely ancestral source for each individual in the dataset. To minimize the effects of recent admixture on our selection scans, we masked SNPs from the data for an individual if they were in a section of chromosome inferred to have been inherited from a non-Indigenous ancestor (Fig. 2 B and C).
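The masking step can be pictured with a small Python sketch. It assumes that local ancestry calls have already been expanded to one label per haplotype per SNP; the label `"NAT"`, the list-of-lists layout, and the function names are hypothetical and do not correspond to the RFMIX output format or to the code used in the study.

```python
INDIGENOUS_LABEL = "NAT"  # hypothetical label for Indigenous American ancestry

def mask_non_indigenous(haplotypes, local_ancestry, keep_label=INDIGENOUS_LABEL):
    """Set alleles to None wherever the local ancestry call is not `keep_label`.

    haplotypes[i][j]: allele (0 or 1) carried by haplotype i at SNP j.
    local_ancestry[i][j]: inferred ancestry label for that same position.
    Returns a new list of lists; the inputs are left unchanged.
    """
    masked = []
    for alleles, ancestry in zip(haplotypes, local_ancestry):
        if len(alleles) != len(ancestry):
            raise ValueError("allele and ancestry rows must have equal length")
        masked.append([a if anc == keep_label else None
                       for a, anc in zip(alleles, ancestry)])
    return masked

def allele_frequency(masked, snp_index):
    """Frequency of allele 1 among unmasked haplotypes, or None if all are masked."""
    observed = [hap[snp_index] for hap in masked if hap[snp_index] is not None]
    return sum(observed) / len(observed) if observed else None

# Toy example: three haplotypes, three SNPs (not real data).
toy_haplotypes = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
toy_ancestry = [["NAT", "NAT", "EUR"], ["NAT", "EUR", "EUR"], ["NAT", "NAT", "NAT"]]
toy_masked = mask_non_indigenous(toy_haplotypes, toy_ancestry)
```

Because masked positions are treated as missing, the effective sample size differs from SNP to SNP, which any downstream frequency-based statistic would need to take into account.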

We collected DNA samples from 150 individuals from three Indigenous populations in North America (Fig. 1 and SI Appendix, Figs. S1 and S2), including 35 Alaskan Iñupiat from the North Slope of Alaska, 47 individuals from the town of Xaltocan in central Mexico, and 68 individuals from several closely related communities in the southeastern United States (populations referred to hereafter as Alaska, central Mexico, and southeastern United States, respectively). In some cases, exact sampling locations and community names are not reported to protect the privacy and anonymity of both the individuals and communities participating in this research. These protections were developed in collaboration with community members and are part of the IRB protocol and informed consent documentation used in this study. We then used the Affymetrix Axiom Human Origins Array to genotype 629,443 genome-wide SNPs for each of these individuals. A total of 563,162 SNPs were included in our analyses after quality control filtering and merging with the 1000 Genomes dataset (12) for comparative analyses.

Discussion

Our results suggest that different selective pressures have been acting on the three study populations sampled from different regions of North America. In the Alaska population, we see evidence for adaptation to both cold and high-latitude environments. The Alaskan Arctic has a tundra climate (ET on the Köppen climate classification) characterized by at least 1 mo with an average temperature >0 °C but no months with an average temperature above 10 °C (19). Being above the Arctic Circle, this region is also subject to low levels of direct sunlight and intermittent periods of complete daylight or darkness. Alaskan Native groups and other Indigenous peoples in the Arctic have developed a number of cultural adaptations in response to this extreme environment. The traditional diet of Alaskan Native peoples in this region relies heavily on both terrestrial (caribou) and marine (seal, whale) mammal resources, as the Arctic environment has low levels of surface vegetation and soil development (20). Traditional Alaskan Native clothing and dwellings are also designed to provide shelter from the extremely cold environment. Our results suggest that genetic adaptations have also arisen in this population.

The three genes that have the strongest evidence of selection in the Alaskan Arctic population could all have adaptive effects in cold, high-latitude environments. The gene HS3ST4 is involved in the production of heparan sulfate, a molecule that affects blood thickness. Previous work has shown that extended exposure to cold temperatures increases blood thickness, which can lead to a number of deleterious effects (21). The gene KCNH1 is involved in the regulation of cell proliferation and differentiation, in particular adipogenic and osteogenic differentiation in bone marrow-derived stem cells (22). Selection on these genes involved in regulating blood thickness and fat storage, respectively, suggests that a cold-resistant phenotype may be under selection in this Alaskan Arctic population. Previous studies of other Arctic populations have similarly found evidence of adaptation to the cold climate, with signals of selection found in genes related to metabolic pathways (11, 23), albeit in different genes than those identified in this study.

OCA2, also under selection in the Alaskan Arctic population, is associated with skin/eye/hair pigmentation in humans, suggesting the possibility of adaptation to high latitudes in this group. Vitamin D is an essential nutrient that is important for skeletal development and the innate immune response among other processes. In humans, the majority of vitamin D synthesis takes place in the skin as a result of the interaction between cholesterol and UVB radiation from sunlight. High-latitude regions, such as Alaska, are exposed to much lower levels of UVB radiation than other parts of the globe, making it difficult for people living in these areas to maintain healthy levels of vitamin D (24). Previous work has shown that variation in the OCA2 gene is correlated with the amount of winter solar radiation (25).

In the southeastern United States, we see the strongest signal of selection on SNPs in and around the gene IL1R1, which codes for a cytokine receptor that plays a key role in the adaptive immune response. Selection on IL1R1 in the southeastern United States makes sense given the colonial history of this region. The colonial period saw the introduction of a variety of diseases into the Americas, including smallpox, measles, influenza, pertussis, cholera, plague, typhus, yellow fever, diphtheria, and malaria (26). One recent spatial model of the colonial spread of epidemic diseases in North America (27) suggests that such diseases were first introduced during European colonization of coastal areas of the Southeast in the early 16th century, spread slowly toward the Appalachian mountains over the next 140 y, and then moved very quickly across the interior Southeast. This model is consistent with the history of the region: after the initial Spanish colonization of the coastal Southeast in 1513, five documented expeditions (entradas) were undertaken to map the Southeast before 1545. Interaction between these entradas and Indigenous groups along with the establishment of the Spanish mission system throughout the Southeast likely contributed to the introduction and spread of multiple infectious diseases in this region. Several accounts of disease epidemics in the coastal Southeast are also described in historic records beginning in 1520 and continuing through the early 18th century (28). Groups in the interior Southeast likely avoided the very first epidemics but were affected later (29). While the causal pathogens of many early epidemics remain unknown, accounts of some later epidemics allow the underlying cause to be identified, such as several accounts of a smallpox epidemic in the late 1690s that report its spread from Virginia down into the Carolinas and across the Southeast into Mississippi (30).
Altogether, historical documents, ethnohistoric records, Indigenous histories, and archaeological evidence demonstrate that these epidemics in conjunction with other events and practices during the colonial era contributed to a significant population decline and major sociopolitical changes in the region. By the 18th century, for example, many of the Indigenous groups interacting with the Spanish, including those forced into the mission system, had merged, and the ethnogenesis of many of the modern Indigenous southeastern groups was beginning (31). Our results suggest that the repeated epidemics may have also created significant selective pressures, influencing patterns of genetic variation at loci associated with the human immune response in these populations.

Pathogenic environments seem to have been a major selective force in central Mexico as well, as many of the putatively selected genes in this population are also related to immune system pathways. In Mexico, the spread of European-introduced diseases began shortly after Spanish conquistador Hernán Cortés landed near the present-day city of Veracruz in 1519 and began his military campaign against the Aztec empire (32). In central Mexico, the first documented epidemic was smallpox in 1519–1520 as Cortés marched toward the Aztec capital city of Tenochtitlan, located in present-day Mexico City in central Mexico. This initial epidemic was followed by several subsequent smallpox outbreaks, particularly in the late 16th and early 17th centuries (32). After Tenochtitlan was conquered in 1521, it became known as Mexico City, capital of the Viceroyalty of New Spain, and received a large influx of people from both Europe and Africa (33), no doubt bringing additional pathogens along with them.

This history likely contributed to genomic signatures of selection seen in the central Mexico population in this study. We see the strongest signal of selection on the mucin gene MUC19 in the central Mexico population. Mucin genes are primarily involved in the immune response to parasitic infection (34). Past work has shown that parasite load is strongly correlated with latitude, with populations closer to the tropics having higher levels of parasitic infection (35). Interestingly, the GenomeRNAi database (36) shows that MUC19 is associated with decreased vaccinia virus (VACV) infection. VACV is a close relative of the variola virus, the causal agent of smallpox, and recombinant versions of the VACV were used as a vaccine against smallpox until it was eradicated in the late 1970s (37). However, while these results are suggestive, we cannot be certain that smallpox was the selective agent for this sweep, because there were many more infectious diseases spreading through Mexico at this time (32). Recent work using novel methods to search for ancient pathogen DNA in human ancestral remains has successfully identified some of these unknown pathogens, such as the bacterium Salmonella enterica as a possible causal agent of the 16th century “cocoliztli” epidemic in southern Mexico (38). Future work may help us identify the major pathogens afflicting the people of central Mexico during colonial times.

These results suggest that selective pressures have varied widely across the Americas. The value of investigating selective pressures at a regional level in human populations is increasingly recognized. In South America, for example, recent work has examined the genetic components of the well-studied adaptation of Andean populations to high altitude (7). In addition, two recent studies have shown evidence of adaptation to an arsenic-rich environment in Andean populations from northwest Argentina (8, 39). Both studies found signals of selection on the gene AS3MT, which is involved in the metabolism of arsenic. Variants identified in these Andean groups allow them to metabolize less of the toxic element. Another recent study of Andean populations farther north in Peru used genomic data collected from both ancient and modern populations to study evolutionary pressures through time (40). The gene showing the strongest signal of selection in this study was MGAM, which is associated with starch digestion. This suggests that the transition to agriculture in this region may have also been a major selective pressure in the past. The only study to our knowledge to look at the history of selective pressures in North America was done with the Tsimshian of British Columbia (10). This study found evidence of positive selection at the gene HLA-DQA1 in the Tsimshian population before European colonization and possible evidence of negative selection in that region afterward. These studies, conducted with populations thousands of miles apart from each other and in a variety of different ecological environments, demonstrate the complex history of human adaptation to the varied environments of the Americas.

Altogether, our analysis of genome-wide signals of selection in three Indigenous populations in North America found evidence for selection on genes related to cold and high-latitude environments in Alaska but selection on genes related to immune function in the southeastern United States and central Mexico. Additional studies may find evidence of other adaptations in different environments on these continents.