Significance Our study provides insight into the genetic factors underpinning complex behaviors via comparative study of wild primates differing in social system. This research is among the first investigations of social behavior through population genomic scans for adaptive divergence in wild primate groups using an unparalleled sample set that spans decades. Our main conclusion, that a dopamine pathway underlies the social behavioral differences seen, offers comparative insight into the evolution of human behavioral and psychiatric phenotypes. We suggest that variation in impulsivity or boldness has played a major role in the evolution of socially complex species such as primates.

Abstract In the endeavor to associate genetic variation with complex traits, closely related taxa are particularly fruitful for understanding the neurophysiological and genetic underpinnings of species-specific attributes. Similarity to humans has motivated research into nonhuman primate models, yet few studies of wild primates have investigated immediate causal factors of evolutionarily diverged social behaviors. Neurotransmitter differences have been invoked to explain the distinct behavioral suites of two baboon species in Awash, Ethiopia, which differ markedly in social behavior despite evolutionary propinquity. With this natural experiment, we test the hypothesis that genomic regions associated with monoamine neurotransmitters would be highly differentiated, and we identify a dopamine pathway as an outlier, highlighting the system as a potential cause of species-specific social behaviors. Dopamine levels and resultant variation in impulsivity were likely under differential selection in the species due to social system structure differences, with either brash or circumspect social behavior advantageous to secure mating opportunities depending on the social backdrop. Such comparative studies into the causes of the behavioral agendas that create and interact with social systems are of particular interest, and differences in temperament related to boldness and associated with dopamine variation likely played important roles in the evolution of all social, behaviorally complex animals, including baboons and humans.

The search for neurophysiological links between genetic variation and complex behavioral traits has motivated studies of nonhuman primate models that parallel variation in human social behavior (1). For primatology, the neurophysiological and genetic processes underlying divergence in social behaviors are of particular interest, and species-specific behavioral traits in closely related taxa can be especially informative. There are, however, few investigations of such factors in the wild under natural conditions. One such study concerns populations of anubis and hamadryas baboons (Papio anubis and Papio hamadryas) and their natural hybrids in Awash National Park, Ethiopia (8.86°N, 40.06°E).

Although closely related (having diverged less than 1 Ma) and interfertile (2, 3), these two species show distinct social structures that largely result from consistent behavioral traits in adult males (4⇓⇓–7). Male anubis baboons almost invariably disperse from their natal group and join another before breeding, whereas most male hamadryas baboons are philopatric. Male anubis baboons exhibit strong interest in females only when the latter are adult and periovulatory, whereas male hamadryas baboons recruit females of any reproductive state into spatially coherent, jealously defended “one-male units,” the nucleus of which is often formed when a hamadryas male captures and “mothers” a preadolescent female. In contrast, anubis males’ access to fertile females is determined by agonistic competition and forming alliances with other males (8, 9), as well as, in the longer term, the ability to form “friendships” with females (9).

The two baboon species’ divergent behaviors can be viewed as resulting from differences in reward structures for impulsive or restrained social behavior, leading us to neurophysiological lines of inquiry. Our previous work has shown species differences in concentrations of monoamine neurotransmitter metabolites in the cerebrospinal fluid of adult males (10). Notably, the ratio between metabolites of dopamine [homovanillic acid (HVA)] and serotonin [5-hydroxyindoleacetic acid (5-HIAA)] is significantly higher in male hamadryas baboons, peaking in early adulthood, and largely attributable to HVA levels. Among naturally occurring male hybrids, the distribution of this neurophysiological trait (11) suggests a causal link between CNS dopamine activity and species-specific behavioral agendas, and the existence of an underlying species difference at the genetic level.

The present study tests this hypothesis by genotyping thousands of polymorphic markers in hundreds of individuals. The resultant dataset provides the resolution to identify regions of the genome that exhibit exceptionally strong interspecific differentiation, and to relate them to functionally defined physiological pathways. Such regions are considered likely candidates for involvement in adaptive divergence. Our previous work leads us to predict that genomic regions showing high differentiation between these two species would include loci involved in the function of monoamine neurotransmitters, especially dopamine. We examine exomes of a subsample of individuals to investigate the functional impact of highly differentiated regions.

Results and Discussion Using samples from 212 wild baboons collected between 1984 and 1998, we genotyped thousands of SNPs via double-digest restriction site associated DNA sequencing (ddRADseq) and computed F S T , a measure of differentiation, at each SNP between the two species. After smoothing point values around genes using the annotation information from the closely related rhesus macaque (Macaca mulatta), we evaluated over 160 functional pathways for an excess of high F S T values. We repeated the analysis with various values of the smoothing parameter, σ, and with two different rhesus annotation sets to ensure our results were robust to these inputs. Of these pathways, only two were identified as outliers significantly enriched for regions of high differentiation between the two baboon species for all values of the smoothing parameter (Table 1). One of these pathways, the dopamine receptor-mediated signaling pathway, is most obviously related to behavior, and is considered here. Table 1. Pathways enriched for differentiation Although the dopamine pathway as a whole was highly differentiated, most of its individual components did not show extreme F S T values, suggesting concerted evolution of multiple components of the pathway may be more important (Table 2). Our analysis clearly identifies the dopamine receptor-mediated signaling pathway as an outlier for high differentiation between the two baboon species, significant despite our use of a reduced representation dataset, the use of annotation information from a closely related species, and the fact that many components are not involved in the divergent selection. High F S T values were nonetheless observed for three genes across all values of smoothing parameter σ: SLC6A3, COMT, and PPP1CC. These outliers potentially play an outsized role in the divergent behaviors seen between the species, and are discussed below. Table 2. F S T and P F S T of dopamine receptor-mediated signaling pathway components Of the three dopamine receptor-mediated signaling pathway loci identified here as highly divergent between the baboon species, SLC6A3 (solute carrier family 6, member 3; or DAT1) is the most widely investigated. In humans, a variable tandem repeat in the 3′-UTR is associated with increased expression (and presumably lower synaptic dopamine) and reduced reward-related activity (12). In crab-eating macaques (Macaca fascicularis), comparable variants in the 5′-UTR of SLC6A3 have been associated with behaviors related to social rank and dominance (13). COMT encodes catechol-O-methyltransferase, an enzyme active in cortical dopamine metabolism, converting dopamine into 3-methoxytyramine, which, in turn, is metabolized by monoamine oxidase into HVA. In humans, a missense variant causing lower COMT activity (14) is associated with aggression and violence against self and others (15⇓–17). COMT is also a candidate gene for psychiatric phenotypes such as schizophrenia (18). Because the behavioral differences in the baboons are age- and sex- as well as species-specific, it may be relevant that there exists evidence of sexual dimorphism and age effects in associations between COMT and psychiatric phenotypes (19). Finally, PPP1CC [protein phosphatase 1, catalytic subunit, gamma isoform (PP1γ)] also displayed high differentiation. PPP1CC is involved in synaptic plasticity and has also been implicated in the striatal dopamine depletion seen in Parkinson’s disease (20). Although a component of the dopamine receptor-mediated signaling pathway, PPP1CC is also of interest in view of the extraneuronal function of the PPP1CC2 isoform (e.g., PP1γ2) in male reproductive physiology (21). In the baboons, contrasting mating patterns of the species result in very different levels of male-male sperm competition, which is reflected in a marked difference in adult testicular size and development (22). To explore the differentiated dopamine pathway in more depth, we sequenced the protein-coding portions of the genomes of several baboons. Analyses of exomes from eight anubis, hamadryas, and hybrid baboons identified loci that exhibited variants differing from the reference allele of the P. anubis genome (papAnu2; Baylor College of Medicine Human Genome Sequencing Center). Variants that have high functional impact (HI) or cause loss of function (LOF) are considered likely to be functionally and possibly adaptively significant. Of the 164 PANTHER pathways evaluated, 11 and seven included significantly more than expected HI or LOF variants, respectively. The dopamine receptor-mediated signaling pathway ranked second highest among these pathways (Tables S1 and S2). Table S1. PANTHER pathways significantly overrepresented for genes with HI variants Table S2. PANTHER pathways significantly overrepresented for genes with LOF variants The designation of these LOF or HI variants in the exome dataset is predicated on accurate annotation, for which we used the preliminary annotation information from the draft baboon genome. Although we attempted to minimize the chance of paralogy, checking for duplication of the dopamine pathway genes by confirming a single ortholog existed in the human genome and excluding genes with more than two alleles, uncertainty in annotations is unavoidable. Regardless, these results suggest that the dopamine pathway may be enriched in variation likely to have moderate to strong effects on the behavioral phenotype, and several loci within the dopamine receptor-mediated signaling pathway are flagged as focal points of functionally significant species divergence. These results support our hypothesis that the divergence between these baboon species in adult male behavior is associated with inherited, functionally significant, dopamine-related differences within the CNS. With the finding of species-associated divergence in dopamine-related genes, modification of dopamine function is identified as the probable cause, and not merely a correlate, of the behavioral difference. Given its established connection to social behavior variation and the known difference in dopamine metabolites between anubis and hamadryas baboons, it is especially exciting that the dopamine receptor-mediated signaling pathway was an outlier for high differentiation enrichment, and that several components showed excesses of HI variants, suggesting functional importance. This high differentiation may indicate adaptive divergence, with dopamine levels and resultant variation in impulsivity under differential selection in the species due to social structure differences, with either impulsivity or restraint advantageous depending on the social backdrop. In male hamadryas baboons, for instance, living among agnatic kin due to philopatry, impulsivity may be adaptive for young males, allowing them an early start to secure opportunistic mating (10). This analysis has identified likely genetic sites for the adaptation of a male temperament that takes advantage of this social context. Involvement of the dopamine system in the species-specific behavioral divergence is likely, because it involves species-specific, goal-directed behavior. Adult male hamadryas baboons, for example, evidently find close association with immature and reproductively inactive females more rewarding than do adult male anubis baboons. Overall, differences in temperament related to boldness and associated with dopamine variation likely played important roles in the evolution of all social, behaviorally complex, wide-ranging, and omnivorous animals, including baboons and humans. Our study begins to elucidate the genetic underpinnings of such precise, species-specific reward structures. Considerable gains in understanding complex traits such as behavior have come from previous gene expression studies on closely related species. However, such transcriptomic studies are difficult or impossible in wild primates due to sampling logistics and ethical issues. Genome-wide scans for high differentiation, such as in the present study, are a powerful alternative. Spurious identification of outlier annotations is a risk, with biases toward loci with etiological connections to human disorders as well as uncertainty in the annotations. Because baboon-specific annotation information was still in a draft state, we used two rhesus macaque annotation sets for the differentiation-based analyses, finding the dopamine pathway to be an outlier with both. For the exome-based analyses, we used the draft baboon annotations for added precision, and found impactful variation overrepresented in the dopamine pathway. However, some variants may be erroneously designated as HI or LOF due to errors in the draft annotation. (All animals were heterozygous for the LOF variant in PPP1CC, for example, possibly attributable to an error in annotation or unappreciated duplication.) Such issues will be ameliorated as improvements to model and nonmodel organism genomes continue (23). A major goal of comparative studies is the determination of the genetic underpinnings of species-specific behaviors. Neurophysiological differences are potential proximate causes of the inborn behavioral agendas that give rise to species-specific social systems, such as those different systems found between hamadryas and anubis baboons. Recent research has only begun to investigate how genetic factors, acting via neurophysiological pathways, determine individuals’ motivational agendas. This study is among the first studies in which genome-wide data from substantial numbers of individuals from closely related species have been used to investigate the genetics of divergent, highly species-specific behaviors. The identification of the dopamine receptor-mediated signaling pathway as a major focus of evolutionary divergence between these two species brings us a step closer to understanding the determinants of social behaviors. More practically, our findings indicate consideration should be given to animals’ ancestry in biomedical studies using baboons as models for human pathologies involving dopamine, such as Parkinson’s disease and schizophrenia.

Materials and Methods Samples. Genetic material was derived from an extensive collection of blood samples from wild anubis and hamadryas baboons trapped in Awash National Park, Ethiopia (8.86°N, 40.06°E) between 1984 and 1998. The animals were captured in baited cage traps (24), tranquilized, sampled, and released. Trapping was approved by the Institutional Animal Care and Use Committee (IACUC) at Washington University School of Medicine. Blood samples were drawn from the femoral vein into heparinized, evacuated tubes and fractionated by centrifugation. Plasma, red blood cells, and leukocyte-rich fractions were stored immediately in liquid nitrogen, and subsequently at −80 °C. The genomic DNA used here had previously been extracted from the stored leukocyte samples with the DNeasy Blood and Tissue Kit (Qiagen) according to manufacturer’s instructions or by phenol-chloroform extraction. Sequencing and Variant Identification. The panel of animals used in ddRADseq analysis included 232 individuals, of which 212 remained after filtering for inadequate data, as described below. To discover and genotype orthologous genome-wide variants, we created multiplexed libraries of up to 64 individuals using ddRADseq (25) and 10 units each of restriction enzymes SphI and MluCI. Each library was then sequenced on the Illumina MiSeq or on a lane of the Illumina HiSeq2500 in 150 or 100 cycles of paired end sequencing, respectively. A subset of the HiSeq libraries was also sequenced in 100 cycles of single end sequencing. We combined these reads with the first paired end read before mapping. We demultiplexed the reads using sabre (https://github.com/najoshi/sabre), allowing one mismatch, and trimmed the adaptor sequences with fastq-mcf (-l 15 -q 15 -w 4 -u) (26). We then mapped the reads to the baboon draft genome (papAnu2; Baylor College of Medicine Human Genome Sequencing Center) using BWA (Burrows–Wheeler Aligner) with default parameters (27). After realignment around indels using the Genome Analysis Toolkit (GATK) (28), variants were called for all individuals simultaneously using GATK’s UnifiedGenotyper (-stand_call_conf 50.0 -stand_emit_conf 10.0) and filtered with GATK’s VariantFiltration. Filtering criteria were as follows: QualByDepth (QD) < 2.0, mapping quality (MQ) < 40.0, FisherStrand (FS) > 60.0, HaplotypeScore > 13.0, MQRankSum < −12.5, and ReadPosRankSum < −8.0. Variants that passed the filtration were included in downstream analyses. We filtered the variants to exclude from further analysis those variants in repetitive regions and those variants with substantial missingness. Variants in repetitive regions were identified by RepeatMasker (29) (sensitive setting, RepBase library: RELEASE 20110920) and Tandem Repeats Finder (30) (filtered to keep repeats with period ≤12). After filtering, 229,394 variants remained, with a genotyping frequency of 40.70%. We then removed variants missing in ≥10% of individuals and filtered out 20 individuals with ≥ 90% missing data using PLINK (31) (–geno 0.1 –mind 0.9). The remaining dataset included 212 individuals and 45,643 variants, with a genotyping rate of 94.679% per individual. After removing variants with excessive missing data, we removed those variants in strong linkage disequilibrium. For each pair of variants in a sliding window, r 2 was calculated, and one SNP was removed at random for pairs with r 2 > 0.5 . The pruned dataset contained 32,410 variants. All such filtering was done with PLINK (31). Inference of Ancestry. To avoid complications arising from natural hybridization between the two species, we limited our analysis to animals lacking indication of mixed ancestry. The unsupervised clustering algorithm of ADMIXTURE was used to estimate individuals’ ancestry via maximum likelihood (32) using K = 2 . Individuals were considered pure if the inferred ancestry estimate of ADMIXTURE was greater than 0.999. By this standard, there were 114 unhybridized anubis baboons and 23 unhybridized hamadryas baboons. Estimation of Differentiation. We calculated pairwise F S T between the two species for each variant, using the equation of Weir and Cockerham (33). To smooth the estimates around genes, and thereby get an estimate for each gene’s local F S T from the nearby RAD-derived variants, we weighted the moving average using a Gaussian radial basis function kernel modified after Hohenlohe et al. (34). For the window centered at position c, estimates at distance p contributed to the average with weight exp ( − ( p − c ) 2 2 σ 2 ) , with σ = { 50,100,150 } kb. Weighted averages were computed on the region spanning c ± 3 σ . A smoothing step allowed our estimation of F S T for genes using the variants in randomly distributed RAD tags which allowed us to associate highly differentiated sites with functional annotation data. Because the baboon draft genome annotations were still in draft form, use of gene locations required translating coordinates to the phylogenetically closest annotated published genome, the genome of the rhesus macaque (Indian origin M. mulatta, rheMac2, Mmul_051212, January 2006) (35). We repeated the analysis with the annotation information from a separate macaque assembly (Chinese origin, rheMac3, BGI CR_1.0, October 2010) and confirm that our results are robust to choice of rhesus annotation set. Genes in all orthologous regions were obtained from the rhesus refGene database. We used LiftOver (36, 37) to convert coordinates between the baboon and macaque assemblies. We used a permutation test to assess the significance of F S T for regions surrounding genes, generating for each gene a value P F S T . Each variant in the region centered on the gene was sampled with replacement from genome-wide estimates of the population genetic statistic in question. The weighted average was computed as above on the random sample and compared with the actual estimates’ weighted average. Sampling was repeated until a P value, P F S T , could be accurately estimated as the proportion of weighted averages of sampled estimates that are more extreme than the actual value. Estimates for F S T began with 10,000 replicates for computational efficiency and were allowed to increase to 10 million if required for accuracy to compute the P value. The genes’ smoothed F S T values, and their P values, P F S T , were then used to explore patterns of interspecies differentiation across biological pathways as defined by PANTHER (38), identifying those patterns with extreme distributions. The enrichment test uses the Mann–Whitney U to test the null hypothesis that F S T or P F S T values for genes associated with a given pathway were randomly drawn from the overall distribution of P values for all genes examined (38). Analyses were completed using PANTHER’s database (38) and a custom R script. The reference distribution included all genes found in all successfully converted orthologous regions. For genes that fell into multiple regions, we used the average of all regions’ P values. Multiple test correction (e.g., Bonferroni) was not used in this exploratory analysis. Exome-Based Analyses. To characterize the coding regions of genes identified as outliers for interspecies differentiation further, we captured and sequenced exomes from a subset of eight individuals: two unhybridized anubis baboons, two unhybridized hamadryas baboons, and four hybrid baboons. DNA was enriched for the exome using the Roche SeqCap human kit. The exomes were multiplexed and sequenced on the Illumina HiSeq 2500 in “Rapid Run” mode. After demultiplexing and trimming, exome reads were mapped to the baboon reference genome and single nucleotide variants were identified and filtered as previously described for the ddRADseq dataset. Variants (i.e., alleles differing in sequence from those alleles of the reference genome) were annotated and analyzed using the annotation information from the baboon reference genome and SnpEff, which returns information on variants’ predicted impact (39). Possible paralogous variants that lacked 1:1 orthology with the human genome or were nonbiallelic were excluded as follows. We used LiftOver (36, 37) to translate the baboon exome variants to human genome (hg19/GRCh37) coordinates. Reversing the LiftOver translation (from human back to baboon) allowed us to identify and exclude variants that did not have 1:1 baboon/human orthology. To exclude potential paralogs more conservatively, we further excluded all variants from genes showing evidence of paralogy (more than two alleles) in any region other than upstream, intergenic, or downstream. From these annotations, we compiled lists of LOF genes and genes with HI variants (according to SnpEff). These gene lists were subjected to a statistical overrepresentation test with PANTHER to identify pathways overrepresented relative to a reference list via a binomial test (38). Macaque gene identifications were used as a reference list, because baboon gene sets were unavailable in PANTHER. In the dopamine receptor-mediated signaling pathway, variants predicted to have moderate impact (i.e., missense mutations) were further analyzed with Snap2 to predict animo acid substitution effect magnitude (40) (Table S3). Table S3. Components of the dopamine receptor-mediated signaling pathway with animo acid substitutions predicted by Snap2 to have an effect

Components of the dopamine receptor-mediated signaling pathway with variants with predicted impacts of at least moderate severity include the following: ADCY7, CLIC6, COMT, DDC, DRD3, EPB41L2, GNAI1, GNB1, KCNK3, NAAA, PPP1CC, PRKAR2A, PRKAR2B, SLC18A2, SNAP23, SNAP29, and TH.

Acknowledgments We thank the reviewers and the editor; J. A. Hodgson, A. S. Burrell, and J. P. Higham for helpful discussion; and the Ethiopian Wildlife and Conservation Organization and the Biology Department, Addis Ababa University for their assistance and collaboration. We further thank New York University (NYU) High Performance Computing (HPC) and the Langone Genomics Core. This work was supported by the National Science Foundation (Graduate Research Fellowship, Grant BCS-1260816, and Grant BCS-9615150), Wenner–Gren Foundation, Harry Frank Guggenheim Foundation, Center for Field Research/Earthwatch, and Leakey Foundation. The NYU Langone Genomics Core is supported by the NIH National Cancer Institute (Grant P30CA016087).