Although considerable progress has been made in understanding the genetic basis of morphologic traits (for example, body size and coat color) in dogs and wolves, the genetic basis of their behavioral divergence is poorly understood. An integrative approach using both behavioral and genetic data is required to understand the molecular underpinnings of the various behavioral characteristics associated with domestication. We analyze a 5-Mb genomic region on chromosome 6 previously found to be under positive selection in domestic dog breeds. Deletion of this region in humans is linked to Williams-Beuren syndrome (WBS), a multisystem congenital disorder characterized by hypersocial behavior. We associate quantitative data on behavioral phenotypes symptomatic of WBS in humans with structural changes in the WBS locus in dogs. We find that hypersociability, a central feature of WBS, is also a core element of domestication that distinguishes dogs from wolves. We provide evidence that structural variants in GTF2I and GTF2IRD1, genes previously implicated in the behavioral phenotype of patients with WBS and contained within the WBS locus, contribute to extreme sociability in dogs. This finding suggests that there are commonalities in the genetic architecture of WBS and canine tameness and that directional selection may have targeted a unique set of linked behavioral genes of large phenotypic effect, allowing for rapid behavioral divergence of dogs and wolves, facilitating coexistence with humans.

We focus on a candidate chromosomal region implicated in canine sociability, a trait arguably more central to the domestication process than increased social cognition, and the adjacent orthologous region that has been mapped to human WBS. We demonstrate that domestic dogs exhibit some of the key behavioral traits quantified in individuals with WBS, most notably hypersociability in the absence of superior social cognition. We integrate targeted resequencing data from the candidate canine WBS region with behavioral measures of sociability and cognition to disentangle the genetic underpinnings of this multifaceted behavioral trait. We find strong evidence that structural variation (SV) in our target region, which is orthologous to the region of the human genome affected by SV in WBS, also contributes to hypersociability in domestic dogs.

Here, we focus on sociability as a trait that is informative about the divergence of dogs from wolves during domestication. This canine behavioral gestalt was previously implicated in phenotype evolution in the dog genome through a genome-wide association scan of more than 48,000 single-nucleotide polymorphism (SNP) genotypes from 701 dogs from 85 breeds and 92 gray wolves with a Holarctic distribution (19). When sites were ranked by divergence, the top outlier site was located within SLC24A4, a gene known to contain polymorphisms linked to eye and hair color variation in humans (19). The second-ranking site was located within WBSCR17, a gene implicated in Williams-Beuren syndrome (WBS) in humans. WBS is a neurodevelopmental disorder caused by a 1.5- to 1.8-Mb hemizygous deletion on human chromosome 7q11.23 spanning approximately 28 genes (20). This syndrome is characterized by delayed development, cognitive impairment, behavioral abnormalities, and hypersociability (21–23). A number of other studies have taken a different approach and targeted genes linked to social behavior in other taxa. For example, variation was surveyed in the dopamine receptor D4 and tyrosine hydroxylase genes, both extensively studied for their roles in the primate brain’s reward system (24). That study found an association of longer repeat polymorphisms with lowered activity and impulsivity in a limited survey of breeds. In a similar approach, variation at a regulatory SNP in the oxytocin receptor gene, also known to influence human pair bonding, was found to be associated with proximity seeking and friendliness in two dog breeds (25). However, behavioral genetic studies still struggle to resolve the genetic architecture of nearly every facet of a complex behavior. Our study seeks to overcome this obstacle in the canine model.

Because of strict selective breeding rules, distinct dog breeds conform to a predictable phenotype. This population structure and isolation make the dog a powerful model for exploring the genetic underpinnings of complex traits such as behavior (9). Many dog breeds have been collectively scored using standardized tests for behavioral personality traits central to their domesticated nature (for example, playfulness, sociability, aggression, trainability, curiosity, or boldness) and breed-specific function (for example, herding, pointing, chasing, working) (9–17). Although there has been strong selection for breed conformation, interindividual variation suggests that genetics plays a detectable role in shaping canine social behavior (18).

Although decades of research have focused on the unique relationship between humans and domestic dogs, the role of genetics in shaping canine behavioral evolution remains poorly understood. Existing hypotheses on the behavioral divergence between dogs and wolves posit that dogs are more adept at social problem solving (1) because of an evolved human-like social cognition (2, 3). However, mounting evidence suggests that human-socialized wolves can match or exceed the performance of domestic dogs across these sociocognitive domains (4). What remains empirically robust is that, compared with wolves, dogs display exaggerated gregariousness into adulthood. This trait, referred to as hypersociability, is a heightened propensity to initiate social contact that is often extended to members of another species. Hypersociability, one facet of the domestication syndrome (5), is a multifaceted phenotype that includes extended proximity seeking and gaze (6, 7), heightened oxytocin levels (6), and inhibition of independent problem-solving behavior in the presence of humans (8). This behavior is likely driven by behavioral neoteny, the extension of juvenile behaviors into adulthood, which increases the ability of dogs to form primary attachments to social companions (4).

RESULTS

Solvable tasks and sociability measures

We evaluated the human-directed sociability of 18 domestic dogs and 10 captive human-socialized gray wolves using standard sociability (7, 26) and problem-solving tasks (2, 8, 27) commonly used to assess human-directed sociability in canines. Three sociability metrics were constructed to assess behaviors indicative of WBS (22): attentional bias to social stimuli (ABS), hypersociability (HYP), and social interest in strangers (SIS) (tables S1 and S2). Solvable task performance was used to assess attentional bias toward social stimuli and independent problem-solving performance (independent physical cognition). Subjects were given up to 2 min to open a solvable puzzle box (8) that contained half of a 2.5-cm-thick piece of summer sausage, both when alone and with a neutral human present. The trial was considered complete after meeting one of the following conditions: The puzzle box lid was completely removed, the food was obtained, or 2 min had elapsed. All trials were video-recorded and coded for whether the puzzle box was solved and the time to solve it. To compare attention toward the puzzle box versus social stimuli under the human-present condition, we recorded the percentage of time spent looking at the puzzle box, touching the puzzle box, and looking at the human (8). We also had an independent researcher, who was blind to the purpose of this study, code 30% of the videos and found that interrater reliability was very strong (weighted Cohen’s kappa, κ = 0.98; 95% confidence interval, 0.97 to 0.99). Consistent with our hypothesis, domestic dogs spent a significantly greater proportion of trial time gazing at the human than wolves did when a human was present during the solvable task (median gaze toward human: dog, 21%; wolf, 0%; two-tailed Mann-Whitney, n_dog = 18, n_wolf = 10, U = 6, P < 0.0001). 
Dogs also spent a significantly smaller proportion of trial time looking at the puzzle box (median gaze toward box: dog, 10%; wolf, 100%; two-tailed Mann-Whitney, n_dog = 18, n_wolf = 10, U = 171.5, P = 0.0001) and a significantly smaller proportion of trial time trying to solve the puzzle (median: dog, 6%; wolf, 98%; two-tailed Mann-Whitney, n_dog = 18, n_wolf = 10, U = 175, P < 0.0001) compared to wolves, a finding that has been equated with social inhibition of problem-solving behavior in both the canine and human WBS literature (19, 22). Significantly more wolves than dogs successfully solved the task under both the human-present and alone conditions (human present: 2 of 18 dogs and 8 of 10 wolves were successful; two-tailed Fisher’s exact test, P = 0.0005; alone: 2 of 18 dogs and 9 of 10 wolves were successful; two-tailed Fisher’s exact test, P = 0.0001). Overall, concordant with WBS, dogs displayed greater ABS than wolves did, corresponding to a reduction in independent problem-solving success (fig. S1).

The sociability test measured human-directed proximity-seeking behavior and was assessed by comparing total sociability scores across all sociability conditions. The test comprised a passive phase and an active phase. Each phase occurred twice, once with an unfamiliar human and once with a familiar human, totaling four phases run over eight consecutive minutes. In all phases, the experimenter sat on a familiar chair (dogs) or bucket (wolves) inside a marked circle of 1-m circumference denoting proximity. During the passive phase, the experimenter sat quietly on the chair or bucket and ignored the subject by looking down toward the floor. If the animal sought physical contact, then the experimenter touched the subject twice but did not speak or make eye contact with the animal. During the active phase, the experimenter called the animal by name and actively encouraged contact while remaining in their designated location. 
Consistent with our hypothesis, dogs spent more time in proximity to humans than did wolves (median percent of time spent within 1 m of humans: dogs, 65%; wolves, 35%; two-tailed Mann-Whitney, n_dog = 18, n_wolf = 9, U = 30, P < 0.005). Dog and wolf sociability toward an unfamiliar human was used to assess SIS. Consistent with our hypothesis, dogs spent more time within 1 m of a stranger when compared to wolves (median: dogs, 53%; wolves, 28%); however, this difference was not statistically significant (two-tailed Mann-Whitney, n_dog = 18, n_wolf = 9, U = 76, P = 0.51). In summary, dogs were hypersocial compared to wolves, although there was no significant difference in their SIS (fig. S1).

We reduced the dimensionality of six behavioral traits (table S3) to three components that are orthogonal and uncorrelated with each other, whereas ABS, HYP, and SIS are correlated. Principal component 1 (PC1), PC2, and PC3 accounted for 50, 22, and 14% of total behavioral variation, respectively. We calculated both the Kaiser-Meyer-Olkin (KMO) measure (KMO = 0.62; values of >0.6 are recommended as informative) and Bartlett’s test, which was significant [χ²(15) = 60.42, P = 2.13 × 10⁻⁷]. Analysis of the loadings of the constituent behaviors (table S3 and fig. S1) indicated that PC1 represents an autonomous or independent phenotype, as this component is negatively correlated with all behaviors associated with human-directed sociability with the exception of “proximity unfamiliar passive.” PC1 also had positive loadings from “time look object,” a measure indicating a lack of ABS (fig. S2). Loadings of each behavior were roughly equal, with the exception of proximity unfamiliar passive, which had a loading approximately one-third the average magnitude of the others. 
Loadings of PC2 were heavily biased toward, and positively associated with, the measures of proximity to an unfamiliar person (average loading of 0.64, as compared to an average loading of −0.14 for the other loadings), suggesting that PC2 reflects boldness. The biological meaning of PC3 is more difficult to interpret, but given that it is strongly and positively loaded by the behavior “time look human” (loading of 0.63 compared to an average loading for all other factors of −0.15), it predominantly reflects reliance on humans in the solvable task test. As expected, given the interpretation of PC1 as a socially inhibited phenotype, dogs had lower PC1 values than wolves (Mann-Whitney U test, U = 3, P < 0.00005; median: dogs, −1.18; wolves, 2.31). Dogs and wolves did not have significantly different values for PC2 (Mann-Whitney U test, U = 54, P = 0.57; median: dogs, −0.18; wolves, −0.19) or for PC3 (Mann-Whitney U test, U = 48, P = 0.35; median: dogs, −0.069; wolves, 0.011).
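
The two-tailed Fisher’s exact tests on task success can be reproduced from the counts given above. The following is a minimal sketch in plain Python, assuming the common two-sided convention of summing the probabilities of all tables that are no more likely than the observed one; the function name and the table layout are our own choices, not part of the original analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that k of the col1 "solved" outcomes fall in row 1.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # The small relative tolerance guards against floating-point ties.
    return sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_obs * (1 + 1e-9)
    )


# Rows: dogs, wolves. Columns: solved, not solved (counts from the text).
p_human_present = fisher_exact_two_sided(2, 16, 8, 2)
p_alone = fisher_exact_two_sided(2, 16, 9, 1)
print(f"human present: P = {p_human_present:.4f}")
print(f"alone:         P = {p_alone:.5f}")
```

Under this convention the human-present comparison rounds to the reported P = 0.0005, and the alone comparison comes out slightly below the reported P = 0.0001, a difference that may simply reflect rounding.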

De novo annotation of structural variants

In a subset of animals with quantitative behavioral data (n_dog = 16, n_wolf = 8), we collected paired-end 2 × 67-nucleotide sequence data from 5 Mb spanning the candidate canine WBS locus on canine chromosome 6 [2,031,491 to 7,215,670 base pairs (bp)], which contains 46 annotated genes, 27 of which are in the human WBS locus (tables S4 and S5; see Materials and Methods). The target region had an average of 15.5-fold sequence coverage (dogs, 15.2; wolves, 16.0) (table S5). We obtained genotypes for 26,296 SNPs, which we further filtered to retain 4844 SNPs with nonmissing polymorphic data (average density of 1 SNP for every 14.4 kb). To confirm this region as containing species-specific variation, we first determined whether this region displays signals of positive selection in the dog genome, in an effort to independently validate the original finding (19). We calculated the composite bivariate percentile score and confirmed that the candidate gene, WBS chromosome region 17 (WBSCR17), is under positive selection as a domestication candidate and was significantly depleted of heterozygosity in dogs (mean H_O: dog, 0.01; wolf, 0.37; one-tailed t test with unequal variance, P = 7.4 × 10⁻³⁸) (fig. S3 and table S6). Because this candidate region shows SV linked to WBS in humans (20) and is known to vary widely in its functional consequences [for example, neurodevelopmental diseases (28) and autism spectrum disorders (29)], we completed in silico SV annotation in the dog and wolf genomes using three programs, SVMerge (30), SoftSearch (31), and inGAP-sv (32), which together use all available SV detection algorithms: read pair (RP), split read (SR), read depth (RD), and assembly-based (AS). We annotated 38 deletions, 30 insertions, 13 duplications, 6 transpositions, a single inversion, and 1 complex variant relative to the reference dog genome (tables S7 and S8). 
There was considerable private variation, with 31 annotated SVs found only in dogs, 26 found only in wolves, and a level of heterogeneity observed in wolves that is comparable to that found in human WBS (mean n: wolf, 21; dog, 15; two-tailed t test, P = 0.026) (table S9) (33).

Candidate region association test

Linear mixed models were used to determine the association of SVs with human-directed sociability. Three univariate models were tested for their association with each of the three behavioral indices (ABS, HYP, and SIS) (Fig. 1). In addition, we tested for the association of SVs with the three behavioral indices collectively, referred to as the behavioral index model, and separately with a model that included the first three PCs (PC model) describing human-directed sociable behavior (Fig. 2). Four genic SVs were significantly associated with human-directed social behavior (adjusted P < 2.38 × 10⁻³): one SV within GTF2I (Cfa6.66), one SV within GTF2IRD1 (Cfa6.72), and two within WBSCR17 associated with ABS (Cfa6.3 and Cfa6.7) (Table 1). In addition, two intergenic SVs were significantly associated with ABS (Cfa6.69, P = 1.56 × 10⁻⁴; Cfa6.27, P = 3.31 × 10⁻⁴), and Cfa6.27 was also associated with the PCs (P = 1.24 × 10⁻⁴). However, we focused our analyses on genic SVs to infer any potential functional impact. Cfa6.66 was associated with multiple sociability metrics (ABS and SIS) and had the two strongest association signals (P = 1.38 × 10⁻⁴ and P = 1.95 × 10⁻⁴, respectively) (Table 1). GTF2I and GTF2IRD1 are members of the transcription factor II-I (TFII-I) family, a set of paralogous genes that have been repeatedly linked to the expression of HYP in mice (34, 35), and are specifically implicated in the hypersociable phenotype of persons with WBS (36, 37).

Fig. 1 Association of structural variants with indices of human-directed social behavior. Association with ABS (A), HYP (B), and SIS (C). Manhattan plots show statistical significance of each variant as a function of position in the target region. Blue line denotes statistical significance at the Bonferroni-corrected level (P = 2.38 × 10⁻³). Genic and intergenic variants are shown as green and red boxes, respectively.

Fig. 2 Association of structural variants with human-directed social behavior in multivariate regressions. Association in behavioral index model (A) and PC model (B). Manhattan plots show statistical significance of each variant as a function of position in the target region. Blue line denotes significance at the Bonferroni-corrected level (P = 2.38 × 10⁻³); dashed purple line denotes suggestive significance (P = 0.01). Genic and intergenic variants are shown as green and red boxes, respectively.

Table 1 Genic loci associated with indices of human-directed social behavior across dogs and wolves. NA, not applicable.

To disentangle the association of SVs with behavior from an association with species membership, we incorporated species as a covariate (table S10). These analyses were consistent with our initial findings for Cfa6.66, Cfa6.3, and Cfa6.7. Locus Cfa6.66 remained significantly associated with multiple sociability metrics (ABS, P = 2.33 × 10⁻⁴; SIS, P = 1.67 × 10⁻³) and showed the strongest association of any genic SV. Cfa6.3 and Cfa6.7 both retained their associations with ABS (P = 1.06 × 10⁻³ and P = 9.56 × 10⁻⁴, respectively), as did the intergenic SVs Cfa6.69 (P = 1.36 × 10⁻⁴) and Cfa6.27 (P = 5.56 × 10⁻⁴). Furthermore, the ABS effect size (β) remained stable for the association models with and without species membership as a covariate (ABS β without covariates: Cfa6.3 = 0.11, Cfa6.7 = 0.12, Cfa6.27 = −0.15, Cfa6.66 = 0.23, Cfa6.69 = −0.15; ABS β with covariates: Cfa6.3 = 0.081, Cfa6.7 = 0.10, Cfa6.27 = −0.13, Cfa6.66 = 0.23, Cfa6.69 = −0.14), indicating that the observed effects on sociability are not an artifact of species differences. An association test of each locus with species membership further supports this interpretation, as none of the behavior-associated SVs was significantly associated with species membership alone (table S11).
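
As a worked illustration of the Bonferroni-corrected level used above, the sketch below checks a few of the reported P values against the threshold. The text does not state the number of tests or the nominal α; the values α = 0.05 and 21 tests are assumptions chosen only because their ratio reproduces the reported level of 2.38 × 10⁻³.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


ALPHA = 0.05   # assumption: conventional nominal level, not stated in the text
N_TESTS = 21   # assumption: inferred from the reported threshold, not stated

threshold = bonferroni_threshold(ALPHA, N_TESTS)

# Uncorrected P values reported in the text for models without the species covariate.
reported_p = {
    ("Cfa6.66", "ABS"): 1.38e-4,
    ("Cfa6.66", "SIS"): 1.95e-4,
    ("Cfa6.69", "ABS"): 1.56e-4,
    ("Cfa6.27", "ABS"): 3.31e-4,
}

significant = {key: p < threshold for key, p in reported_p.items()}

print(f"threshold = {threshold:.2e}")
for (locus, index), passed in significant.items():
    print(locus, index, "significant" if passed else "not significant")
```

Each of these reported values falls below the threshold, which is consistent with the loci being described as significant in the text.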

Functional impact of annotated structural variants

We next determined whether these behavior-associated SVs were predicted to have a functional impact. We used Ensembl’s Variant Effect Predictor (VEP) v84 (38) with Ensembl transcripts for the CanFam 3.1 reference genome to assign putative functional consequences to all insertions, deletions, and duplications in the filtered set of SVs. Because VEP is unable to assign consequences for transpositions, inversions, and complex SVs, we manually inspected seven sites (six TRA, one INV, and one D_I) in the UCSC (University of California, Santa Cruz) genome browser with Ensembl gene models (39). We found three transcript ablations, seven loss-of-start codons, and five transcript amplifications (table S12). All SVs significantly associated with human-directed social behavior were “feature truncations,” except for Cfa6.3, which was a “feature elongation” that is likely due to a lost stop codon or the elongation of an internal sequence feature relative to the reference. Annotation of Cfa6.3, Cfa6.7, Cfa6.66, and Cfa6.72 as modifiers of gene function suggests a direct association between these variants and human-directed social behavior, as quantified by our behavioral measures, mediated by possible interference with WBSCR17, GTF2I, and GTF2IRD1.

PCR validation and analysis of structural variants

The in silico SV detection algorithms applied to the targeted resequencing data can identify the presence or absence of an SV but cannot predict the underlying genotype of an individual for a given SV. To corroborate the in silico findings and investigate the possibility of other genetic models, we used polymerase chain reaction (PCR) amplification and agarose gel electrophoresis to determine the codominant genotypes at the top four loci (Cfa6.6, Cfa6.7, Cfa6.66, and Cfa6.83) (fig. S4). These four SVs overlapped with short interspersed nuclear transposable elements (TEs) with high sequence identity to the reference (182 to 259 bp; 91 to 96% pairwise identity over 193 bp). We further surveyed insertional variation in 298 canids consisting of coyotes, gray wolves (representing populations from Europe, Asia, and North America), American Kennel Club (AKC)–registered breeds, and semidomestic dog populations (see Materials and Methods). We repeated the analysis with the codominant SV genotypes to determine whether there was an association with species membership. Coyotes were excluded from this analysis, and semidomestic dogs were grouped with domestic dogs. All outlier SVs, now with codominant genotypes, were significantly associated with species membership [Cfa6.6: χ² = 23.91; P = 1.01 × 10⁻⁶; odds ratio (OR), 0.33; Cfa6.7: χ² = 57.63; P = 3.16 × 10⁻¹⁴; OR, 13.83; Cfa6.66: χ² = 35.12; P = 3.1 × 10⁻⁹; OR, 0.25; Cfa6.83: χ² = 17.11; P = 3.53 × 10⁻⁵; OR, NA], confirming this region’s original identification (19). 
Similar results were obtained if we only included “modern” breeds, as per the original method that located this region (Cfa6.6: χ² = 11.9; P = 0.0006; OR, 0.45; Cfa6.7: χ² = 40.87; P = 1.63 × 10⁻¹⁰; OR, 10.35; Cfa6.66: χ² = 41.97; P = 9.25 × 10⁻¹¹; OR, 0.20; Cfa6.83: χ² = 20.41; P = 6.24 × 10⁻⁶; OR, NA) (19), with site-specific patterns (frequency of TE insertion in modern dogs and wolves, respectively: Cfa6.6, 0.52 and 0.32; Cfa6.7, 0.39 and 0.06; Cfa6.66, 0.10 and 0.37; Cfa6.83, 0.17 and 0.00). We calculated the frequency of insertions per locus by population or species membership. The TEs segregated at low frequencies in coyotes and were variable across wolf populations and dog breeds (fig. S5). Only one coyote carried a single insertion of the TE at locus Cfa6.6, with both Cfa6.6 and Cfa6.7 highly polymorphic across domestic dogs (fig. S5, B and C). Locus Cfa6.66 is found in wolves from China, Europe, and the Middle East and in the WBS study wolves, but only within six dog breeds (boxer, basenji, cairn terrier, golden retriever, Jack Russell terrier, and Saluki), the WBS dogs, two New Guinea singing dogs (NGSDs), and a single pariah dog (fig. S5D). Cfa6.83 appears to be a de novo insertion within domestic dogs because it is lacking entirely within the wild canids (fig. S5E), with a low to moderate frequency within the semidomestic dog populations surveyed (pariah dog, n = 1; village dogs: Africa, n = 1; Puerto Rico, n = 5). 
Genetic analysis of only the WBS dogs and wolves, coupled with behavioral data, revealed trends per locus as follows: More insertions at Cfa6.6 were correlated with increased ABS and HYP (r = 0.50 and 0.42, respectively), with a weaker relationship for SIS (r = 0.11); more insertions at Cfa6.7 were correlated with increased ABS and HYP, with an inverse relationship with SIS (r = 0.13, 0.11, and −0.17, respectively); fewer insertions at Cfa6.66 were correlated with higher trait values (r = −0.59, −0.56, and −0.27 for ABS, HYP, and SIS, respectively); more insertions at Cfa6.83 were correlated with increases in all behavioral trait values (r = 0.36, 0.44, and 0.40 for ABS, HYP, and SIS, respectively).

We conducted one-way analysis of variance (ANOVA) using the population or species designation as a predictor of the total number of insertions across the four outlier loci. The total number of insertions depends significantly on the population [F(23, 274) = 19.54, P < 2 × 10⁻¹⁶], with 103 of 276 pairwise population mean comparisons contributing to the ANOVA significance (dog/dog, 46; wolf/dog, 28; coyote/dog, 11; semidomestic/dog, 8; semidomestic/coyote, 3; semidomestic/wolf, 3; wolf/coyote, 2; wolf/wolf, 2; Tukey’s post hoc test, P < 0.05) (fig. S6).

Because the gel-based genotyping method reveals a codominant genotype, unlike the in silico presence or absence status, we conducted an association scan for each of the four outlier SV loci with the binary phenotype for each AKC breed (40), village dogs, and pariah dogs as “seeks attention” or “avoids attention” using two logistic regression models in R, an additive and a dominant model, with sex as a covariate. The use of breed-based stereotypes is supported by the strict genetic isolation and selective breeding efforts that maintain breeds. Hence, many traits strongly determined by genetic variation (including behavioral traits) can be predicted with high accuracy. 
The central foundation and advantage of domestication and breed formation are that selection for many traits, including behavior, has been very strong; thus, the number of underlying genes is apt to be small. As proof of principle, Jones et al. (9) successfully mapped a variety of breed-associated traits in a genome-wide association study using dog “stereotypes.” They scored breeds for pointing, herding, boldness, and trainability and identified one locus associated with pointing, three with herding, one with trainability, and, most importantly, five with boldness. These loci contain likely candidate genes, many of which are linked to schizophrenia, dopamine receptors, or proteins of synaptic junctions. Vaysse et al. (16) also used breed stereotypes to map behaviors, such as boldness, sociability, curiosity, playfulness, chase-proneness, and aggressiveness. They mapped boldness to an intron of HMGA2 and sociability, defined as the “dog’s attitude toward unknown people,” to a gene on the X chromosome after excluding male dogs from the analysis to accurately compare autosomal and sex-chromosome patterns of genetic variation. We found significant support for an association between three of the four loci and the binary behavioral trait of seeking or avoiding attention (additive model: Cfa6.6, OR, 0.303; P = 2.79 × 10⁻¹⁰; Cfa6.7, OR, 0.398; P = 4.66 × 10⁻⁷; Cfa6.83, OR, 2.95; P = 2.83 × 10⁻⁴; dominant model: Cfa6.6, OR, 0.184; P = 8.22 × 10⁻⁷; Cfa6.7, OR, 0.287; P = 4.31 × 10⁻⁵; Cfa6.83, OR, 5.04; P = 6.50 × 10⁻⁴; sex was not a significant predictor in any of these models). SV Cfa6.66 was not significant (additive model: OR, 0.852; P = 0.496; dominant model: OR, 0.573; P = 0.124). Further, our logistic regression found that TE copy number could significantly predict the binary breed stereotype behavior of attention seeking or avoidance (OR, 0.676 per insertion; P = 1.13 × 10⁻⁵, with no evidence of a sex effect).
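
To make the per-insertion odds ratio concrete: in a logistic regression, the log-odds are linear in the predictor, so each additional TE insertion multiplies the odds of the outcome coded as 1 by the same factor. The short sketch below illustrates this with the reported OR of 0.676. The baseline odds are hypothetical, and the text does not say which of the two labels (“seeks attention” or “avoids attention”) was coded as 1, so the direction of the effect is left open here.

```python
OR_PER_INSERTION = 0.676  # reported in the text for TE copy number


def odds_multiplier(or_per_unit, n_units):
    """Multiplicative change in odds after n_units one-unit increases."""
    return or_per_unit ** n_units


def odds_to_probability(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)


# Hypothetical baseline: even odds (probability 0.5) at zero insertions.
BASELINE_ODDS = 1.0

for n_insertions in range(5):
    odds = BASELINE_ODDS * odds_multiplier(OR_PER_INSERTION, n_insertions)
    print(n_insertions, round(odds, 3), round(odds_to_probability(odds), 3))
```

With these illustrative inputs, four insertions scale the baseline odds by a factor of about 0.21. The actual predicted probabilities depend on the fitted intercept and covariates, which are not given in the text.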