Our study sample is part of the CANDELA (Consortium for the Analysis of the Diversity and Evolution of Latin America; http://www.ucl.ac.uk/silva/candela) cohort collected in five Latin American countries (Brazil, Colombia, Chile, Mexico and Peru) for the study of the genetics of physical appearance7. This sample consists of individuals of both sexes (median age 22 years), with a mixed African, European and Native American genetic background (Supplementary Table 1). Using facial photographs, we performed a qualitative assessment (on a three-point-ordered categorical scale; Fig. 1; Supplementary Figs 1 and 2) of 10 pinna traits in 5,062 individuals: ear protrusion, lobe size, lobe attachment, tragus size, antitragus size, helix rolling, folding of antihelix, crus helix expression, superior crus of antihelix expression and Darwin’s tubercle.

Figure 1: Genome-wide associations of pinna traits. Variation in 10 pinna traits was evaluated in 5,062 individuals. The photographs at the top indicate the location of the traits examined. At the bottom is shown a composite Manhattan plot for the seven traits showing genome-wide significant association with at least one genome region. The rs numbers for the most significantly associated (index) SNP in each region are provided (Table 1). Each of the seven regions on the Manhattan plot is connected with the associated trait on the photos via a line of different colour (composite panels in this and subsequent figures were made using Photoshop49). Full size image

Variation in the pinna traits examined

The trait scores show a weak-to-moderate correlation between them, and with age, sex, body mass index (BMI) and genetic ancestry (Supplementary Table 2). Most noticeably, lobe attachment shows a moderate and significant (permutation P value <0.001) negative correlation with lobe size (r=−0.49). Significant but weaker positive correlations were observed for folding of antihelix with helix rolling (r=0.25) and with superior crus of antihelix expression (r=0.23), as well as between ear protrusion and helix rolling (r=0.16).

Individuals were genotyped on Illumina’s Omni Express BeadChip. After quality control, 671,038 single-nucleotide polymorphisms (SNPs) and 4,919 individuals were retained for further analyses. Autosomal admixture proportions for the full sample were estimated as: 53% European, 43% Native American and 4% African (Supplementary Fig. 3). On the basis of a kinship matrix derived from the genome-wide SNP data8, we estimated narrow-sense heritability using GCTA9 and found moderate and significant values for all traits, with the highest heritability observed for ear protrusion (61%) and the lowest for tragus size (25%) (Supplementary Table 3). Similar heritabilities have been calculated for a range of facial traits using family data10,11.

Primary association analyses

For the primary genome-wide association tests, we applied multivariate linear regression, as implemented in PLINK12, using an additive genetic model adjusting for: age, sex, height, BMI and the first five principal components (PCs) calculated from SNP data. The resulting statistics showed no evidence of residual population stratification for any of the traits (Supplementary Fig. 4). To account for the possibility of cryptic relatedness between individuals, we also performed genome-wide association tests using random-effects mixed linear regression models (using FastLMM13) and obtained similar results as in the PLINK analyses (Supplementary Table 4). Six of the traits examined showed genome-wide significant association (linear regression P values <5 × 10−8) with SNPs in at least one of the seven genomic regions (Fig. 1; Table 1). A global false discovery rate test across all traits and SNPs identified the same significantly associated regions (Supplementary Table 5). Lobe size is associated with SNPs in four regions (2q12.3, 2q31.1, 3q23 and 6q24.2); three traits show association with two regions: lobe attachment (2q12.3 and 2q31.1), helix rolling (2q12.3 and 4q31.3) and antihelix folding (1p12 and 18q21.2). The remaining two traits show association with a single region: ear protrusion (2q12.3) and antitragus size (1p12). Since the traits examined show some correlation between them, the associations detected are likely not fully independent. Most noticeably, the moderate negative correlation observed between lobe attachment and size is consistent with SNPs at the same two loci (2q12.3 and 2q31.1) showing significant association with both traits. Suggestive association with lobe attachment is also observed for 6q24.2, which is significantly associated with lobe size (Table 1).

Table 1 Chromosomal location and −log 10 (P) for index SNPs showing strongest genome-wide association to pinna traits. Full size table

Secondary analyses

Since correlations between traits could have a shared underlying genetic basis, we performed a multivariate analysis combining all phenotypes in a single regression model (using MULTIPHEN14). This analysis identified the same set of regions as in the independent regression tests, but no additional associated regions (Supplementary Fig. 5). We also examined the association signals for all index SNPs (Table 1) in each country separately and combined results as a meta-analysis using METAL15. For each association, the effects were in the same direction in all countries, the variability of effect size across countries reflecting sample size (Fig. 2; Supplementary Table 6; Supplementary Fig. 6a). There was no significant evidence of effect size heterogeneity across countries for any of the associations. A full genome-wide meta-analysis did not reveal additional regions showing significant association with pinna morphology (Supplementary Fig. 6b). The seven index SNPs of Table 1 provide modest phenotypic prediction accuracy (Supplementary Table 7). The fraction of the phenotypic variance explained by these SNPs is small relative to the heritability estimates (Supplementary Tables 3 and 7), suggesting a polygenic architecture for these traits beyond that captured by the genome-wide significance threshold used.

Figure 2: Meta-analysis of significant genome-wide associations. Effect sizes (in each country sample and in a combined meta-analysis) for the index SNPs and their associated traits (Table 1). Regression coefficients (x axis) estimated in each country are shown as blue boxes (box size indicating sample size). Red diamonds indicate effect sizes estimated in the meta-analysis. Horizontal bars indicate s.e. Results for all the SNPs and traits shown in Table 1 are provided in Supplementary Fig. 6A. The two alleles at each SNP are shown in brackets with effect size referring to the allele in the numerator. Full size image

Features of associated regions

The genomic regions showing genome-wide significant association have features with independent evidence suggestive of an involvement in pinna development. This evidence is particularly strong for the regions in 2q12.3 and 1p12, and we followed-up these two regions with additional experiments.

2q12.3

Includes SNPs associated with four traits (lobe size, lobe attachment, helix rolling and ear protrusion; Table 1). These SNPs extend over ∼500 kb and show substantial linkage disequilibrium (LD; Fig. 3b). Strongest association with all four traits was found for SNP rs3827760 (Table 1), and conditioning on this SNP abolishes the signal of association at other SNPs in the region (Supplementary Fig. 8a). Marker rs3827760 leads to a functional p.Val370Ala substitution in the intracellular death domain of EDAR. This residue affects the interaction with the EDAR-binding death domain adapter protein EDARADD16; the derived EDAR allele encodes a protein with higher activity than the ancestral one17,18. EDAR signalling acts during prenatal development to specify the location, size and shape of ectodermal appendages, such as hair follicles, teeth and glands19. The p.Val370Ala variant has been associated with characteristic tooth morphologies, hair type and sweat gland density in East Asians20,21,22,23,24,25, where this allele is present at high frequency while being nearly absent in European or African populations (Supplementary Table 8). Consistent with these effects in humans, mice expressing EDAR370A, or with increased EDAR function, show thickened and straightened hair fibres17,20,24,25.

Figure 3: LocusZoom and linkage disequilibrium plots of significantly associated genetic regions. Plots of the seven genomic regions showing genome-wide significant association to pinna traits (Table 1). For regions showing association with several pinna traits, we present here only results for the trait with strongest association (plots for the other associated traits are presented in Supplementary Fig. 7). Association results from a multivariate linear regression model (on a −log 10 P scale; left y axis) are shown for SNPs ∼500 kb on either side of the index SNP (that is, the SNP with the smallest P value, purple diamond; Table 1) with the marker (dot) colour indicating the strength of LD (r2) between the index SNP and that SNP in the 1000 Genomes AMR data set. Local recombination rate in the AMR data is shown as a continuous blue line (scale on the right y axis). Genes in each region, their intron–exon structure, direction of transcription and genomic coordinates (in Mb, using the NCBI human genome sequence, Build 37, as reference) are shown at the bottom. Plots were produced with LocusZoom56. Below each region we also show an LD heatmap (using D′, ranging from red indicating D′=1 to white indicating D′=0) produced using Haploview57. Note that the location of SNPs on the LD heatmap can be shifted relative to the regional display on top of it. Full size image

We examined Edar expression in the developing mouse embryo (Fig. 4a). The structure of the pinna is defined prenatally in both mouse and human, in mouse primarily between gestation days 13 and 16 (ref. 2). Using whole-mount in situ hybridization, we confirmed Edar expression along the distal margin of the developing pinna (Fig. 4a), in addition to the well-characterized expression in the developing hair follicles16. Edar expression at the distal margin of the embryonic pinna may aid in determining its growth and expansion, thus influencing the ultimate form adopted by the ear. To assess the functional significance of this Edar expression during pinna development, we examined postnatal pinna morphology in Edar mouse mutants. The EdardlJ (ref. 26) and EdarTg951 (ref. 16) mouse lines have a loss and a gain of Edar function, respectively (see Methods). At 2 weeks of age, the pinna of homozygous EdardlJ mice have a characteristic shape, including a marked dorsal/anterior folding (Fig. 4c). Quantitative assessment of mouse pinna protrusion (assessed by measuring protrusion angle) and shape (assessed by PC analysis of two-dimensional landmark coordinates) revealed significant differences (linear regression P value <0.0007; Supplementary Note 1; Supplementary Figs 9 and 10) between the homozygous EdardlJ/dlJ mutant and heterozygous EdardlJ/+ littermates, wild-type mice and EdarTg951 mice. Landmark coordinate PC1, capturing 69% of the variation in shape, reflects mainly a change in the extent of helix rolling of the mouse pinna (Fig. 4e; Supplementary Fig. 10d), consistent with one of the effects we observed of EDAR variants on human pinna shape. The Edar high copy-number transgenic (EdarTg951) does not display a detectable change in helix rolling likely due to the fact that the wild-type mouse pinna does not have a prominent helix roll, thus hampering the detection of any further pinna flattening that might be caused by increased Edar function.

Figure 4: Effect of Edar genetic variation on mouse pinna shape. (a) Whole-mount in situ hybridization detecting Edar expression in the developing mouse embryo at 15 days’ gestation. (b–e) Impact of Edar genotype on mutant mouse ear shape. (b,c) Photographs of wild-type (b) and homozygous EdardlJ (c) mutant mice from top and side views (respectively, on the upper and lower panels). (d,e) Boxplots, respectively, of ear protrusion angle and of landmark coordinate PC1 (y axis) for different mouse genotypes (x axis) (Supplementary Note 1; Supplementary Fig. 10 shows additional analyses for ear protrusion). Boxplot whiskers extend to data points within 1.5 times the interquartile range on both sides. In d,e numbers in parenthesis below genotypic categories refer to the number of mice examined for each. On the right of e are shown average PC1 wireframes for EdardlJ homozygous mice (bottom) or for mice with other genotypes (top). Full size image

1p12

SNPs in this region show genome-wide association with antihelix folding and antitragus size. This region extends over ∼800 kb and overlaps the gene encoding transcription factor TBX15 (Fig. 3a), a key regulator of cartilaginous and skeletal development in the mouse27. A spontaneous Tbx15 mouse mutant (called ‘droopy ears’), is characterized by altered positioning, projection and shape of the pinnae28,29. In humans, mutations of TBX15 result in Cousin syndrome, a disorder characterized by craniofacial dysmorphism, including a dysplastic pinna30. Strongest association in this region was observed for intergenic SNP rs17023457 (P value 2 × 10−8 for antitragus size and 1 × 10−11 for antihelix folding Table 1) and conditioning on this SNP abolishes the signal of association at other markers (Supplementary Fig. 8b). Interestingly, rs17023457 is located in a highly conserved binding site for transcription factor CART1 (cartilage paired-class homeoprotein 1) (Supplementary Fig. 11a), mutations of which have been shown to result in a range of craniofacial and cartilage abnormalities in the mouse31. The location of rs17023457 in a CART1-binding site strongly suggests that this SNP could directly influence the expression of neighbouring genes involved in cartilaginous development, such as TBX15.

To assess the potential for rs17023457 to alter DNA–protein interactions involving the CART1-binding site, electrophoretic mobility shift assays were performed using nuclear extracts from a CART1-expressing cell line (Huh7). Double-stranded oligonucleotides containing the rs17023457 T allele demonstrated binding of a nuclear protein, this binding being undetectable for oligonucleotides carrying the derived C allele (Supplementary Fig. 11b). Binding was eliminated upon prior incubation with excess unlabelled CART1 consensus sequence DNA, confirming the specificity of the assay and supporting the ability of the sequence containing the T allele to bind CART1 in vitro. Reporter constructs containing either the rs17023457 T or C alleles driving the expression of the luciferase gene, showed a decreased expression of 36 and 22% in constructs with the C allele when the genomic sequence was positioned in the forward and reverse orientation, respectively (Supplementary Fig. 11b). These data indicate that CART1, or other nuclear DNA-binding proteins with the same sequence specificity, is able to bind to an enhancer that includes rs17023457, with variation at this SNP determining whether binding and full transactivation occurs or not. The implications of these observations for the in vivo regulation of TBX15 expression remain to be established.

Other regions

For the other regions showing genome-wide association, there is currently less compelling information suggestive of a mechanism explaining their effect on pinna morphology, and we did not attempt their experimental follow-up. SNPs in 3q23 are associated with lobe size, with strongest association being seen for intergenic SNP rs10212419 (P value 3 × 10−14), in a region with substantial LD over ∼500 kb (Fig. 3d). Intriguingly, considering that the ear lobe is made up mainly of loose connective tissue, intergenic SNPs in this region have been strongly associated with keloid formation32, an exaggerated skin wound-healing reaction characterized by excessive deposition of extracellular matrix and collagen fibres. Some highly penetrant mutations across this genomic region have also been found to result in alterations of craniofacial development involving pinna morphology. The gene nearest to SNP rs10212419 (∼59 kb away) encodes mitochondrial ribosomal protein S22 (MRPS22), mutations of which can lead to combined oxidative phosphorylation deficiency 5 (COXPD5), a disorder whose phenotype can include low-set, posteriorly rotated ears (OMIM #611719). A report of a patient with a similar ear phenotype found a non-synonymous substitution at an evolutionary conserved site within MRPS22 (ref. 33). Another interesting candidate gene in the vicinity is that encoding the forkhead box L2 transcription factor (FOXL2). Coding and regulatory mutations of this gene cause BPES (blepharophimosis, ptosis, epicanthus inversus syndrome)34,35, a disorder characterized by a range of craniofacial abnormalities, including alterations in ear lobe morphology36.

The 2q31.1 region shows considerable LD over ∼100 kb and is associated with lobe attachment and size, with strongest association seen for intergenic SNP rs2080401 (P values of 9 × 10−12 and 1 × 10−10 for these two phenotypes, respectively; Fig. 3c). This SNP is located ∼31 kb upstream of the gene encoding Specificity Protein 5 (SP5), a Sp1-related transcription factor that mediates responses to Wnt/beta-catenin signalling. Of potential interest, there is evidence of an involvement of this pathway in musculo-skeletal development37,38. The region in 6q24.2, associated with lobe size, shows substantial LD over ∼500 kb overlapping the hypothetical protein-coding gene LOC153910. Strongest association was seen for intronic rs263156 (P value 2 × 10−13; Fig. 3f). Potentially of relevance to ear development, about 100 kb from rs263156 is GPR126 (G protein-coupled receptor 126), a gene strongly associated with human height39,40,41. Borderline genome-wide significance was found for rs1960918 in 4q31.3 (P value 2 × 10−8 for helix rolling; Fig. 3e) and rs1619249 in 18q21.2 (P value 1 × 10−8 for antihelix folding; Fig. 3g). Intronic marker rs1960918 is in an LD region of ∼400 kb overlapping the LRBA gene, whose product (LPS-responsive vesicle trafficking, beach and anchor containing) is known to be involved in coupling signal transduction, vesicle trafficking and immunodeficiency, with no obvious functional connection to pinna development. Intergenic SNP rs1619249 is in an LD region of about 300 kb, the closest candidate being LOC100287225 (about 100 kb from rs1619249), a hypothetical gene of unknown function.