Sample selection and sequencing

Samples were selected among controls of a longevity study52 (Albert Einstein College of Medicine; n=74) and a Parkinson’s study53,54 (Columbia University Medical Center; n=54). The average age was 69 years. Some medically relevant phenotypes are given in Supplementary Table 1. Genotype data were used to validate Ashkenazi ancestry and the absence of cryptic relatedness. Informed consent was obtained in accordance with institutional policies and the study was approved by the corresponding institutional review boards. Sequencing was carried out by Complete Genomics, to average coverage >50 × , in three batches (Supplementary Note 1).

QC and processing pipeline

Raw sequencing summary statistics are reported per sample and per batch in Supplementary Data 2. Copy number variants and mobile element insertions were also reported; however, the false-positive rate was high (see below and Supplementary Note 2). All samples were previously genotyped on SNP arrays; concordance was measured using CGA tools and averaged 99.67% over all samples. The discordance was correlated with the array missingness, but not with sequencing metrics; extrapolating to the limit of no array missingness, the discordance approached 0.047% (Supplementary Note 2).

Genotypes calls across individuals were merged using CGA tools and converted to VCF or Plink55 formats. Some of the analyses were carried out on 57 genomes sequenced in the first batch. Otherwise, we used the entire cohort (n=128). The merged genotypes were filtered by removing low quality and half-called variants, multiallelic and multinucleotide variants, variants not called as non-reference in any genome, variants with a no-call rate >10% (6% for the first batch), variants not in Hardy–Weinberg equilibrium (P<10−6), and variants outside the autosomes. For some analyses, we excluded a single genome containing an exceptional amount (≈200 MB) of runs-of-homozygosity. We validated that monomorphic non-reference variants that we observed were monomorphic (or high frequency) in Complete Genomics’ and 1000 Genomes’ public sequencing data sets (Supplementary Note 2).

To validate the Ashkenazi ancestry of our samples, we merged the AJ data set with Middle Eastern and European individuals from HGDP56 and with the Jewish HapMap project5. After pruning SNPs in LD (leaving ≈48K SNPs), we ran smartPCA57. The PCA plot (Supplementary Fig. 1) demonstrates the absence of either outliers or any batch effect (Supplementary Note 2). We also verified the absence of cryptic relatedness (maximum pairwise (Plink) was ≈5.5%).

We estimated the false-positive rate using runs-of-homozygosity (inside which almost all heterozygous sites are due to errors), which we detected using Plink, after removing low frequency variants and LD pruning. We used high- and low-confidence sets of runs-of-homozygosity to obtain a lower and an upper bound, respectively, for the false-positive rate. After trimming each segment, we estimated the false-positive rate using the number of heterozygote sites along the segment (all variants or SNVs only, and in the original genotype calls or in the cleaned data set). There were overall ≈300–600 MB found in autozygous segments, harbouring a few thousands of heterozygous sites. Cleaning reduced the SNV false-positive rate by ≈3–4 fold to an extrapolated ≈6–8K per genome. The false-positive rate for non-SNVs was ≈6 times that of SNVs. We obtained an independent estimate of the error rate using a pair of duplicate genomes, reaching qualitatively similar conclusions (Supplementary Note 2).

The FL samples were mixed controls and cases from VIB in Ghent, Belgium. They were sequenced to coverage ≈70 × by Complete Genomics, albeit using an earlier pipeline compared with the AJ genomes. PCA validated the FL ancestry (Supplementary Note 2; Supplementary Fig. 1). The FL genomes (n=26) were merged and cleaned using a pipeline similar to that of the AJ genomes. We merged the cleaned FL genotypes with the cleaned genotypes of the 57 AJ genomes sequenced in the first batch. We removed any variants that appeared in the cleaned genotypes in one population, but were removed during QC in the other population (Supplementary Note 2), to avoid spurious population-specific variants. We phased the merged data set using SHAPEIT58, with parameters as recommended by the authors, and with the 1000 Genomes reference panel. We used the molecular phasing information (that is, linked heterozygotes calls) to estimate the switch error rate at ≈0.95% (≈0.3% for non-singletons). The merged and phased AJ–FL data set was used for most population comparisons.

Annotations

dbSNP annotations were from the UCSC Genome Browser59. Functional annotation for Fig. 1b was generated using ANNOVAR60. In Fig. 1b, the reported counts are means and s.d. over all AJ individuals. For each individual, we randomly selected a set of n=26 or n=127 other AJ individuals to serve as the reference panel.

Rate of variant discovery

The empirical rate of discovery of segregating sites in Fig. 1c is the average over 50 random orderings of the individuals in each cohort. The theoretical number of segregating sites for the Wright–Fisher model used an estimate of θ based on the average heterozygosity and standard coalescent theory61. For variable size populations, we used equations from62 (Supplementary Note 3). The demographic model we used (for each population separately) is a bottleneck followed by an exponential expansion. The parameters were inferred by fitting the allele frequency spectrum using ∂a∂i26 (see below and Supplementary Note 6). The higher predicted number of FL sites was significant (P<0.01) with respect to parametric bootstrapping of the demographic models (Supplementary Note 3). A picture similar to Fig. 1c was seen when computing the rate of discovery of non-reference variants. There, projection to larger samples was on the basis of the first three entries of the allele frequency spectrum and the method of33 (Supplementary Note 3; Supplementary Fig. 3).

The joint allele frequency spectrum

Initial inspection of the joint spectrum revealed a few thousands of highly differentiated variants (for example, AJ-specific variants of frequency >50%). We suspected that those variants were due to reference genome mapping discrepancy (hg18/hg19), which we confirmed using Complete Genomics’ public genomes resource (Supplementary Note 3). We therefore removed from further analysis ≈4,000 population-specific variants with frequency >25%. To facilitate population-genetic comparisons, we downsampled the joint spectrum to 50 AJ and 50 FL haploid genomes analytically using hypergeometric expectations. We folded and marginalized the spectrum using standard definitions (Supplementary Note 3; minor alleles were defined with respect to the combined sample; Fig. 3b). The Wright–Fisher expected spectrum (Fig. 3a) was computed using standard coalescent theory61. The panmictic spectrum of Fig. 3b was computed analytically assuming that the appearances of each variant are randomly distributed between AJ and FL (Supplementary Note 3). F ST was computed using ∂a∂i26.

IBD segment detection

To detect IBD segments, we first assigned genetic map distances using HapMap2 (ref. 63). We then ran Germline20 using a minimal length cutoff of either 3 cM or 5 cM, and in the ‘genotype extension’ mode12, which allows segments to extend as long as double homozygous sites are matching. We followed by filtering segments with particularly short physical length, overlap with sequence gaps or where all matching sites had the major allele. We further filtered segments by computing a score related to the probability of a segment to be truly shared-by-descent, on the basis of the allele frequencies of sites along the segment (Supplementary Note 4). Scores were higher for within-AJ segments than for within-FL or AJ–FL segments (Supplementary Fig. 4). In addition, most non-AJ sharing was concentrated in a handful of peaks (Supplementary Note 4), suggesting that many of the non-AJ detected segments were false positives.

Coverage of the genome by IBD segments

To create Fig. 2b, we considered sharing within-AJ (using all 128 individuals) and within-Europeans (FL or CEU from the 1000 Genomes Project) separately. For each hypothetical reference panel size n, we created a subset of size n of the full panel. For each individual in the subset, we computed the fraction of the genome (in physical distance) shared between that individual and the rest of the subset (which implies sharing of at least one of the haplotypes, but not necessarily both). We then averaged over all individuals in the subset and over 50 random subsets. The coverage curve was fitted to the expectation from a simple model of a bottleneck lasting a single generation, with the population size being extremely large otherwise (Supplementary Note 4).

Demographic inference using IBD segments

We used the method developed in ref. 14. For each segment length bin, we summed the total length (in cM) of segments having length in the bin and divided by the total genome size and by the total number of (haplotype) pairs. The resulting curve (Fig. 3c) was fitted (by a grid search, minimizing the sum of squared (log-) errors) to a bottleneck and expansion model, with theoretical curves computed as in ref. 14. The constant population size estimator was computed as in ref. 21. The fitting error around the optimal parameters (Supplementary Fig. 11) showed deep minima around the optimal bottleneck time and population size, but less confidence in the values of the ancestral population size and the growth rate. Confidence intervals were obtained using jackknifing (Supplementary Table 6; Supplementary Note 4). Parametric bootstrap gave qualitatively similar results.

Imputation accuracy using the AJ panel

We split the 57 AJ genomes of the first batch (here phased using a variation of SHAPEIT that employs molecular phasing information (Supplementary Note 2)) into a reference panel (n=50) and a study panel (n=7). We reduced the study panel sequences to SNPs typically genotyped on an Illumina Human Omni1-Quad array, and supplemented them with 1000 SNP arrays of AJ controls from a Schizophrenia study11,48, to emulate a typical imputation scenario. After standard QC procedures (Supplementary Note 5), we phased the entire study panel (n=1007) using SHAPEIT. We then imputed the study panel, on the basis of the AJ reference panel, using IMPUTE2 (ref. 64). We also imputed using the CEU reference panel from 1000 Genomes (n=87, larger than the AJ panel). We carried out all analyses on chr1 only (Supplementary Note 5).

Imputation accuracy was measured by uncovering the full sequences of the AJ study genomes (Supplementary Table 4). Sites not imputed by the CEU panel were set as homozygous reference, and sites imputed by the CEU panel that were not found in the AJ sequences were (conservatively) discarded (Supplementary Note 5). Monomorphic non-reference sites in the AJ panel were also discarded. The squared correlation coefficient, r2, was computed between the aggregate of all true genotypes (over all sites and study individuals) and all imputed dosages. Due to our small study panel, we computed the minor allele frequency (plotted in Fig. 2c and Supplementary Fig. 6) in the AJ reference panel (n=50). We excluded variants with frequency zero from these plots (leaving finally ≈200K variants per individual), since they are necessarily wrongly imputed. They were not removed from the overall accuracy reports (Supplementary Table 4).

Demographic inference using the allele frequency spectrum

We inferred the parameters of demographic models using ∂a∂i26. For all models, we used a mutation rate of 1.44 × 10−8 per bp per generation30 (based on the time of the human settlement in the Americas) and set the genome length to 2.685 × 109 (autosomal hg19, excluding sequence gaps) times 0.81, which is an estimate of the fraction of variants remaining after cleaning (Supplementary Note 6). We estimated the scaled mutation rate, θ, by matching the number of segregating sites. The generation time we used was 25 years. We inferred single-population models using the individual AJ and FL spectra as well as two-population models using the joint spectrum (downsampled to 50 × 50 haploid genomes). In each case, the spectrum was fitted, using ∂a∂i, with parameters as recommended by the authors (Supplementary Note 6). For each model, we experimented with different parameter regions until identifying a plausible parameter set. We then initiated the parameters to randomly perturbed values around that set. We repeated optimization with 100 different initial conditions and reported the most likely parameters. We verified that these parameters were not close to the optimization boundaries and not sensitive to the initial perturbation.

Parametric bootstrap was carried out by simulating (using MaCS65, a coalescent simulator) artificial genomes under the demographic model of the most likely parameter set. For each of 100 data sets, the allele frequency spectrum was computed and folded, and ∂a∂i was used to infer the demographic parameters, exactly as for the real data. The biased-corrected 95% confidence intervals were computed assuming a normal distribution of the inferred parameters (Supplementary Note 6). Note that the confidence intervals account only for sampling noise but not for systematic errors such as sequencing errors or model and mutation rate misspecification.

For the single-population case (Supplementary Note 6, Supplementary Fig. 7 and Supplementary Table 5), we found that a model of a bottleneck followed by exponential growth explains well the spectra of both populations (Supplementary Fig. 8). Our main two-population model is shown in Fig. 4. The parameters of the recent AJ bottleneck were fixed to the values inferred from the IBD analysis (Supplementary Table 6). We verified that the log-likelihood of the optimal model decreased sharply near the values of two key parameters: the fraction of European admixture into AJ and the time of the European–Middle Eastern divergence. Admixture into AJ was shown to be necessary for a reasonable fit (Supplementary Note 6). Most parameters were robust to model specification, specifically, the time of the out-of-Africa bottleneck, the fraction of European admixture into AJ, and to some extent, the European–Middle Eastern divergence time. The time of the European admixture, however, differed considerably between models (Supplementary Note 6). The most promising model refinement included an additional wave of migration from the ancestral Middle Eastern population into Europeans at about ≈17 Kyr; experiments with further refinements did not converge to a consistent parameter set (Supplementary Note 6).

The deleterious mutation load

We annotated coding variants in the merged and size-matched AJ–FL data set (n=26 × 2) using the SeattleSeq Variant Annotation server. We measured the (non-reference) variant load either as unique or total counts, and either for all or low frequency only variants (Supplementary Note 7). We further broke the counts by whether the variants were non-coding, coding synonymous or coding non-synonymous, and by PolyPhen’s66 predicted effect (damaging or benign). To account for the genome wide larger number of variants in AJ, we normalized all counts by the ratio between the number of neutral AJ and FL variants. Significance of AJ–FL differences in any category was evaluated by assuming that all counts were binomial (Supplementary Table 8; Supplementary Note 7). To compare the number of non-synonymous variants per individual (Supplementary Fig. 15), we normalized each count by the number of intergenic variants. The (genome wide) average GERP score over all non-reference variants in each individual67 was slightly higher (more conserved) in AJ than in FL (Supplementary Note 7).

We also attempted to determine whether there was any disease category with particularly high mutational burden in AJ. We computed the total number (over all individuals in each population) of non-synonymous (non-reference) variants in all genes belonging to each disease category, using the annotation developed in ref. 47 and then by Omicia (assigning 2488 genes into 17 categories; Supplementary Table 9). We then ranked all genes according to the difference between the number of AJ and FL non-synonymous variants, and used GSEA68 to determine whether any given category had an exceptional number of top ranked genes. Only the aging category reached P<0.05, but with false discovery rate >0.05 (Supplementary Note 7).

A catalogue of variants in known disease genes

Our list of AJ disease genes is based on a table from ref. 2. We determined the hg19 coordinates of all disease mutations in that table manually using a number of online resources (Supplementary Note 7). The final list of 73 mutations in 48 genes is reported in Supplementary Data 4, along with some properties of each mutation. We then extracted all variants (including non-SNVs) in these genes from our unfiltered AJ genotypes (n=128). We detected carriers of 35 known disease mutations in 29 genes and annotated 953 newly discovered variants (using ANNOVAR60; also reported in Supplementary Data 4, along with summary statistics per gene; Supplementary Note 7).