The mutational burden of aging

As people age, they accumulate somatic mutations in healthy cells. About 25% of cells in normal, sun-exposed skin harbor cancer driver mutations. What about tissues not exposed to powerful mutagens like ultraviolet light? Martincorena et al. performed targeted gene sequencing of normal esophageal epithelium from nine human donors of varying age (see the Perspective by Chanock). The mutation rate was lower in esophagus than in skin, but there was a strong positive selection of clones carrying mutations in 14 cancer-associated genes. By middle age, more than half of the esophageal epithelium was colonized by mutant clones. Interestingly, mutations in the cancer driver gene NOTCH1 were more common in normal esophageal epithelium than in esophageal cancer.

Science, this issue p. 911; see also p. 893

Abstract

The extent to which cells in normal tissues accumulate mutations throughout life is poorly understood. Some mutant cells expand into clones that can be detected by genome sequencing. We mapped mutant clones in normal esophageal epithelium from nine donors (age range, 20 to 75 years). Somatic mutations accumulated with age and were caused mainly by intrinsic mutational processes. We found strong positive selection of clones carrying mutations in 14 cancer genes, with tens to hundreds of clones per square centimeter. In middle-aged and elderly donors, clones with cancer-associated mutations covered much of the epithelium, with NOTCH1 and TP53 mutations affecting 12 to 80% and 2 to 37% of cells, respectively. Unexpectedly, the prevalence of NOTCH1 mutations in normal esophagus was several times higher than in esophageal cancers. These findings have implications for our understanding of cancer and aging.

Somatic mutations occur in healthy cells throughout life (1–3). Most of these mutations do not alter cell behavior and accumulate passively (4). Occasionally, however, a key gene is altered in a way that provides mutant cells with a competitive advantage, leading to the formation of persistent mutant clones. Such clones are thought to be the origin of cancer and have also been linked to other diseases (5, 6). Despite the importance of somatic mutation, understanding its extent in normal tissues has been challenging because of the difficulties of identifying mutations present in small numbers of cells.

The most highly mutated normal tissue reported to date is Sun-exposed human skin. Deep targeted sequencing of Sun-exposed skin from four middle-aged individuals revealed large numbers of mutant clones under positive selection, with around a quarter of skin cells carrying cancer-driving mutations (7). As most mutations were caused by ultraviolet (UV) light, it is unclear whether aged Sun-exposed skin represents a special case due to a lifetime of exposure to a powerful mutagen. This question motivated us to investigate the mutational landscape of esophageal epithelium, a tissue with a similar structure but very different exposure to mutagens. Like the skin, esophageal epithelium consists of layers of keratinocytes. Cells are shed from the surface throughout life and are replaced by proliferation. In addition, both the skin and the upper and mid-esophagus develop squamous cell cancers.

We performed ultradeep targeted sequencing of 844 small samples of normal esophageal epithelium from nine deceased organ transplant donors, ranging in age from 20 to 75 years (table S1). None of the donors had a known history of esophageal or other chronic disease, and none were taking prescription medication for gastroesophageal reflux. Four of the nine donors had a history of cigarette smoking. Upper and mid-esophageal epithelium was separated from the underlying stroma and cut into a contiguous grid of 2-mm² samples, allowing us to map clones that spanned multiple samples (methods S1). Each sample was examined under a dissecting microscope, and no lesions were seen. Histology and whole-mount confocal imaging of adjacent tissue were also normal (figs. S1 and S2). Deep targeted sequencing of 74 cancer genes was performed on each sample to a median on-target coverage after duplicate removal of 870× (methods S2). Twenty-one samples that were found to be dominated by large clones from the targeted sequencing data were also whole-genome sequenced to a median coverage of 37×. This captures the state of the genome of the cell whose progeny subsequently colonized the sample.

Detection of mutations in normal esophagus

To detect mutations present in only a small fraction of each sample from deep targeted sequencing data, we used the ShearwaterML algorithm (7, 8). This algorithm uses the observed error rates per site from a large collection of normal samples to build a site-specific error model for every type of change in every targeted site (methods S3 and fig. S3). In this dataset, we identified 8919 somatic coding mutations across 844 samples from all donors (total area, ~17 cm²). Of these, 6935 were considered to be independent events after mutations shared by nearby samples were merged into single clones (methods S3.3 and table S2). Most sites in the genome display error rates below 1 × 10⁻⁴ errors per base, which enables accurate identification of mutations at low allele frequencies (methods S3.2 and figs. S3 and S4). The median allele frequency of the mutations detected by ShearwaterML was 1.6%, with a third of all mutations below 1% (fig. S3).

As the fraction of sequencing reads that carry a mutation is a function of the fraction of mutant cells within a sample and of the local copy number, we can integrate allele frequencies and sample areas to estimate the sizes of detectable mutant clones in normal esophagus, which ranged from 0.01 mm² to more than 8 mm² (methods S5). The number of mutations identified per sample and their allele frequencies varied markedly across individuals, with both the number of detectable mutations and the sizes of mutant clones roughly increasing with donor age (Fig. 1, A and B).

To better understand the passive rate of accumulation of mutations in healthy esophagus, we can estimate the mean number of mutations per cell in each individual by integrating allele frequencies (methods S5) (7). These are conservative lower-bound estimates, as they are limited to mutations present in detectable clones. On average, healthy cells in the esophageal epithelium carry at least several hundred mutations per cell in people in their 20s, rising to over 2000 mutations per cell late in life (Fig. 1C). Similar estimates were obtained from the whole-genome sequencing data (methods S5.1). These estimates of the mutation rate in normal esophagus are broadly comparable to the mutation rates reported for human stem cells of the colon, small intestine, and liver by sequencing of clonal organoids (9).

Fig. 1. Detection of somatic mutations in normal esophagus. (A) Number of mutations detected per sample across the 844 samples from the nine transplant donors (sorted by age). Donor age is shown in 4-year bins to increase sample anonymity. (B) Variant allele fractions (VAFs) for the mutations detected in the youngest and oldest donors, colored by mutation type. The VAF is the fraction of sequencing reads reporting a mutation within a sample. (C) Scatter plot of donor age and the estimated mean mutation burden per cell for each donor. The fitted line, R² value, and P value were obtained by linear regression.
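As a rough illustration of the relationship described above between allele frequency, mutant cell fraction, and clone size, the following sketch converts a VAF into a cell fraction and a clone area. It assumes diploid cells in which the mutation sits on either one copy (heterozygous) or both copies (for example, after copy-neutral LOH). The function names and the simplified formula are illustrative assumptions and do not reproduce the procedure in methods S5.

```python
def mutant_cell_fraction(vaf, mutant_copies=1):
    """Estimate the fraction of cells carrying a mutation from its VAF.

    Assumption (not taken from the paper's methods): every cell is diploid
    at the locus, and mutant cells carry the variant on `mutant_copies`
    of their two copies (1 = heterozygous, 2 = both copies).
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must lie between 0 and 1")
    if mutant_copies not in (1, 2):
        raise ValueError("mutant_copies must be 1 or 2 for a diploid locus")
    # With f = mutant cell fraction, VAF = f * mutant_copies / 2.
    return min(1.0, 2.0 * vaf / mutant_copies)


def clone_area_mm2(vaf, sample_area_mm2=2.0, mutant_copies=1):
    """Approximate clone area as mutant cell fraction times sample area."""
    return mutant_cell_fraction(vaf, mutant_copies) * sample_area_mm2


if __name__ == "__main__":
    # Median VAF reported in the text (1.6%) in a 2-mm² sample.
    print(mutant_cell_fraction(0.016))  # heterozygous assumption
    print(clone_area_mm2(0.016))
```

Under these assumptions, a heterozygous mutation at the median allele frequency of 1.6% in a 2-mm² sample corresponds to about 3.2% of cells, or roughly 0.064 mm² of epithelium. If both copies were mutated, the same VAF would imply half that cell fraction, which is why the local copy number matters for the estimate.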

Widespread positive selection driving clonal growth

In middle-aged individuals, the number of mutations per cell in normal esophagus is lower than that in Sun-exposed skin (7), a difference due partially to the high degree of UV damage sustained by the skin. Given this, we anticipated that the frequency of cancer-driver mutations in esophagus would be much lower than that in skin. Unexpectedly, however, analysis of the frequency and size of mutant clones revealed a higher density of cancer-associated mutations in normal esophagus than in Sun-exposed skin, suggesting stronger positive selection of clones with mutations in cancer-associated genes.

To formally quantify the extent of selection driving clonal expansions in normal esophagus, we estimated the ratio of nonsynonymous to synonymous mutation rates (dN/dS) across genes, which is a widely used measure of selection. We used the dNdScv model, an implementation of dN/dS for somatic data that controls for trinucleotide mutational signatures, sequence composition, and variable mutation rates across genes (4) (methods S6). This method has been shown to reliably identify genes under positive selection in cancer and normal tissues (4, 7). In the context of this experiment, dN/dS ratios reveal how much more (or less) likely it is for a nonsynonymous mutation to reach a detectable clone size than a synonymous mutation (methods S6.2).

This analysis revealed strong evidence of selection driving clonal expansions in normal esophagus. At the gene level, we detected significant positive selection in 14 of the genes that we sequenced (Fig. 2, A to D, and table S3). This means that mutation of these genes confers a competitive advantage on mutant cells relative to neighboring cells. Sorted by mutation frequency, the list comprises NOTCH1, TP53, NOTCH2, FAT1, NOTCH3, ARID1A, KMT2D, CUL3, AJUBA, PIK3CA, ARID2, TP63, NFE2L2, and CCND1. Notably, the five most frequently mutated genes in normal esophagus also dominated the mutational landscape in Sun-exposed skin. Many of the positively selected genes play a role in keratinocyte differentiation through NOTCH signaling [NOTCH1, NOTCH3, and TP53 (10–13)], through redox cellular stress [NFE2L2, TP63, and CUL3 (14–17)], or through epigenetic regulation [KMT2D (18)]. Tilting cell fate balance away from differentiation toward proliferation may confer a competitive advantage on mutant cells in normal esophageal epithelium (19).

Fig. 2. Widespread positive selection of cancer-associated mutations in normal esophagus. (A) Number of mutations detected in each of the 14 genes found under positive selection. (B) Observed-to-expected ratios for missense substitutions, truncating (nonsense and essential splice site) substitutions, and indels. Observed-to-expected ratios for substitutions are dN/dS ratios. Only ratios with P < 0.05 are shown. (C) Estimated percentage of cells carrying a mutation in each gene (methods S5.3). (D) Percentage of ESCCs with a nonsynonymous substitution or an indel in each gene. Error bars depict 95% Poisson CIs. (E) Distribution of mutations within TP53 and NOTCH1 in normal esophagus (above the gene domain diagram) and in SCC cancers from The Cancer Genome Atlas (below). The region of EGF8 to EGF12 is boxed. aa, amino acids. (F) Consequences of NOTCH1 missense mutations. (Top) Most NOTCH1 missense mutations affect structural residues in EGF domains (shown in stick form) [Protein Data Bank (PDB) code 2VJ3]: calcium-binding consensus residues (red); hydrophobic interdomain packing residues (teal); cysteine residues, which form disulfide bonds (yellow); and conserved glycines (black). Calcium ions are shown as red spheres. (Bottom) Other residues affected by missense mutations (≥4 per residue) in the EGF8-to-EGF12 region are shown in space-filling representation. Many are predicted to disrupt the Notch receptor-ligand binding interface (shown in deep blue and labeled by residue number), whereas others are distal (colored wheat) (PDB code 5UK5). Single-letter abbreviations for the amino acid residues are as follows: E, Glu; I, Ile; N, Asn; P, Pro; R, Arg; and V, Val. (G) Estimated percentage of mutant epithelium per donor compared with ESCC mutation frequency. (H) dN/dS values estimated from all 74 target genes together in normal esophagus and Sun-exposed skin (7). Error bars depict 95% CIs.

At least 11 of the 14 genes found under positive selection in normal esophagus are canonical drivers of esophageal squamous cell carcinomas (ESCCs) (methods S6.5) (20–22). Their presence in normal epithelium suggests that they act as early ESCC drivers, leading to the expansion of persisting clones that may undergo further mutation and malignant transformation. The landscape of selection in normal squamous epithelium of the esophagus more closely resembles that of ESCCs than that of esophageal adenocarcinomas (EACs) (21), consistent with the typical development of ESCCs from the squamous epithelium of the upper and mid-esophagus. By contrast, EACs evolve from epithelium close to the stomach junction and are associated with Barrett’s metaplasia.
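The idea behind the dN/dS ratio used in the selection analysis above can be illustrated with a deliberately simplified calculation. The sketch below is not the dNdScv model, which additionally accounts for trinucleotide signatures, sequence composition, and variable mutation rates across genes. It only shows the core normalization: observed nonsynonymous and synonymous counts are compared with the counts expected under neutrality. The function name, its parameters, and the numbers in the example are hypothetical.

```python
def naive_dnds(obs_nonsyn, obs_syn, exp_nonsyn, exp_syn):
    """Naive dN/dS: observed nonsynonymous/synonymous ratio divided by
    the ratio expected without selection.

    exp_nonsyn and exp_syn stand for the expected counts (or relative
    rates) of each class under a neutral mutation model; how they are
    obtained is outside the scope of this sketch.
    """
    if obs_syn <= 0 or exp_nonsyn <= 0 or exp_syn <= 0:
        raise ValueError("synonymous and expected counts must be positive")
    observed_ratio = obs_nonsyn / obs_syn
    expected_ratio = exp_nonsyn / exp_syn
    return observed_ratio / expected_ratio


if __name__ == "__main__":
    # Hypothetical gene: 300 nonsynonymous and 50 synonymous mutations
    # observed, with a 3:1 ratio expected under neutrality.
    print(naive_dnds(300, 50, 3.0, 1.0))
```

A value near 1 is what neutral accumulation would produce, values above 1 indicate an excess of nonsynonymous mutations in detectable clones (positive selection), and values below 1 indicate a deficit.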

Colonization of the epithelium by NOTCH1 mutant clones

One unexpected observation was the very high prevalence of NOTCH1 mutations in normal esophagus (Fig. 2, A to C). Across the nine donors, we detected 2055 coding mutations in NOTCH1, of which more than 98% were nonsynonymous, with an average of ~120 different NOTCH1 mutations per square centimeter of normal esophagus (Fig. 2A). NOTCH1 acts as an oncogene in different leukemias but has a mutation pattern consistent with a tumor suppressor gene in squamous cell carcinomas (SCCs) of the skin, head and neck, esophagus, and lung (23). As in SCCs, mutations in NOTCH1 in normal esophagus were enriched for truncating mutations (dN/dS > 50), including stop-gains, essential splice site mutations, and indels (Fig. 2B). Missense mutations were also frequent in NOTCH1, and they were concentrated in 5 of the 36 extracellular epidermal growth factor (EGF) repeat domains, EGF8 to EGF12 (Fig. 2E). These EGF repeats contain the binding domains for the Notch1 ligands Jagged and Delta. The most recurrent codon alterations occurred at sites predicted to affect structural residues (calcium-binding motifs, cysteine residues, and interdomain packing residues) or the contact surface with Notch1 ligands (Fig. 2F and supplementary materials) (23, 24). The large number of positively selected NOTCH1 mutations provides structural and functional insights into this key regulatory protein.

By integrating the allele fractions of the mutations and allowing for the possibility that mutations may affect one or two alleles per cell, we can estimate the fraction of mutant cells in a tissue for any given gene (methods S5.3). On average across the nine donors, 25 to 42% of the cells in normal esophagus harbored NOTCH1 mutations (Fig. 2C). The frequency of NOTCH1 mutant clones showed a large increase with age. About 30 to 80% of the cells in normal esophagus were NOTCH1 mutated in five of the six middle-aged or elderly individuals, compared with 1 to 6% in the three individuals under 40 years of age (Fig. 2G). This observation is consistent with data from experimental mouse models showing that transgenic inhibition of Notch signaling in a small fraction of cells confers clonal advantage and enables these clones to colonize the normal esophageal epithelium (19, 25).

This observation has potentially important implications. The NOTCH1 gene has been widely assumed to be a driver in ESCCs because it is mutated in ~10% of tumors (21, 26) (Fig. 2, D and G). The observation that, in middle-aged individuals, NOTCH1 is typically mutated in 30 to 80% of the normal esophageal epithelium suggests that NOTCH1 mutations are less frequent in cancers than in the background of normal tissue from which the cancers develop. This raises questions about the role of NOTCH1 in the development of ESCCs.

The case of NOTCH1 contrasts with that of TP53 (Fig. 2G), which is mutated in more than 90% of ESCCs but in a minority of cells in the normal esophageal epithelium. TP53 is the second most frequently mutated gene in normal esophagus, with ~35 mutations per square centimeter and strong positive selection for both truncating and missense mutations (dN/dS ratios ~150 and ~50, respectively) (Fig. 2, A and B). As in cancer genomes, the missense mutations affect mostly the central DNA binding domain (Fig. 2E). Across the nine donors, 5 to 10% of the epithelium carried a TP53 mutation, a fraction that appeared to increase with age, with the oldest donor having TP53 mutations in 20 to 37% of cells (Fig. 2G).

In summary, we found an unexpectedly high density of driver mutations in normal esophagus and positive selection acting on most of the main drivers of ESCC. By combining the 74 genes studied, global dN/dS ratios for missense and protein-truncating (nonsense and essential splice site) mutations were ~2.2 and ~8.6, respectively, with the enrichment of nonsynonymous mutations increasing rapidly with clone size (Fig. 2H and fig. S5B). This suggests that approximately 55% of all missense mutations and 88% of all truncating mutations identified in this dataset were actively driven to detectable clone sizes by positive clonal selection. Overall, by using dN/dS ratios and considering substitutions and indels in the 14 genes under significant selection, we estimate that there are 3915 [95% confidence interval (CI), 3829 to 3988] positively selected driver mutations in the ~17 cm² of normal esophageal epithelium sequenced in this study, of which 52% are in genes other than NOTCH1 (methods S6.3). This number is comparable to the yield of driver mutations obtained from sequencing more than 1000 cancer genomes (4).
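The step from the global dN/dS ratios to the percentages of selected mutations quoted above follows from a short calculation. If neutral accumulation would give a ratio of 1, then a ratio ω > 1 implies that a fraction (ω − 1)/ω of the observed nonsynonymous mutations is in excess of the neutral expectation. The sketch below applies this relation to the two ratios given in the text; the function name is an illustrative choice.

```python
def selected_fraction(omega):
    """Fraction of observed nonsynonymous mutations in excess of the
    neutral expectation, given a dN/dS ratio `omega`.

    For omega <= 1 there is no excess, so the fraction is reported as 0.
    """
    if omega <= 0:
        raise ValueError("omega must be positive")
    if omega <= 1.0:
        return 0.0
    return (omega - 1.0) / omega


if __name__ == "__main__":
    for label, omega in (("missense", 2.2), ("truncating", 8.6)):
        pct = 100.0 * selected_fraction(omega)
        print(f"{label}: dN/dS = {omega} -> about {pct:.0f}% selected")
```

With ω ≈ 2.2 the excess is about 55%, and with ω ≈ 8.6 it is about 88%, matching the figures reported for missense and truncating mutations.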

Variation of the mutational and selective landscape across donors

The patterns of somatic evolution varied greatly across the nine individuals in this study, with large differences in mutation density, clone sizes, and overall driver frequency (Fig. 3). Age is by far the strongest risk factor in ESCC, with cancer incidence rising near-geometrically with age (27, 28). We used mixed-effect regression models to evaluate the association between the mutation landscape and age while controlling for other risk factors, such as gender and smoking status (methods S7). Despite the modest cohort size, this analysis revealed a significant increase in the number of mutations per sample (P = 0.009) and clone sizes (P = 0.027) with age. This is consistent with the significant increase in the mutation burden with age depicted in Fig. 1C and determined by standard linear regression (P = 0.0068; coefficient of determination R² = 0.67). We also noted that the two heavy smokers in the cohort could have a higher number of mutations than expected for their age (Figs. 1A and 3 and methods S7). However, larger cohorts will be needed to reliably study the effects of behavioral risk factors on the mutational landscape in the esophagus.

Fig. 3. Variation of the mutational landscape across the nine donors. Representative patchwork plots from each donor. Each panel is a schematic representation of the mutant clones in an average 1-cm² area of normal esophageal epithelium from each donor. To generate each figure, a number of samples from the donor were randomly selected to amount to 1 cm² of tissue, and all clones detected are represented as circles randomly distributed in space. The density and size of the clones are inferred from the sequencing data, and the nesting of clones and subclones is inferred from the data when possible and randomly allocated otherwise (methods S5.4).

Despite the dominant effect of age, we found unexplained differences across individuals, including differences in the strength of selection on different genes across individuals, as suggested by the differently colored clones in Fig. 3. To formally quantify differences in selection pressure per gene across donors while removing the effect of variable mutation rates and signatures across individuals, we used an extension of dNdScv that compares two dN/dS ratios (methods S6.4). This confirmed significant differences in the driver landscape across donors (fig. S5, C to E). For example, across individuals, NOTCH1 is mutated five times as frequently as NOTCH3. Yet, in one donor, we detected nearly the same number of mutations in NOTCH1 and NOTCH3 (fig. S5D) (q = 5 × 10⁻¹³, likelihood-ratio test). Similarly, the oldest donor showed a twofold relative enrichment in TP53 mutations compared with other individuals (q < 1 × 10⁻¹⁵, likelihood-ratio test) (fig. S5E), consistent with the observation that 20 to 37% of normal esophageal epithelium was TP53 mutated in this donor (Fig. 2G). Whether the variation in the driver landscape across donors reflects differences in exposure to environmental factors, the genetic background of each individual, or both is unclear. Nevertheless, differences in mutation rates, clone sizes, and driver preferences may have implications for understanding interindividual variation in cancer risk.

Given the large increase in driver mutant clones with age, many clones are expected to acquire more than one driver mutation over the course of a lifetime. Although the small clone sizes limit our ability to determine which mutations within a sample occur in the same cells, 25 samples had sufficiently large clones for us to confidently group mutations (methods S5.5) (Fig. 4A and fig. S6). Most cases (14 of 25) were examples of NOTCH1 biallelic inactivation by two mutations. We also observed examples of clones carrying mutations in NOTCH1 and FAT1, NOTCH1 and NOTCH3, and PIK3CA and NOTCH3. In the oldest donor (in the age range from 72 to 75 years), whose samples showed an enrichment of TP53 mutations, we found a large clone, measuring >4 mm², with a founder heterozygous TP53 mutation and three separate subclones each carrying a different second TP53 mutation (Fig. 4A). For a large clone extending over six samples and measuring >8.5 mm², we were able to integrate whole-genome data and spatial information to reconstruct the clone’s phylogenetic history (methods S5.6). The tree shows that the ancestor cell underwent a large clonal expansion after losing both copies of NOTCH1, followed by branching evolution with two subclones dominating spatially distinct areas (Fig. 4B).

Fig. 4. Phylogenetic and mutational patterns in normal esophagus. (A) Representation of mutations co-occurring in the same clones by using the pigeonhole principle (see supplementary materials 5.5). (B) Phylogenetic reconstruction of the evolution of a large clone overlapping six samples by using whole-genome sequencing data and spatial information. A small heatmap of the six affected samples is shown next to each node in the tree, depicting the mean VAF for the mutations in each node. Single-letter abbreviations for the amino acid residues are as follows: C, Cys; D, Asp; G, Gly; M, Met; Q, Gln; S, Ser; T, Thr; W, Trp; and Y, Tyr. (C) Number of substitutions per mutation type as mapped to the coding (untranscribed) strand from all donors. P values reflect transcription strand asymmetry (exact Poisson test). (D) Ninety-six–mutation–class bar plot depicting the number of mutations in each of the possible 96 trinucleotides (strand independent). (Top) Whole-genome plot aggregating all 21 whole genomes. (Bottom) Spectrum for mutations occurring in the transcribed region of the top 20% most highly expressed genes. (E) Mutation burden in normal esophagus and in ESCC and EAC tumors (every point corresponds to a donor, sorted by mutation burden). AML, acute myeloid leukemia. (F) Number of copy number events detected in each gene across the 844 samples by using the targeted data. (G) Representative log R ratio and B-allele frequency (BAF) scatter plots for heterozygous SNPs from whole-genome data showing a copy-neutral LOH event affecting NOTCH1 (sample PD30273bg is shown). Chr 1, chromosome 1.
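The pigeonhole argument mentioned in the Fig. 4A legend can be sketched as follows. If two mutations in the same sample are each present in a fraction of cells, and those fractions add up to more than 100%, then at least some cells must carry both. The code below assumes diploid cells and heterozygous mutations, so that the cell fraction is twice the VAF. This is an illustrative simplification with hypothetical function names and example values, not the procedure of methods S5.5.

```python
def cell_fraction_het(vaf):
    """Cell fraction of a heterozygous mutation in diploid cells (assumed)."""
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must lie between 0 and 1")
    return min(1.0, 2.0 * vaf)


def must_share_cells(vaf_a, vaf_b):
    """Pigeonhole test: True if the two mutations cannot occupy disjoint
    sets of cells, because their cell fractions sum to more than 1."""
    return cell_fraction_het(vaf_a) + cell_fraction_het(vaf_b) > 1.0


if __name__ == "__main__":
    # Hypothetical VAFs within one sample.
    print(must_share_cells(0.30, 0.25))  # 60% + 50% of cells
    print(must_share_cells(0.20, 0.20))  # 40% + 40% of cells
```

In the first hypothetical case the two mutations must overlap in at least 10% of cells, so they can be assigned to the same clone. In the second case the test is inconclusive: the mutations may or may not co-occur, which is why only samples with sufficiently large clones allow confident grouping.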

The whole-genome mutational landscape in normal esophagus

To better understand the contribution of different mutational processes and the extent of structural variation in normal esophagus, we performed whole-genome sequencing of 21 samples dominated by a major clone. Across all donors, C→T or G→A (C>T/G>A) mutations dominate the spectra, with a clear excess of mutations at CpG dinucleotides (Fig. 4, C and D, and fig. S7). These changes result from the deamination of 5-methylcytosine into thymine and are believed to occur spontaneously throughout life (29, 30). Signature analysis revealed that the pattern of mutations largely resembles a combination of COSMIC (Catalogue of Somatic Mutations in Cancer) mutational signatures 1 and 5 (30) (methods S4 and figs. S7 and S8). Both signatures have been shown to dominate the accumulation of mutations in normal tissues such as colon, small intestine, and liver during life (9).

In addition, we observed two other mutational processes. There was a considerable rate of C>A/G>T changes with a modest but significant transcription-strand bias. Multiple mechanisms can lead to these types of changes, including smoking. Although four of the nine individuals were smokers, we did not observe a clear signature of tobacco-induced mutations (COSMIC signature 4) (methods S4). We also observed considerable variation in T>C changes across the 21 whole genomes, with a strong transcription-strand asymmetry (Fig. 4D and figs. S7 and S8). Stratification of the mutation spectra by gene expression level revealed that highly transcribed genes are targets of a process of transcription-coupled mutagenesis that induces T>C changes preferentially at ApT sites in the transcribed strand, a phenomenon previously described in liver cancers (31) (related to COSMIC signature 16) (Fig. 4D and fig. S8).
Overall, most mutations seem to be generated by intrinsic mutational processes associated with age or transcription (30, 31), without clear evidence of external mutagenic processes. We found no evidence of COSMIC signatures 2 and 13 in the targeted or the whole-genome data. These signatures are believed to be caused by APOBEC (apolipoprotein B mRNA editing enzyme, catalytic polypeptide–like) cytidine deaminases and contribute large numbers of mutations in esophageal cancers (20, 21, 32). This partially explains the observation that the mutation burden in normal esophagus is about an order of magnitude lower than the median mutation burden of ESCC and EAC cancers (Fig. 4E). The rarity of APOBEC mutagenesis in normal esophagus may suggest that this is acquired later in the evolution of ESCC or that ESCCs are more likely to evolve from rare clones displaying APOBEC mutagenesis. Esophageal cancers are characterized by large numbers of copy number changes and structural rearrangements (21, 33). To explore the extent of copy number changes in normal esophagus, we first analyzed the deep targeted sequencing data. We used a copy number detection algorithm designed to identify low-frequency subclonal loss of heterozygosity (LOH) on targeted data, exploiting the statistical phasing of heterozygous single-nucleotide polymorphisms (SNPs) to detect small allelic imbalances (7) (methods S3.4). NOTCH1 loss was the most frequent copy number change identified, although caution must be exercised because statistical power varies across genes and donors. NOTCH1 LOH was detected in nearly 30% of all samples (Fig. 4F) and in virtually all of the samples with a single high-frequency NOTCH1 mutation, confirming that the loss of NOTCH1 is typically biallelic. PTCH1 sits on the same arm of chromosome 9 as NOTCH1 and is often lost together with NOTCH1. 
We also detected less frequent but recurrent whole–chromosome 3 gains, which lead to the duplication of PIK3CA/SOX2/TP63, an event observed in approximately half of ESCCs (21) (Fig. 4F). Several instances of TP53 LOH were also detected, largely concentrated in the oldest donor. Copy number analysis of the 21 whole genomes confirmed that segmental loss of NOTCH1 is typically mediated by copy-neutral LOH without detectable rearrangements (Fig. 4G, fig. S9, methods S3.5.2, and table S4). Such events may be generated by mitotic homologous recombination. Events varied in size, from whole-arm losses to focal events (Fig. 4G and figs. S9 and S10). With the exception of copy-neutral LOH events in NOTCH1 and an instance of chromosome 3 gain, the 21 genomes appeared largely diploid, without evidence of other copy number changes that may be expected to accumulate by chance over time (Fig. 4G, fig. S9, methods S3.5.2, and table S4). The rarity of copy number changes in large clones, none of which had TP53 mutations, suggests that the background rate of copy number changes is low in normal cells of the esophagus or that such changes are negatively selected. Either way, this represents a major difference between normal esophageal cells and ESCCs, suggesting that structural changes may occur late in the evolution of esophageal cancers (33).
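The strand-independent 96-class representation used for the mutation spectra in this section (Fig. 4D) can be sketched in a few lines. Each substitution is described by its trinucleotide context and, when the reference base is a purine, is reported on the opposite strand so that the mutated base is always C or T. This gives 6 substitution types × 16 flanking contexts = 96 classes. The function names and the label format below are illustrative choices and not part of the paper's pipeline.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mutation_class(context, alt):
    """Map a substitution to one of the 96 strand-independent classes.

    context: reference trinucleotide (5' base, mutated base, 3' base).
    alt: the base replacing the middle base of the context.
    Returns a label such as 'A[C>T]G'.
    """
    context, alt = context.upper(), alt.upper()
    if len(context) != 3 or len(alt) != 1:
        raise ValueError("expected a trinucleotide context and one alt base")
    if not set(context + alt) <= set("ACGT"):
        raise ValueError("only A, C, G and T are supported")
    if alt == context[1]:
        raise ValueError("alt base equals the reference base")
    if context[1] in "AG":
        # Purine reference: describe the event on the opposite strand.
        context = reverse_complement(context)
        alt = reverse_complement(alt)
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


if __name__ == "__main__":
    # A C>T change at a CpG site and the same event seen from the other strand.
    print(mutation_class("ACG", "T"))
    print(mutation_class("CGT", "A"))
```

Both calls in the example yield the same class, A[C>T]G, which is the kind of CpG transition that dominates the spectra described above. Transcription-strand asymmetry, by contrast, requires keeping track of which strand is transcribed, so it is assessed before this collapsing step rather than after it.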

Discussion

These data have unveiled a hidden world of somatic mutation and clonal competition in normal esophagus. We have detected thousands of mutations per cell, hundreds of positively selected clones per square centimeter, and clones with cancer-associated mutations colonizing most of the esophageal epithelium with age, all without grossly detectable changes in histology. The higher frequency of cancer-associated mutations in normal esophagus than in Sun-exposed skin is unexpected, particularly given the lower mutation rate in the esophagus.

Although we found most of the common drivers of ESCC already under selection in normal esophageal epithelium, key differences remain between the genomes of cells in mutant clones in aging normal epithelium and those of cancer cells. These include a mutation burden in normal epithelium about an order of magnitude lower than that in many ESCCs, no evidence of APOBEC mutagenesis, and an apparent lack of chromosomal instability. Further, although clones carrying cancer-driver mutations are widespread, the average number of driver mutations per cell in normal esophagus is much lower than that in cancer cells (Fig. 2C), a result consistent with the multistage theory of carcinogenesis (27, 28, 34). Larger-scale genomic studies of normal tissues in healthy individuals and of premalignant lesions of different grades will help refine our understanding of the transition from normal cells to cancer (3, 28, 33, 35).

An unexpected observation is the high frequency of NOTCH1 mutation in aged normal esophagus compared with ESCCs. This may suggest that ESCCs are more likely to evolve from cells in the epithelium without NOTCH1 mutations. By contrast, TP53 mutations, which are less frequent than NOTCH1 mutations, are almost ubiquitous in ESCCs, suggesting that cancers arise from the small fraction of TP53 mutant cells. Cancer risk may therefore vary across the aging epithelium, depending on the colonizing mutations present.
Interventions that decrease the proportion of mutant cells at a higher risk of transformation in normal epithelium may thus be beneficial. We note that, even if they do not contribute to carcinogenesis, drivers of benign clonal expansions may still appear as recurrently mutated genes in cancer genomes, owing to their high mutation frequency in the normal cells from which tumors evolve. Better understanding of the mutational landscape in normal tissues may thus help refine current catalogs of cancer-driver genes, with important implications for early diagnosis and targeted therapy. Positive selection of mutant clones has now been observed during normal aging in blood, Sun-exposed skin, and esophageal epithelium (7, 36). This opens the theoretical possibility of clonal selection across tissues as a contributing factor in tissue and organismal aging (4, 37, 38). Somatic mutation has long been recognized as a possible factor contributing to aging, with mutations and other forms of damage deleterious to the carrying cells passively accumulating during life and progressively reducing cellular fitness (39). Widespread positive selection of mutant clones may be an additional contributory factor in aging, as it can greatly accelerate the accumulation of functional mutations and altered phenotypes. Throughout life, somatic mutations increasing cellular fitness can spread and even dominate tissues, independently of their cost to the organism. If the selected mutations negatively affect tissue function, the physiological integrity of the organism will decline, a hallmark of the aging process. This study emphasizes how little we know about somatic evolution within normal tissues, a fundamental process that is likely to take place to varying degrees in every tissue of every species. Better understanding of the extent of somatic mutation and selection across tissues in health and disease promises to provide insights into the origins of cancer and aging.

Supplementary Materials

www.sciencemag.org/content/362/6417/911/suppl/DC1
Materials and Methods
Figs. S1 to S10
Tables S1 to S4
References (40–49)

http://www.sciencemag.org/about/science-licenses-journal-article-reuse This is an article distributed under the terms of the Science Journals Default License.

Acknowledgments: We are very grateful to the families of deceased donors for their consent and to the Cambridge Biorepository for Translational Medicine for access to human tissue. Funding: I.M. is funded by Cancer Research UK (C57387/A21777). P.J.C. is a Wellcome Trust Senior Clinical Fellow. This work was funded by a Cancer Research UK program grant to P.H.J. (C609/A17257), an MRC Centenary grant, Wellcome Trust core funding to the Wellcome Sanger Institute, and an MRC grant-in-aid to the MRC Cancer Unit. Author contributions: P.H.J. initiated the project. P.H.J. and J.C.F. designed the experiments. I.M. led data analysis with help from A.R.J.L., F.A., A.C., and M.W.J.H. and advice from M.R.S. and P.J.C. P.A.H. analyzed the structural implications of NOTCH1 mutations. K.S.-P. and K.Ma. collected the samples. J.C.F., A.W., and K.Mu. performed experiments. R.C.F. contributed to a pilot study. I.M., J.C.F., and P.H.J. wrote the paper. Competing interests: M.R.S. is on the scientific advisory board of GRAIL. The other authors declare no competing interests. Data and materials availability: Sequencing data are deposited in the European Genome-phenome Archive (EGA) under accession numbers EGAD00001004158 and EGAD00001004159.