Mutational Landscape in the Patients

We analyzed the results of exome sequencing of DNA from granulocytes and constitutional DNA obtained from purified T cells or buccal cells in 168 patients with myeloproliferative neoplasms. The identification of appropriate constitutional DNA samples is a challenge among patients with myeloproliferative neoplasms, since circulating T cells and buccal cells may be contaminated by neoplastic cells (Fig. S1 in the Supplementary Appendix). For patients with the JAK2 V617F mutation, we measured the JAK2 V617F allele burden, using a quantitative polymerase-chain-reaction (PCR) assay to identify contaminated constitutional DNA samples, with in vitro T-cell cultures being necessary in some instances. In cases in which constitutional DNA was limited, the use of a combination of unamplified and whole-genome amplified DNA improved exome coverage (Fig. S2 in the Supplementary Appendix). We developed a bioinformatics workflow to enable the identification of high-confidence somatic variants (Fig. S3 in the Supplementary Appendix), and complemented this with orthogonal verification of mutations by targeted resequencing and manual curation.

Of the 168 samples of myeloproliferative neoplasms that were sequenced, 151 samples (48 samples of polycythemia vera, 62 of essential thrombocythemia, 39 of myelofibrosis, and 2 of unclassifiable myeloproliferative neoplasms) fulfilled the bioinformatic criteria and were taken forward for analysis (Fig. S3 and Table S1 in the Supplementary Appendix). The average exome sequencing coverage was 141×.

Figure 1. Mutational Profile of 151 Myeloproliferative Neoplasms. Panel A shows the number and type of mutations identified on exome sequencing in each sample obtained from 151 patients with myeloproliferative neoplasms. These patients included 48 with polycythemia vera (PV), 62 with essential thrombocythemia (ET), 39 with myelofibrosis (MF), and 2 with unclassifiable myeloproliferative neoplasms (MPN-U). The type of mutation is indicated in the key in each panel; the circles below the graphs indicate the patients' mutational status: JAK2, MPL, or no JAK2 or MPL mutation. Also shown are the numbers of somatic mutations in recurrently mutated genes in this study as well as in genes previously reported to be mutated in myeloproliferative neoplasms, according to the type of mutation (Panel B) and subtype of myeloproliferative neoplasm (Panel C). In Panel B, the asterisks denote the significance of recurrently mutated genes (q<0.05). Indel denotes insertion or deletion mutation.

We identified and validated 1498 somatic mutations in 150 samples, with a range of 1 to 32 mutations per sample (Figure 1A, and Table S2 in the Supplementary Appendix); no mutations were identified in the sample obtained from Patient PD8641, who had essential thrombocythemia. A total of 99 of the mutations we identified were insertions or deletions (indels) and 1399 were substitutions; 1405 mutations were found in coding sequences (1130 nonsynonymous changes), 22 in essential splice sites, 46 in introns within 10 base pairs of splice junctions, and 25 in untranslated regions. The median number of mutations per patient was 6.5 in patients with polycythemia vera, 6.5 in those with essential thrombocythemia, and 13.0 in those with myelofibrosis (P=0.008 by one-way analysis of variance). The significantly higher number of mutations in patients with myelofibrosis was consistent with the concept that this disorder is a more advanced stage of disease than either essential thrombocythemia or polycythemia vera (P<0.001 for the comparison of myelofibrosis with essential thrombocythemia and P=0.008 for the comparison of myelofibrosis with polycythemia vera; pairwise t-test with Bonferroni correction for both comparisons). The mutational spectrum showed a predominance of C→T transitions (Fig. S4 in the Supplementary Appendix) and was similar to that observed in myelodysplasia and several epithelial cancers.23,27
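The tallies reported above can be cross-checked with simple arithmetic. The Python sketch below uses only the counts stated in this paragraph; the variable and function names are illustrative and are not part of the study's bioinformatics workflow.

```python
# Counts as reported in the text; all names here are illustrative only.
total_mutations = 1498

by_type = {"indel": 99, "substitution": 1399}
by_location = {
    "coding_sequence": 1405,
    "essential_splice_site": 22,
    "intron_within_10bp_of_splice_junction": 46,
    "untranslated_region": 25,
}


def tally_matches(parts, total):
    """Return True when the category counts add up to the reported total."""
    return sum(parts.values()) == total


print(tally_matches(by_type, total_mutations))
print(tally_matches(by_location, total_mutations))
```

Both partitions, by mutation type and by genomic location, account for all 1498 validated mutations.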

We identified mutations in several genes that have previously been implicated in myeloproliferative neoplasms or other myeloid cancers (Figure 1B and 1C). JAK2 V617F was the most prevalent mutation and was found in all 48 patients with polycythemia vera, 35 of 62 patients with essential thrombocythemia (56%), and 27 of 39 patients with myelofibrosis (69%). Mutations in epigenetic regulators were as follows: TET2, 25 somatic variants in 22 patients; DNMT3A, 13 somatic variants in 12 patients; ASXL1, 13 somatic variants in 12 patients; EZH2 somatic variants in 4 patients; and IDH1/2 somatic variants in 3 patients (Figure 1B). Mutations in genes encoding components of the splicing machinery were found in 9 patients (U2AF1 in 4 patients, SF3B1 in 3 patients, and SRSF2 in 2 patients), and mutations in the gene encoding the thrombopoietin receptor (MPL) were identified in 7 patients, all of whom had essential thrombocythemia or myelofibrosis with unmutated JAK2. Mutations in genes that are reported to be mutated in myeloproliferative neoplasms at low frequencies were found in 4 patients (CBL, in 1 patient; NFE2, in 2 patients; and SH2B3/LNK, in 1 patient) (Figure 1B). Missense somatic mutations in CHEK2, which have not been reported previously in myeloproliferative neoplasms, were found in 1 patient each with polycythemia vera, essential thrombocythemia, and myelofibrosis (q=0.008) (Table S3 in the Supplementary Appendix). We also assessed pairwise associations between genes mutated in myeloproliferative neoplasms (Fig. S5 in the Supplementary Appendix). Mutations in ASXL1 were found to be comutated with genes involved in RNA splicing (U2AF1 and SRSF2), and mutations in SRSF2 were comutated with genes that encode epigenetic modifiers (TET2, IDH1, and ASXL1) — a profile strikingly similar to the associations seen in myelodysplasia.28

Recurrent Somatic CALR Mutations

Figure 2. CALR Mutations in Myeloproliferative Neoplasms. Panel A shows the number of CALR mutations found in myeloproliferative neoplasms (polycythemia vera, PV; essential thrombocythemia, ET; or myelofibrosis, MF) that have mutated JAK2, mutated MPL, or nonmutated JAK2 or MPL. Panel B shows the results of validation on polymerase-chain-reaction (PCR) assay and Sanger sequencing of the two most common CALR variants, a deletion (L367fs*46) and an insertion (K385fs*47), in patients with myeloproliferative neoplasms. The patients are indicated by their patient-identification numbers above the results of gel electrophoresis of PCR products of patients' granulocytes (Gran) and T cells. The sequencing traces show the heterozygous mutation of CALR. The shaded region on the left highlights a homologous DNA sequence flanking the common CALR deletion, and the green arrow on the right highlights the inverse tandem duplication of five bases in the common CALR insertion. Panel C shows the genomic location of CALR deletions. The numbers indicate the number of patients with each disease. Panel D shows the genomic location of CALR insertions (shown in red). aCML denotes atypical chronic myeloid leukemia, CMML chronic myelomonocytic leukemia, and MDS myelodysplastic syndromes.

A striking pattern of somatic mutations was observed in CALR, a gene that had not previously been recognized as an oncogene mutated in any form of cancer. CALR mutations were identified in 26 patients (Figure 1B and 1C). All the mutations were indels in exon 9 and had a remarkable association with disease, since they were found in 26 of 31 patients with essential thrombocythemia or myelofibrosis and nonmutated JAK2 or MPL (84%; 95% confidence interval [CI], 66 to 94) but in none of the 120 patients with JAK2 or MPL mutations (Figure 2A, and Fig. S5 in the Supplementary Appendix). There were two common variants: L367fs*46, which resulted from a 52-bp deletion flanked by 7 base pairs of identical sequence; and K385fs*47, which resulted from a 5-bp insertion and represented an inverse duplication of the five nucleotides preceding the insertion (Figure 2B).
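The text does not state which method was used for the confidence interval around 26 of 31 patients. As one plausible reading, the sketch below computes an exact (Clopper-Pearson) binomial interval using only the Python standard library. The choice of method and the helper names (`binom_cdf`, `_bisect`, `clopper_pearson`) are assumptions made for illustration, so the result may differ slightly from the published bounds.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, target, increasing, iterations=100):
    """Solve f(p) = target for p in [0, 1], where f is monotonic in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid  # the root lies to the left of mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False
    )
    return lower, upper


low, high = clopper_pearson(26, 31)
print(f"{26 / 31:.0%} (95% CI, {low:.0%} to {high:.0%})")
```

The same function can be applied to any of the other proportions reported in this section by changing the two counts.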

Exome sequencing revealed low coverage at CALR (median depth, 10 reads) (Fig. S6 in the Supplementary Appendix). We therefore also performed Sanger sequencing of exon 9, which detected CALR mutations in 10 to 15% of cells (Fig. S6 in the Supplementary Appendix). This confirmed the absence of CALR exon 9 mutations in all 120 patients who had JAK2 or MPL mutations. Of 6 patients initially lacking a mutation in CALR, JAK2, and MPL on exome sequencing, 1 patient was found to have low-level mutated CALR on Sanger sequencing.

Among patients with essential thrombocythemia, those with CALR mutations, as compared with those with JAK2 mutations, presented with significantly higher platelet counts (P<0.001 by the Wilcoxon rank-sum test) and lower hemoglobin levels (P=0.02 by Student's t-test) (Table S4 in the Supplementary Appendix). Patients with CALR mutations had a significantly higher incidence of transformation from essential thrombocythemia to myelofibrosis than did those with JAK2 mutations (P=0.03 by Fisher's exact test) (Table S4 in the Supplementary Appendix). There were no significant between-group differences in rates of survival, although the number of deaths was small.

Exome sequencing showed that 146 of 151 patients with myeloproliferative neoplasms (97%) had mutations in JAK2, MPL, or CALR in a mutually exclusive manner (q<0.01) (Fig. S5 in the Supplementary Appendix). The results of bone marrow histologic analyses were reviewed in 4 of the 5 patients lacking mutations in all three genes and were consistent with a myeloproliferative neoplasm. In addition, 4 of these patients (PD9420, PD8945, PD8635, and PD7441) had clonal somatic mutations, an indication that these patients had a clonal myeloid disorder (Table S2 in the Supplementary Appendix).

Table 1. Frequency of CALR Mutations in Samples Obtained from Patients with Myeloproliferative Neoplasms or Other Disorders and from Controls.

To further characterize CALR mutations in myeloid and other cancers, we performed Sanger sequencing of CALR exon 9 in 1397 samples (including 52 control samples) and evaluated whole-exome or whole-genome sequencing data for 502 solid tumors, 498 control samples, and 1015 cancer cell lines (Table 1). CALR mutations were present in 110 of 158 patients with myeloproliferative neoplasms lacking JAK2 or MPL mutations (70%; 95% CI, 62 to 77), including 80 of 112 patients with essential thrombocythemia (71%), 18 of 32 patients with primary myelofibrosis (56%), and 12 of 14 patients with progression of essential thrombocythemia to myelofibrosis (86%). No CALR mutations were found in 511 myeloproliferative neoplasms with JAK2 or MPL mutations. Secondary acute myeloid leukemia developed in 2 patients with CALR mutations (1 each with essential thrombocythemia and myelofibrosis, both with the K385fs*47 variant), with persistence of the CALR mutations after transformation. CALR mutations were identified in 10 of 120 patients with myelodysplastic syndromes (8%; 95% CI, 4 to 15) (Table 1). One patient each with chronic myelomonocytic leukemia and atypical chronic myeloid leukemia had CALR mutations. No mutations were found in control samples, lymphoid cancers, solid tumors, or cell lines (Table 1, and Table S5 in the Supplementary Appendix).

Overall, CALR exon 9 mutations were identified in 148 patients (Table S6 in the Supplementary Appendix). All mutations were indels with 19 distinct variants: 14 deletions, 2 insertions, and 3 complex indels (Figure 2C and 2D). In these 148 patients, variants L367fs*46 and K385fs*47 were the most common CALR mutations (in 67 patients [45%] and 61 patients [41%], respectively). The remaining 20 patients were found to have 17 unique variants. L367fs*46 was found more frequently in myelofibrosis than was K385fs*47 (P=0.009 by chi-square test).
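The breakdown of CALR variants can be reproduced from the counts given in this paragraph. The sketch below is a simple check written for illustration; the dictionary layout and variable names are assumptions and do not come from the study.

```python
# Counts as reported in the text; names are illustrative only.
total_patients = 148
common_variants = {"L367fs*46": 67, "K385fs*47": 61}
distinct_variants = {"deletion": 14, "insertion": 2, "complex_indel": 3}

# Patients carrying neither of the two common variants.
remaining_patients = total_patients - sum(common_variants.values())

# Distinct variants other than the two common ones.
rare_variant_count = sum(distinct_variants.values()) - len(common_variants)

# Share of patients carrying each common variant, as a whole percentage.
percentages = {
    name: round(100 * count / total_patients)
    for name, count in common_variants.items()
}

print(remaining_patients, rare_variant_count, percentages)
```

The remaining patients outnumber the remaining distinct variants, which implies that at least some of the rarer variants were seen in more than one patient.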

Mutant Protein with a Novel C-Terminal

Calreticulin is a highly conserved protein with multiple reported functions. Within the endoplasmic reticulum, the protein ensures appropriate folding of newly synthesized glycoproteins and modulates calcium homeostasis.29,30 Outside the endoplasmic reticulum, calreticulin is also found in intracellular, cell-surface, and extracellular compartments, where it has been implicated in diverse biologic processes, including proliferation, apoptosis, and immunogenic cell death.31-34 Calreticulin has three main structural and functional domains: an N-terminal lectin-binding domain, a proline-rich P domain that interrupts it, and an unstructured C-terminal acidic domain that contains multiple calcium-binding sites.30,35

Figure 3. Altered Protein Reading Frame with Novel C-Terminal Associated with CALR Mutations. Panel A shows the functional domains of CALR protein, with (from left to right) signal sequence, N domain (N-terminal), P domain (proline-rich), C domain (C-terminal), and KDEL (endoplasmic reticulum retention signal). The conservation of the affected portion of the C domain across species is depicted by shading, with black indicating conserved regions and gray indicating partially conserved regions. The range of frameshift insertion and deletion mutations in CALR exon 9 is shown, all of which result in a common +1 base-pair–altered reading frame and predict a novel C-terminal peptide sequence lacking the KDEL motif. Panel B shows the mutational spectra of CALR in samples of myeloproliferative neoplasms (MPN) and of common loss-of-function genes such as ASXL1 and TET2 in myeloproliferative neoplasms and acute myeloid leukemia (AML). The numbers of each type of mutation are indicated for each gene. Data regarding myeloproliferative neoplasms are from the exome subgroup in this study, and AML data are from the Cancer Genome Atlas.36

All the CALR indel mutations that were detected are predicted to generate mutant proteins with a novel C-terminal (Figure 3A). The extent of the C-terminal alterations varies, but all 19 distinct variants share a loss of a sequence of 27 amino acids with a concomitant gain of a novel peptide consisting of 36 amino acids. These alterations result in the loss of most of the C-terminal acidic domain and the KDEL signal. (The KDEL amino acid sequence [Lys-Asp-Glu-Leu] is present on some resident endoplasmic reticulum proteins and enables retrieval of these proteins from the Golgi apparatus back to the endoplasmic reticulum.) Loss of the KDEL signal raises the possibility of compromised retention or retrieval in the endoplasmic reticulum. All CALR mutations shift the reading frame by one base pair, a pattern of mutation that is very different from that observed in known tumor-suppressor genes such as TET2 and ASXL1 (Figure 3B).36,37 Such genes are affected by nonsense mutations and indels, with the latter generating +1 and +2 base-pair frameshifts that frequently cause premature protein termination. The remarkably stereotypical pattern of CALR mutations implies a strong selective pressure to generate the mutant C-terminal.
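The shared reading frame can be illustrated with modular arithmetic: an indel preserves the reading frame only when its net length change is a multiple of three, and two indels land in the same alternative frame when their net length changes leave the same remainder modulo 3. The sketch below applies this to the two common variants, using the 52-bp deletion and 5-bp insertion reported for them. The function name is hypothetical, and the mapping from remainder to the "+1" or "+2" label is a naming convention that this sketch does not attempt to settle.

```python
def frame_offset(net_length_change_bp):
    """Remainder modulo 3 of an indel's net length change (0 means in frame)."""
    # Python's % always returns a non-negative result for a positive modulus,
    # so deletions (negative changes) and insertions are directly comparable.
    return net_length_change_bp % 3


# Net length changes in base pairs: negative for deletions, positive for insertions.
common_variants = {"L367fs*46": -52, "K385fs*47": 5}

offsets = {name: frame_offset(delta) for name, delta in common_variants.items()}
same_frame = len(set(offsets.values())) == 1

print(offsets, same_frame)
```

A 52-bp deletion and a 5-bp insertion are both congruent to a one-base-pair shift in the same direction, which is why two such different events can produce the same novel C-terminal peptide.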

Figure 4. Localization of Calreticulin and Clonal Heterogeneity in Patients with CALR Mutations. Panel A shows immunoblotting of transiently transfected human embryonic kidney (HEK) 293T cells analyzed for FLAG (a polypeptide protein tag), CALR (detecting endogenous as well as transfected calreticulin), and beta-actin. The CALR deletion is the L367fs*46 variant, and the CALR insertion is the K385fs*47 variant. The CALR expression construct is shown above the immunoblots. CMV-Pr denotes CMV promoter. Panel B shows confocal photomicrographs of COS-7 cells transiently expressing FLAG-tagged CALR variants and a Golgi reporter (galactosyltransferase fused to yellow fluorescent protein [GalT-YFP]). Red indicates FLAG; green, Golgi; and blue, nucleus (4′,6-diamidino-2-phenylindole dihydrochloride [DAPI]). The images show that nonmutant and mutant CALR have an endoplasmic reticulum localization pattern with no increased accumulation in the Golgi. Panel C shows confocal photomicrographs of COS-7 cells transiently coexpressing nonmutant (NM) CALR tagged with green fluorescent protein (GFP) and FLAG-tagged CALR variants. Merge images show that FLAG-tagged CALR variants colocalize with nonmutant CALR to the endoplasmic reticulum. Panel D shows confocal photomicrographs of myeloid cells from CALR mutated and nonmutated granulocyte–macrophage colony-forming unit colonies derived from a patient with essential thrombocythemia with the CALR K385fs*47 mutation; the cells have been stained for protein disulfide isomerase (PDI, a resident protein of the endoplasmic reticulum) and endogenous CALR. Panel E shows flow cytometric analysis indicating the degree of CALR cell-surface expression in granulocytes (G) and lymphocytes (L) from a healthy control and from a patient with mutated CALR. The graph shows the percentage of viable cells expressing cell-surface CALR in peripheral blood from healthy controls (triangles) and patients with mutated CALR (circles). FSC denotes forward scatter. Panel F shows clonal structures in five patients (indicated by their patient-identification numbers) with myeloproliferative neoplasms with mutated CALR, as determined on genotyping of hematopoietic (erythroid) colonies. Each circle represents a clone, with nonmutant clones shown in white and mutant clones in brown. The earliest detectable clone is represented at the top of each diagram, with subsequent subclones shown below. Somatic mutations that were acquired in each subclone are indicated beside the respective nodes and represent those that were acquired in addition to mutations present in earlier subclones. Numbers of colonies that were identified for each node are shown inside the circles. ET denotes essential thrombocythemia, and PET-MF post-ET myelofibrosis.

To confirm that CALR mutations result in a mutant protein product, CALR variant constructs (L367fs*46, deletion, and K385fs*47, insertion) were transiently expressed in human embryonic kidney (HEK) 293T cells. Immunoblotting confirmed the expression of mutant CALR proteins (Figure 4A). Immunofluorescence studies of COS-7 cells showed that mutant CALR did not accumulate in the Golgi apparatus (Figure 4B) and colocalized with exogenously expressed nonmutant CALR (Figure 4C). Similar studies comparing mutated and nonmutated CALR myeloid cells from two patients with CALR-mutated essential thrombocythemia also showed no differences in the pattern of endogenous CALR (Figure 4D).

The up-regulation of cell-surface CALR is reported to mediate the phagocytosis of blasts38 in acute myeloid leukemia and might be a consequence of altered cellular localization of CALR due to loss of KDEL. We measured cell-surface CALR in hematopoietic 32D cells expressing mutant CALR variants (L367fs*46 and K385fs*47) and found no differences from nonmutant CALR expression (Fig. S7 in the Supplementary Appendix). Consistent with these results, there was no significant increase in cell-surface CALR in peripheral-blood leukocytes from five patients with mutated CALR (with estimated tumor burdens of 40 to 100% in granulocytes from exome sequencing), as compared with normal controls (Figure 4E). Our data do not exclude the possibility of partial loss of mutant CALR from the endoplasmic reticulum through the secretory pathway.

CALR Mutations in the Hematopoietic-Stem-Cell Compartment

CALR mutations were identified in patient-derived granulocyte–macrophage colonies and in erythroid colonies (Fig. S8 in the Supplementary Appendix), indicating that the mutations occur in a multipotent progenitor capable of generating both myeloid and erythroid progeny. Consistent with these results, genotyping studies of three patients showed that CALR mutations were present in flow-sorted, highly enriched hematopoietic stem cells (HSC; lin−CD34+CD38−CD45RA−CD90+), common myeloid progenitors (lin−CD34+CD38+CD90−CD10−FLK2+CD45RA−), granulocyte–macrophage progenitors (lin−CD34+CD38+CD90−CD10−FLK2+CD45RA+), and megakaryocyte–erythroid progenitors (lin−CD34+CD38+CD90−CD10−FLK2−CD45RA−) (Fig. S8 in the Supplementary Appendix). These data are consistent with CALR mutations arising in the HSC compartment.
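The immunophenotypes listed above can be written down as a small lookup structure, which makes the differences between the sorted populations easier to see. The sketch below transcribes the marker definitions from the text and adds a hypothetical `classify` helper. It illustrates the marker logic only and does not represent the gating software or the sorting strategy used in the study.

```python
from typing import Dict, Optional

# Markers shared by the three progenitor populations (True = positive, False = negative).
PROGENITOR_GATE = {"lin": False, "CD34": True, "CD38": True, "CD90": False, "CD10": False}

# Marker definitions transcribed from the text.
POPULATIONS = {
    "HSC": {"lin": False, "CD34": True, "CD38": False, "CD45RA": False, "CD90": True},
    "CMP": {**PROGENITOR_GATE, "FLK2": True, "CD45RA": False},
    "GMP": {**PROGENITOR_GATE, "FLK2": True, "CD45RA": True},
    "MEP": {**PROGENITOR_GATE, "FLK2": False, "CD45RA": False},
}


def classify(cell: Dict[str, bool]) -> Optional[str]:
    """Return the first population whose every marker matches the cell, else None."""
    for name, profile in POPULATIONS.items():
        # A marker missing from the cell never matches, so incomplete
        # phenotypes are left unclassified rather than guessed.
        if all(cell.get(marker) == state for marker, state in profile.items()):
            return name
    return None


example_cell = {
    "lin": False, "CD34": True, "CD38": True, "CD90": False,
    "CD10": False, "FLK2": True, "CD45RA": True,
}
print(classify(example_cell))
```

In this representation, the three progenitor populations differ only in FLK2 and CD45RA, whereas hematopoietic stem cells are separated from all of them by being CD38 negative and CD90 positive.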

To ascertain whether mutation in CALR is an early event, we used exome-sequencing data to infer the fraction of cells bearing mutations and predict clonal relationships using a Bayesian Dirichlet process (Fig. S9 in the Supplementary Appendix). The results suggest that CALR mutation was an early event in most patients. However, such predictions should be interpreted cautiously because of the low sequencing coverage of CALR and because the approaches we used may not distinguish distinct tumor subclones with similar tumor burdens. Therefore, the order of mutation acquisition was determined in five patients with mutated CALR by genotyping 300 individual hematopoietic colonies for somatic mutations identified on exome sequencing. In all five patients, CALR mutations arose in the earliest phylogenetic node, consistent with mutation of CALR being an initiating event in these patients (Figure 4F).
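The logic of ordering mutations from colony genotypes can be sketched as follows. Under a simple nested model, in which mutations are never lost and each arises only once, a mutation present in every mutant colony is a candidate initiating event, and one mutation precedes another when the colonies carrying the second are a strict subset of those carrying the first. The code below is a minimal illustration under those assumptions. The colony data and the gene labels `GENE_X` and `GENE_Y` are invented toy values, not results from the study, and the function names are hypothetical.

```python
from typing import Dict, Set


def colonies_with(mutation: str, colonies: Dict[str, Set[str]]) -> Set[str]:
    """Identifiers of the colonies that carry the given mutation."""
    return {cid for cid, muts in colonies.items() if mutation in muts}


def earliest_mutations(colonies: Dict[str, Set[str]]) -> Set[str]:
    """Mutations shared by every mutant colony (candidate initiating events)."""
    mutant = [set(muts) for muts in colonies.values() if muts]
    return set.intersection(*mutant) if mutant else set()


def precedes(a: str, b: str, colonies: Dict[str, Set[str]]) -> bool:
    """True when colonies carrying b are a strict subset of those carrying a."""
    return colonies_with(b, colonies) < colonies_with(a, colonies)


# Toy genotypes for four colonies (hypothetical data).
toy_colonies = {
    "colony_1": set(),                          # nonmutant colony
    "colony_2": {"CALR"},
    "colony_3": {"CALR", "GENE_X"},
    "colony_4": {"CALR", "GENE_X", "GENE_Y"},
}

print(earliest_mutations(toy_colonies))
print(precedes("CALR", "GENE_X", toy_colonies))
```

In the toy data, CALR is the only mutation shared by all mutant colonies, so it sits at the earliest node, with GENE_X and then GENE_Y acquired in successive subclones. Real colony data can show branching rather than strictly nested clones, which this sketch does not resolve.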