We observed that the substitution likelihood of a given amino acid mutation (defined as the nucleotide substitution bias and the individual codon path) is correlated with the clinical prevalence of E255V versus E255K.

Thus, we hypothesized that a mutation from E > K might be more likely to occur than a mutation from E > V. Examining the genetic code, E > K requires a transition from G > A, whereas E > V requires a transversion from A > T. No other single-nucleotide substitutions can cause these two mutations. Although transitions are usually more likely than transversions, mutation biases are known to vary across genes and cancer types. We sought a direct measurement of the mutation bias in ABL1 and CML. To do this, we turned to multiple datasets that have analyzed mutations in cancer and variation in the normal human exome. As an example, the Broad ExAC dataset of 100,000 human exomes (see STAR Methods Data S1 ) can be used to estimate ABL1 mutation bias by focusing on the extremely rare variants that constitute 90% of the variation in the ExAC data. Using this dataset, we examined rare (<1/10,000) mutations in the ABL1 gene ( Figure 1 F, top). Splitting these mutations by nucleotide type on the transcribed strand to account for biases because of transcription, we quantified the nucleotide substitution biases in the ABL1 gene. The measured bias was consistent with a transition-transversion bias; i.e., G > A mutations were more likely than A > T mutations. Moreover, it was largely invariant depending upon the exact mutation types analyzed (synonymous, nonsynonymous, and intronic) and consistent with previous literature. This ABL1 mutation bias was also highly correlated with the mutation biases that were measured across the CML exome ( Figure 1 F, bottom).

Next, we sought to examine why E255K might be more prevalent than E255V. We made BaF3 cells that harbored wild-type, E255V, and E255K BCR-ABL (a common model for BCR-ABL targeted therapies) (), and we examined their relative sensitivity to imatinib. As expected, both mutations provided resistance to imatinib, but E255K provided significantly less resistance than E255V ( Figure 1 D). Importantly, this difference in phenotype occurs at clinically relevant concentrations of imatinib, which has a Cof 5.3 μM and a Cof 2.4 μM at steady state (). Although further investigation into the biophysics of P loop dynamics could reveal the molecular mechanism behind this difference in drug resistance, we aimed to understand how a mutation that is worse at conferring drug resistance might become more prevalent in humans. Because the less drug-resistant variant appeared in the clinic more often, we examined the relative growth rate of E255V and E255K in the absence of drug. If E255V grows significantly slower before imatinib exposure, that might help to explain its lower incidence in a human population. However, the relative growth rates for E255V, E255K, and wild-type BCR-ABL cells were statistically indistinguishable (growth rate ± 95% confidence interval [CI] of 0.98 ± 0.075, 1.03 ± 0.125, and 0.96 ± 0.075, respectively; Figure 1 E), despite having 80% power to detect a 10% growth rate difference. However, we do not rule out drug-free fitness effects on the basis of this measurement alone. We include drug-free growth rates for all mutants as a potential explanatory variable during model selection in Figure 3

We first examined two drug resistance mutations that exist at the same residue in the BCR-ABL oncogene: E255. E255 forms a salt bridge interaction with K247, stabilizing the phosphate binding loop (P loop) that clamps over the ATP competitive inhibitor in the active site of the molecule ( Figure 1 B). E255 can mutate to become either E255V or E255K, both of which abolish the charge interaction and promote clinical resistance to imatinib ( Figure 1 C), a BCR-ABL tyrosine kinase inhibitor (TKI) used to treat patients with chronic myelogenous leukemia (CML). Tallying the abundance of E255V and E255K mutations across clinical studies suggested that E255K was more prevalent (21 patients) than E255V (10 patients) (p < 0.05, chi-square test).

Thus, CML can be placed in the low-heterogeneity regime of the parameter space. Altogether, theory and empirical data support the idea that low-heterogeneity populations are most strongly influenced by mutational likelihood.

These theoretical results point toward the correlation of mutation and codon biases observed in real-world CML resistance incidence. The population structure of CML includes a well-characterized leukemic stem cell population of 10–10cells that gives rise to peripheral differentiated leukemic cells ( Figure 2 D). This hierarchy effectively restricts the population size, because only resistance mutations that occur in leukemic stem cells have any clinical consequence ( Figure 2 F).

We then evaluated these probabilities across a region of the effective population size/mutation rate parameter space to identify parameter regimes in which allele B (the less fit, more likely allele) is more likely to drive relapse. We refer to this phenomenon as survival of the likeliest, and it is strongest in the region of the phase plane with low mutation rates and a small effective population ( Figure 2 C), corresponding to populations with low heterogeneity. Here, where the opportunity for mutation is limited, unlikely mutants may not spawn, regardless of their fitness. Thus, under these conditions, it is more important for a resistance mutant to be likely than to be fit. In contrast, as mutation rate and effective population size increase, allele A is more likely to dominate, reflecting a more standard “survival of the fittest” model. Although this theoretical model was parameterized using values relevant to E255V/E255K, the results were qualitatively similar for other ABL1 variants ( Figure S1 ). Thus, the degree of heterogeneity, as shaped by population size and mutation rate, determines when substitution likelihood can play a driving role in drug resistance epidemiology.

The risk of mutation is modeled as a nonhomogeneous Poisson process in which the mutational hazard rate is a function of the size of the sensitive population responsible for seeding resistant clones. The overall probability of mutation to either resistance variant is given by the convergence of that allele’s cumulative distribution function ( Figure 2 B, top). When both alleles are present, the race to outcompete the other allele considers the allele-specific fitness, as well as the time of mutation, as determined by the probability density function of mutational risk ( Figure 2 B, bottom). By considering the joint probability density function of mutation for the two alleles, it is possible to evaluate the probability that either allele drives relapse when both are present. From this model, we derived an asymptotic solution for the probability that either allele drives relapse. See STAR Methods and GitHub for complete descriptions.

Because E255K was more likely to occur than E255V, but not more fit in the presence or absence of drug treatment, we asked whether it makes theoretical sense that a more likely mutation can beat a more resistant one to create de novo drug resistance. To study this, we developed a probability model for the stochastic evolutionary dynamics of a hypothetical drug target with two possible resistance alleles. Allele A is more resistant (more fit in the presence of the drug) but less likely; allele B is less resistant but more likely ( Figure 2 A). The probability that either allele drives relapse was calculated from the allele-specific probabilities of seeding a detectable resistance clone. The probability that allele A is the dominant allele upon relapse is taken to be the sum of the probability that allele A is the only mutant to spawn and the probability that allele A outcompetes allele B when both mutants spawn; the probability that allele B is the dominant allele upon relapse is similarly formulated.

(C) Parameter space indicating the results of our probability model across many mutation rates and effective population sizes. Color indicates whether allele A (dark blue) or allele B (red) is more likely to drive relapse. In regions where allele B is more dominant, we expect mutation bias to be a primary evolutionary force.

(B) Cumulative distribution function (CDF) (top plot) indicates the probability that a mutation has occurred by time t. As t approaches infinity, the CDF approaches the overall probability of mutation used in the two-way table. The probability density function (PDF) (bottom plot) indicates the instantaneous risk of mutation at time t. This is used to determine the dominant allele in competition, when both alleles are present. The time T indicates the time of detection and treatment initiation. See STAR Methods for more details.

Thus, substitution likelihood, rather than drug resistance, is the more significant variable predicting the relative abundance of the 19 mutants that account for 95% of the clinical mutations in BCR-ABL for patients with CML.

The two-variable model regressed on substitution likelihood and normalized ICvalues demonstrated remarkable accuracy, exhibiting a Pearson correlation between model-predicted and observed values of r = 0.76 ( Figure 3 F). This model showed a marked improvement over a two-variable model built on normalized ICvalues and drug-free growth rates (r = 0.27, Figure 3 G). Although the statistical model in Figure 3 F suggested that the degree of resistance and the substitution likelihood of a particular variant were both significant predictors of clinical abundance, substitution likelihood was the more significant (p = 3.3 × 10for substitution likelihood versus p = 0.016 for IC, .rmd file on GitHub). To verify that our result was not overfit, we evaluated our statistical model against the Sanger Institute test set and found that the same two variables exhibited predictive accuracy (Pearson correlation r = 0.74, Figure S2 D). However, in case our data have overlap with the Sanger Institute data, we also verified that the results were similar when our counts were subtracted from the Sanger counts (r = 0.72, Figure 3 H). In addition, we considered the possibility that a fraction of the observations in our meta-analysis were included in the Sanger dataset. Therefore, we resampled the Sanger data by removing random subsets of our own data. Models built on these resampled data continued to exhibit high correlation between observed and model-predicted counts ( Figure S2 E). The reproducibility in a second dataset highlights the significance of substitution likelihood as the most important predictive variable.

Negative binomial regression was used to predict the incidence of individual resistance mutants across the human population ( Figure S2 C; .rmd file on GitHub). We identified the best N-variable model by leave-one-out-cross-validation (LOOCV). By this metric, a two-variable model built on substitution likelihood and normalized ICvalues outperformed all others ( Figure 3 D). To verify the superior predictive power of these two variables, we also identified a likely independent dataset of different ABL1 mutation frequencies that is curated by the Sanger Institute. Although these mutations have a less clear clinical provenance than our analysis, we decided to use them as a test set, because they had considerably more data. Again, substitution likelihood and normalized ICoutperformed all other combinations of predictor variables as evaluated by LOOCV ( Figure 3 E).

Beyond genetic-background-normalized ICmeasurements, we measured the drug-free growth rates of all ABL1 mutants from 6–10 independent lentiviral infections ( Figure S2 B). Many independent infections were critical, because the growth rates of individual clonal selections exhibit high variance. We also measured the substitution likelihood of a given amino acid conditioned upon a resistance mutation event using the frequencies from the ExAC analysis ( Figure 3 C; see STAR Methods and GitHub).

For each of these 19 amino acids, we generated three independent isogenic BaF3 cell lines and measured the ICof imatinib in each cell line in triplicate across 11 serial drug dilutions ( Figure 3 B). Although this set of 19 clinical point mutations in ABL1 is the most systematically assembled in vitro dataset to date, many of these mutations have had their ICvalues measured in other resistance studies in other labs. Thus, the literature presents us with an opportunity to estimate how genetic context and experimental conditions might affect drug response. Figure S2 A shows that cross-study correlations are high (Pearson’s r = ∼0.9 or higher for four studies), but systematic shifts exist between studies. This might suggest that the genetic drift in cell lines in an individual lab can influence drug potency and is consistent with recent findings (). Thus, we have an opportunity to account for the effects of genetic background upon drug resistance. To do this, we normalized all systematic differences between the datasets into one mean dataset and combined all data from the literature with our data (see STAR Methods Data S3 ). This cross-study approach leverages the genetic drift across cell lines to get the best estimate of the average isogenic effect of BCR-ABL mutations.

To approach this problem more comprehensively, we gathered data from hundreds of parallel clinical evolution experiments in BCR-ABL+ leukemias across four continents over 17 years ( Data S2 ). Specifically, we identified clinical trials with clear clinical resistance criteria and codon-level resolution. 268 high-confidence clinical cases of imatinib resistance were identified in these studies. Tallying mutations at individual amino acid positions, we found that the 19 most abundant amino acid mutations account for ∼95% of the resistance mutations identified ( Figure 3 A). Notably, 85% of the resistance is caused by a mutation other than the fittest (T315I; imatinib IC= 8,711 nM, where ICis defined as the concentration to reach 50% of maximum inhibition).

(H) Observed versus predicted plot (as in F) using the Sanger data with the counts from our meta-analysis omitted, to consider the possibility that the Sanger data included all counts from our curated dataset. The Pearson correlation is 0.72.

(G) Observed versus predicted plot (as in F) for the regression model of clinical mutation prevalence built only on the growth rates in the presence and absence of imatinib. The Pearson correlation is 0.27.

(F) Observed versus predicted plot for a regression model of clinical mutation prevalence (determined by our clinical meta-analysis) regressed against growth rate in the presence of drug and substitution likelihood (which is the final model). Points are specific ABL1 mutations; x values are their observed frequency, and y values are their frequency predicted by the model. The Pearson correlation for observed and predicted prevalence is 0.76.

(D) Model selection for negative binomial regression. LOOCV was used to estimate the test error of the N-variable model with the lowest Akaike information criterion (AIC) for each N. The N = 1 model includes substitution bias, the N = 2 model includes substitution bias and IC 50 , and the N = 3 model includes substitution bias, IC 50 , and growth rate in the absence of drug. The N = 2 model had the lowest estimated mean square error. See .rmd on GitHub for more details.

(C) Table that includes values used to build the final regression models: the frequency of each resistant mutant as determined by our clinical meta-analysis, imatinib ICvalues (in nanomolar) normalized for genetic background, and substitution likelihood calculated from analysis of the Broad ExAC data in Figure 1 F.

(B) Drug-dose response was measured by Cell-Titer Glo and is plotted in the heatmap. n = 3 independent infections for all 20 cell lines (wild type [WT] and mutants). Each BCR-ABL BaF3 line was dosed with 11 serial dilutions of imatinib in triplicate. All cell lines are ordered by sensitivity. For each line, three biological replicates were conducted, each with two technical replicates.

(A) Crystal structure ribbon diagram of the ABL1 kinase domain and distribution of the 19 most prevalent BCR-ABL resistance mutants. These 19 variants account for approximately 95% of resistance mutations observed clinically in the six studies from Figure 1 C.

Detection of BCR-ABL mutations in patients with CML treated with imatinib is virtually always accompanied by clinical resistance, and mutations in the ATP phosphate-binding loop (P-loop) are associated with a poor prognosis.

These results demonstrate that a first-principle, mechanistic model can predict the distribution of mutations seen in the clinic if it contains parameters that account for substitution likelihood. This improvement is consistent across imatinib, dasatinib, and nilotinib and underscores the importance of substitution likelihood in these models.

To appraise the predictive power of the model for other drugs, we repeated this process of measuring ICvalues for all 19 resistance variants, simulating pharmacokinetic profiles, and parameterizing an evolutionary model for second-generation BCR-ABL inhibitors nilotinib and dasatinib. Both TKIs have been previously evaluated in frontline clinical trials (). However, the count data for clinical resistance to nilotinib and dasatinib are sparse (0–3 patients for most variants) and suffer from wide CIs. Thus, to more completely evaluate our models’ predictive power in the context of these two drugs, we examined categorical accuracy, i.e., the ability to say whether a mutation should or should not show up in a clinical trial. To do this, we downsampled our dasatinib and nilotinib simulations to reflect the size of the respective frontline trial. These virtual clinical trial results informed a binary classifier model—if a mutation was present, it was categorized as resistant; if it was not present, it was categorized as sensitive. We repeated this for 10simulations, each time constructing a receiver operating characteristic (ROC) curve to quantify the accuracy of the binary classifier against the real-world clinical data. Averaging the ROC curves across all simulations ( Figures 4 E and 4F), we found that substitution likelihood considerably improved the models’ ability to classify a variant as a resistance mutation (area under the curve AUC = 0.79 without versus 0.91 with mutation bias for dasatinib; AUC = 0.65 versus 0.82 for nilotinib). Given that these predictive models were not fit to patient data, they could be used to forecast which resistance mutations will be identified in a CML clinical trial a priori.

Importantly, we ran simulations for a null model without substitution likelihood, in which individual amino acid changes were assigned the same probability, and compared its predictive value to that of a model parameterized with substitution likelihood. In examining the distribution of mutations across the ABL1 kinase, a model that considered substitution biases was required to accurately predict the abundances of individual mutations ( Figures 4 C and 4D; r = 0.21 without versus r = 0.84 with substitution likelihood).

We simulated this system of 60 differential equations stochastically ( Figure 4 B) for 10,000 virtual CML patients treated with imatinib (see STAR Methods ). Stochasticity allows the potential for random mutation to seed each resistant variant. Each patient was assigned a pharmacokinetic profile from clinically observed distributions of in vivo drug concentration (). Similarly, patient-specific tumor detection sizes were drawn from real-world distributions (). Across these 10,000 simulations, we totaled the number of in silico patients who relapsed with each mutation (see STAR Methods ).

Importantly, this methodology does not rely on fitting the model to the prevalence of specific resistance mutations. Therefore, it only requires a mechanistic model of treatment (i.e., birth/death rates), a list of candidate mutations (which could be generated from structure-based design or mutagenesis) ( Figure S3 A), measurements of substitution likelihood, and a tractable in vitro system to measure the effects of putative resistance mutations.

We first identified a simple, clinically parameterized model of CML treatment that incorporates hematopoietic stem cell division and differentiation (). To adapt the system to our question, we added our 19 resistance variants of interest to the model ( Figure 4 A). Each mutant was parameterized with an allele-specific substitution probability and a drug kill term. We used cell-based measurements in the presence of human serum to appropriately scale drug exposures to the effective in vivo levels (see STAR Methods ).

(H) Maxitinib simulation results. Orange bars represent the resistance incidence of each maxitinib chemotype relative to imatinib. Red points indicate mutational liability, defined as the sum of conditional substitution likelihoods of mutations that confer resistance to each chemotype.

(G) Resistance profiles for versions of a hypothetical drug, maxitinib. Maxitinib K1 targets the first through fifth most likely mutants, K2 targets the second through sixth most likely mutants, and so on.

(F) ROC curve for simulations using parameters for nilotinib, as in (E). Again, the model with mutation bias was more predictive than the one without mutation bias (AUC = 0.82 with bias versus AUC = 0.65 without bias).

(E) ROC curves indicating the accuracy of simulation results for the stochastic model using parameters for dasatinib. The model-predicted allele frequency was used to train a binary classifier of clinical resistance, and its predictive power was evaluated against the true positive and negative classes as identified in a frontline dasatinib clinical trial. The stochastic model that included mutation bias (AUC = 0.92) outperformed that without bias (AUC = 0.79).

(A) Schematic of a stochastic CML evolutionary dynamic model. The initial deterministic model of three differential equations (shaded in light blue) is fromand is fit to phase III clinical data. We reformulated a stochastic version of 60 differential equations parameterized fromand our clinical data. Leukemic stem cells alternate between the proliferating state (P-LSC) and the quiescent state (Q-LSC). P-LSCs give rise to differentiated leukemic cells (DLCs). P-LSCs may also spawn a resistant subclone P-LSCwhen dividing. The allele-specific mutational probability is given by ρ. We added the ability for all 19 resistance mutations to occur, such that there are 20 sets of differential equations with three populations per mutant. The system is solved stochastically.

Our experiments and epidemiological analysis indicated that an understanding of substitution likelihood is necessary to predict the clinically observed distribution of specific resistance mutations. Because of this, we wanted to know whether we could predict the frequency of mutations from a mechanistic model built from first principles.

Considerations relating to structure, conformation, and intermolecular interactions complicate drug design and would likely preclude the development of a drug that targets exactly the N most likely resistance mutants. However, as small molecules are screened and tradeoffs are identified, evolutionary models could choose a molecular profile that predicts less resistance when used in the clinic. Our models suggest that designing molecules in this way could create second-generation inhibitors that minimize drug resistance across a population ( Figure S3 C).

The model predicts a 67% reduction in resistance incidence relative to imatinib for maxitinib K1, the hypothetical chemotype that targets the five most likely mutants ( Figure 4 H). This reduction is larger than either nilotinib or dasatinib (even though at least as many mutations confer resistance to maxitinib K1 as to nilotinib or dasatinib), and predicts that the evolutionarily optimal chemotype would prevail in the clinic. At the same time, maxitinib K15, which targets the five least likely mutants, offers virtually no advantage over imatinib. To be sure that the most prevalent resistance allele, T315I, was not dominating the observed effect, we removed T315I from the model and reperformed the simulations. Under these conditions, there was still high agreement between resistance incidence and mutational liability across the various chemotypes ( Figure S3 B).

To test this, we simulated a cohort of in silico frontline CML patients for several target profiles of our hypothetical drug maxitinib; we name these distinct versions of maxitinib K1 to K15. Each of the hypothetical drugs, K1–K15 denote a version of maxitinib that was designed to target a different set of five of the 19 previously discussed imatinib-resistant mutants. The top five most likely mutants are sensitive to maxitinib K1, the second through sixth most likely mutants are sensitive to maxitinib K2, and so on ( Figure 4 G; see STAR Methods and GitHub for a model description).

To further investigate how an evolutionarily informed approach to drug design might affect the clinical prevalence of resistance, we conceived of a hypothetical BCR-ABL TKI, which we call maxitinib, designed with mutational liabilities in mind. Given the same number of mutational vulnerabilities in a target (here we use five), maxitinib would be designed via structure-based drug design to target the five most likely mutants. If this could be achieved, how would that alter the overall incidence of resistance that arises in the clinic?

Mutation Biases Can Affect the Epidemiology of Resistance for Diseases Treated in the Adjuvant Setting

Predictive modeling of resistance in pathologies beyond CML requires an understanding of what evolutionary forces determine the resistance allele prevalence in disease-specific settings. Although CML is a dramatic example in which substitution likelihood plays a dominant role in resistance evolution, a small-tumor-initiating population is not the only condition that restricts effective population sizes in cancer. In adjuvant therapy, a large tumor is surgically debulked before systemic treatment is initiated. Population bottlenecks created by surgical resection can considerably reduce the genetic diversity of tumor cells. We therefore hypothesized that mutation bias could shape the development of resistance to drugs used in the adjuvant setting.

9–1012 cells, modeling a range of dissemination), at which point most of the tumor is excised and the remaining population (105–108 cells, modeling completeness of excision) is treated with an inhibitor. During division events, sensitive cells can spawn one of ten resistant alleles, each with randomly preassigned allele-specific resistance and mutational probability parameters. For each combination of tumor size pre- and post-surgery, 50,000 patients were simulated, and their dominant resistance alleles were tallied upon relapse. The Spearman rank correlation of allele frequency with the degree of drug resistance (7 or fewer cancer cells remained following resection, the allele frequency at relapse was more highly correlated with substitution bias than the degree of drug resistance conferred ( Figure 5 Mutation Bias Affects the Epidemiology of Resistance for Diseases Treated in the Adjuvant Setting Show full caption (A) Adjuvant therapy evolutionary model example results. Simulation results for population before resection M pre = 109 and after resection M post = 105 (left) and for M pre = 1012 and M post = 108 (right). Points represent specific resistance variants, and ρ values are the Spearman rank correlation between frequency and degree of resistance (top) and frequency and substitution likelihood (bottom). (B) Summary of simulation results for various values of M pre and M post . Colors represent the difference in Spearman correlation (substitution likelihood ρ − resistance ρ) for each set of parameters. (C) Schematic detailing clinical meta-analysis. For drug targets in GISTs, prostate cancer, and breast cancer, acquired resistance mutations were tallied and classified by pyrimidine substitution. (D) Observed resistance allele distribution for the three cancer types. Colors indicate whether the nucleotide substitution associated with each mutation is a transition or a transversion. (E) Distribution simulated by reassigning the transition-transversion class of each variant. The transition frequency in each simulation was calculated. The histogram shows the distribution of those frequencies. The red arrow indicates the transition frequency for the observed data (0.6). The associated empirical p value is 0.008, suggesting that the observed data cannot be explained by a null model with no differences in probability between transitions and transversions. (F) Distribution as in (E), but with mutation classes (e.g. C > A, C > G, etc.). (G) Combining the counts from the three adjuvant diseases with those of ALK+ and EGFR+ NSCLC (two nonadjuvant indications) decreases the significance detected by the model. This suggests that the observed significance in the adjuvant diseases is not an artifact of aggregating data and large sample size. Using a mechanistic model of tumor evolution, we simulated tumors treated with adjuvant therapy (see STAR Methods ). In the model, a tumor grows (until 10–10cells, modeling a range of dissemination), at which point most of the tumor is excised and the remaining population (10–10cells, modeling completeness of excision) is treated with an inhibitor. During division events, sensitive cells can spawn one of ten resistant alleles, each with randomly preassigned allele-specific resistance and mutational probability parameters. For each combination of tumor size pre- and post-surgery, 50,000 patients were simulated, and their dominant resistance alleles were tallied upon relapse. The Spearman rank correlation of allele frequency with the degree of drug resistance ( Figure 5 A, top) and allele frequency with substitution bias ( Figure 5 A, bottom) were calculated for each combination of population size pre- and post-surgery. The simulation results show that when 10or fewer cancer cells remained following resection, the allele frequency at relapse was more highly correlated with substitution bias than the degree of drug resistance conferred ( Figure 5 B).

2, To evaluate our theoretical results, we turned to clinical data for indications for which adjuvant therapy is often used. In particular, we analyzed on-target resistance mutations in the androgen receptor (AR) for prostate cancers with AR antagonists, in ESR1 for breast cancers treated with estrogen receptor (ER) antagonists, and in c-KIT for GISTs treated with imatinib ( Figure 5 C; Data S4 ). Some characteristics of the available clinical data complicate the analysis. Resistance data for these diseases were generally not delimited by whether the drug was administered adjuvantly, though we know based upon patient selection criteria that adjuvant patients were included in our cohorts. This ambiguity gives rise to a less pure, and likely more conservative, dataset, because our meta-analysis likely includes some patients with advanced metastatic disease. This uncertainty, combined with drug fitness data being unavailable for most resistance variants in the three drug targets, currently precludes an analysis identical to the one made in CML described in Figures 1 3 , and 4

50 data and the population allele frequency for these cancers ( 50 and clinical abundance ( We first noted that in vitro drug-sensitivity data are available in the literature for the most abundant resistance mutations for the three adjuvant diseases. The single-most clinically abundant allele was not the most drug resistant, contrary to what would be expected under a pure “survival of the fittest” model of evolution. Furthermore, there was poor correlation between the available ICdata and the population allele frequency for these cancers ( Figure S4 A). This suggests that evolutionary forces beyond degree of resistance may shape the distribution of resistance alleles. We performed a similar analysis for on-target resistance mutations in ALK+ non-small-cell lung carcinoma (NSCLC) patients treated with crizotinib and in EGFR+ NSCLC patients treated with erlotinib or gefitinib. In NSCLC, targeted therapy is generally administered in the absence of surgical resection; thus, we would expect the larger effective population size to favor fitness over substitution likelihood ( Figure 2 F). For these two indications, the most frequent variant was also the most drug-resistant, and there was higher correlation between ICand clinical abundance ( Figure S4 B). Although these data are suggestive of complexity, they are not direct evidence that mutation bias influences resistance evolution in diseases treated in the adjuvant setting.

Payne et al., 2019 Payne J.L.

Menardo F.

Trauner A.

Borrell S.

Gygli S.M.

Loiseau C.

Gagneux S.

Hall A.R. Transition bias influences the evolution of antibiotic resistance in Mycobacterium tuberculosis. Stoltzfus and McCandlish, 2017 Stoltzfus A.

McCandlish D.M. Mutational Biases Influence Parallel Adaptation. Given the constraints of the clinical data, we turned to an established technique for detecting mutational biases. The method has been used previously in cases of adaptive evolution within and across species (). Specifically, the statistical test was developed to detect a simple case of mutation bias: transition-transversion bias. The approach involves aggregating observations of parallel evolution—in this case, drug resistance in cancer patients—and tallying the number of patients who develop each resistance mutation to determine the distribution of mutations (as in Figure S5 A). Each mutation is classified according to the underlying nucleotide substitution type (transition or transversion), and the observed patient-level transition frequency is noted. Then, this distribution of mutations is simulated under the null assumption that mutation bias does not influence evolution. That is, the dataset is simulated to reflect the expected outcome when variation in mutation frequency is decoupled from mutation bias. To do this, we reassign each mutation as a transition or a transversion according to its expected probabilities given no bias in mutation, and then we calculate the patient-level transition frequency. If the observed transition frequency is more extreme than those generated by the model, there is evidence that mutation bias influences evolution in the case at hand.

Payne et al., 2019 Payne J.L.

Menardo F.

Trauner A.

Borrell S.

Gygli S.M.

Loiseau C.

Gagneux S.

Hall A.R. Transition bias influences the evolution of antibiotic resistance in Mycobacterium tuberculosis. Stoltzfus and McCandlish, 2017 Stoltzfus A.

McCandlish D.M. Mutational Biases Influence Parallel Adaptation. 10 cells pre-resection and 106 cells post-resection, and then we simulated these conditions with and without mutation bias. For the virtual cohort generated by simulations that included mutation bias, we found that the statistical test detected an overabundance of transitions (p = 0.001, Although this statistical analysis is the literature standard for detecting mutational biases in parallel adaptive evolution (), we noted that it had not previously been validated with simulated data. Therefore, we revisited our stochastic model of adjuvant therapy, for which we know that for small post-surgery population sizes, allele frequency is more correlated with mutation bias than with resistance. We considered the case with 10cells pre-resection and 10cells post-resection, and then we simulated these conditions with and without mutation bias. For the virtual cohort generated by simulations that included mutation bias, we found that the statistical test detected an overabundance of transitions (p = 0.001, Figures S5 A–S5C). Conversely, the results of our control case simulated without mutation bias were not significant (p = 0.53, Figures S5 D–S5F). Thus, our simulations confirmed the utility of the null model in detecting mutation bias, even with other confounding factors such as fitness.

We then turned to our meta-analysis for diseases treated with adjuvant therapy. The resampling technique indicated that the observed transition-transversion ratio ( Figure 5 D) exceeded those generated under the null model (p = 0.008, Figure 5 E), suggesting that mutation bias influences the evolution of resistance in these diseases. We then expanded the resampling technique to develop a model of mutation bias that was more similar to our CML analysis by considering the six possible pyrimidine substitutions (C > A, C > G, C > T, T > A, T > C, and T > G, hereafter called mutation classes). We performed this similar analysis of simulating the resistance distribution by randomly reassigning the mutation class associated with each variant and tallying the number of patients in each class (simulated data in Figures S5 G–S5N). We then calculated the Mahalanobis distance from the mean for both the simulated and the observed data ( Figure S5 O) and found that the observed data were more extreme than those simulated under the null model (p = 0.045, Figure 5 F). See STAR Methods for more details.

Stoltzfus and McCandlish, 2017 Stoltzfus A.

McCandlish D.M. Mutational Biases Influence Parallel Adaptation. This effect, although significant, is still not as strong as in ABL1+ CML (p = 0.018, Figure S5 P). As previously noted, the data gathered across clinical trials likely include patients who did not receive adjuvant care, diluting the effect of restricted population sizes. Furthermore, even for those tumors that were treated in the adjuvant setting, it is likely that the effective population size and mutation rates in these diseases place them closer to the border of the phase change in Figure 2 F. Given the understanding that mutation bias is likely to play a somewhat weaker role in resistance evolution for adjuvant diseases, we had aggregated the observations from prostate cancer, breast cancer, and GISTs to maximize our power to detect an effect. Prior studies that use this resampling technique also aggregated their data (), combining observations across many taxa, mutation rates, and putative fitness effects. Even so, we wanted to confirm that our rejection of the null hypothesis was not an artifact of a large, aggregated dataset. To address this, we considered the distribution of resistance mutations in ALK+ and EGFR+ NSCLC, two nonadjuvant indications, as well. The aggregated NSCLC data are not significant on their own (p = 0.45). Furthermore, combining the NSCLC counts to our dataset for adjuvant diseases does not simply make the results more significant (p = 0.44 for all diseases versus p = 0.045 for adjuvant diseases alone, Figure 5 G). This suggests that for the adjuvant diseases, the observed significance was not an artifact of aggregated data and increased sample size. This result is supported by a forward selection method used to determine the combinations of disease that show significant evidence of mutation bias. Such an approach identifies the most significant aggregated dataset as the one that includes the adjuvant diseases and excludes the nonadjuvant diseases ( Figure S5 Q; see STAR Methods for details).

These results indicate that the approach outlined here is a sensitive and specific method to identify cases in which substitution likelihood shapes drug resistance evolution. Such cases represent indications for which next-generation drug design could be informed by analysis of mutation biases and substitution likelihood.