Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Ayelet Erez (ayelet.erez@weizmann.ac.il).

SKOV, U2OS and HepG2 cell lines were purchased from the ATCC. LOX IMVI was purchased from NCI60 (RRID:CVCL1381). The MC38 cell line, derived from mouse colon adenocarcinoma, was kindly provided by Dr. Eran Elinav, Department of Immunology, The Weizmann Institute of Science. The CT26 cell line, derived from mouse colon cancer, was kindly provided by Professor Avigdor Scherz, Department of Plant Sciences, Weizmann Institute of Science. OTC- and CPS1-deficient fibroblasts were purchased from the Coriell Institute for Medical Research. Cells were cultured using standard procedures in a 37°C humidified incubator with 5% CO2 in Dulbecco’s Modified Eagle’s Medium (DMEM) (Gibco) or RPMI 1640 medium (Gibco) supplemented with 10%–20% heat-inactivated fetal bovine serum, 10% pen-strep and 2 mM glutamine. The 4T1 cell line, derived from mouse breast cancer cells, as well as MCF10A and its medium, were kindly provided by Professor Yossi Yarden, Department of Biological Regulation, Weizmann Institute of Science. All cells were tested routinely for Mycoplasma using the Mycoplasma EZ-PCR test kit (#20-700-20, Biological Industries, Kibbutz Beit Ha’emek).

Urea analysis: To estimate age-stratified urea background levels, we pooled data from n = 1,363,691 patients in the de-identified Clalit Health Services electronic health record (code 0194-17-COM2), and n = 100 de-identified pediatric cancer patients on their day of admission to the Pediatric Hemato-Oncology Department at Sourasky Medical Center (code 0016-17).

All patients’ urine samples were obtained upon informed consent and with evaluation and approval from the corresponding ethics committee (CEIC code OHEUN11-12 and OHEUN14-14). Patients included in the study were men diagnosed with prostate adenocarcinoma, and the inclusion criterion was being scheduled for surgery as anticancer treatment. Samples were collected between 2012 and 2016.

Animal experiments were approved by the Weizmann Institute Animal Care and Use Committee following US National Institute of Health, European Commission and Israeli guidelines (IACUC 21131015-4). To generate syngeneic mouse cancer models, 8-week-old C57BL/6, SCID or BALB/c male and female mice were purchased from Envigo and randomly assigned to experimental groups.

Method Details

Deregulation of urea cycle components in TCGA samples

We downloaded TCGA gene expression profiles of 7,823 patients (and 4,723 GTEx healthy tissue samples) encompassing 25 cancer types for which there are corresponding healthy control samples via the UCSC Xena browser ( https://xena.ucsc.edu/ ; see Data and Software Availability ). We compared the expression of the 6 genes involved in the urea cycle (ASL, ASS1, CPS1, OTC, SLC25A13, and SLC25A15) in these cancer versus healthy tissue samples using a one-sided Student’s t test in the direction of UC gene dysregulation, as described below. Components with significant fold changes in specific tumor types are presented in Figure 1 B. A comparable but less prominent trend was observed with a two-sided test. Arginase was excluded because its addition had no effect on any of the evaluated parameters (data not shown).

Urea cycle deregulation is a result of coordinated alterations in urea cycle enzyme activities, where CPS1 and SLC25A13 tend to be upregulated, while ASL, ASS1, OTC and SLC25A15 tend to be downregulated, to increase substrate supply to CAD and enhance pyrimidine synthesis. For each sample, we then determined its UC gene dysregulation score (UCD-score), a weighted sum of the expression of the 6 UC genes in which a weight of 1 or −1 was assigned to each gene according to the direction implied by the UCD signature defined above (the score can therefore take positive or negative values), i.e.

$$\text{UCD-score} = -ASL - ASS1 + CPS1 - OTC + SLC25A13 - SLC25A15, \tag{1}$$

where the names of genes denote their gene expression levels. We divided the tumor samples equally into 5 bins based on the UCD-score and compared the CAD expression across these bins (where the CAD expression is rank-normalized in each cancer type to control for cancer types) using a Wilcoxon rank sum test.
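As an illustration of Equation 1 and the binning step, the following is a minimal sketch rather than the authors' code. The function names `ucd_score` and `equal_bins` are hypothetical, the input is assumed to be one expression value per gene per sample, and breaking ties by input order is an assumption.

```python
from typing import Dict, List

# Signs follow Equation 1: +1 for genes that tend to be upregulated in UCD
# (CPS1, SLC25A13) and -1 for genes that tend to be downregulated.
UCD_WEIGHTS = {
    "ASL": -1, "ASS1": -1, "CPS1": 1,
    "OTC": -1, "SLC25A13": 1, "SLC25A15": -1,
}


def ucd_score(expression: Dict[str, float]) -> float:
    """Signed sum of the six urea cycle gene expression values (Equation 1)."""
    return sum(weight * expression[gene] for gene, weight in UCD_WEIGHTS.items())


def equal_bins(scores: List[float], n_bins: int = 5) -> List[int]:
    """Assign each sample to one of n_bins equally sized bins by score rank.

    Bin 0 holds the lowest scores. Ties are broken by input order, which is
    an assumption made for this sketch.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    bins = [0] * len(scores)
    for rank, sample_index in enumerate(order):
        bins[sample_index] = rank * n_bins // len(scores)
    return bins
```

CAD expression would then be compared between the resulting bins; the rank normalization within each cancer type is not shown here.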

Joint transcriptomic and metabolomics analysis of tumor samples

We analyzed recently published data of joint transcriptomic and metabolomic measurements across 58 breast cancer (BC) tumors with controls (Terunuma et al., 2014) and 29 such samples in hepatocellular carcinoma (HCC) (Roessler et al., 2012) to further study the association between UCD and metabolite levels in clinical samples. For each tumor sample, we computed a score denoting the ratio of pyrimidine to purine metabolite levels in the given sample. We then divided the samples into two groups based on their UCD-scores and performed a Wilcoxon rank sum test comparing the two groups, in each of these two cancer types.
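The comparison described above can be sketched as follows. This is a rough illustration under stated assumptions: the text does not say how the pyrimidine-to-purine ratio aggregates individual metabolites (summing is assumed here), and the rank sum test below uses a normal approximation without tie correction of the variance. Both function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def pyrimidine_purine_ratio(pyrimidines: Sequence[float],
                            purines: Sequence[float]) -> float:
    """Ratio of summed pyrimidine to summed purine levels in one sample.

    Summing within each class is an assumption of this sketch.
    """
    return sum(pyrimidines) / sum(purines)


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Wilcoxon rank sum (Mann-Whitney U) test, normal approximation.

    Returns (U statistic for x, two-sided p value). Tied values receive
    midranks, but the variance is not tie-corrected.
    """
    pooled = sorted((value, group) for group, values in enumerate((x, y))
                    for value in values)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return u, math.erfc(abs(z) / math.sqrt(2))
```

In use, the ratio would be computed per tumor sample, the samples split by UCD-score, and the two groups of ratios passed to `rank_sum_test`. For small groups an exact test would be preferable to the normal approximation.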

Urea cycle dysregulation in the progression of melanoma

We analyzed gene expression data of four types of skin samples from human subjects, namely normal skin (n = 8), nevi (n = 9), primary (n = 31) and metastatic tumor samples (n = 73) (Kabbarah et al., 2010). The UCD-score was calculated in each sample and compared between the four distinct types of samples in the order of progression using a Wilcoxon rank sum test. In comparing the primary and metastatic tumor samples, we controlled for patient age through a multi-class rank sum test using the R library ‘nestedRanksTest’ (Thompson et al., 2014).

TCGA whole exome-seq analysis

We downloaded TCGA mutation profiles for 4,858 tumors encompassing 16 cancer types from cbioportal (Gao et al., 2013) on Feb 1, 2017. For mutation analysis, we used the data from cbioportal because it integrates the mutation analysis from different TCGA centers, avoiding center-specific bias in mutation calls. (For the analyses described in the STAR Methods Section “ Deregulation of urea cycle components in TCGA patient samples, ” we used the TCGA data from the UCSC Xena browser [ https://xena.ucsc.edu/ ] because it includes TCGA RNAseq data normalized together with GTEx healthy tissue samples, which are absent in cbioportal.) We focused on 4,422 samples that have more than 20 point mutation events (where a sufficient number of R->Y and Y->R nonsynonymous mutation events is expected), since our analysis (as described below) requires a sufficient number of mutation events in each sample. This results in 758,282 single point mutation events (including 535,296 nonsynonymous mutations) in 16 cancer types (see Data and Software Availability ).

To study the Pyrimidine-rich Transversion Mutational Bias (PTMB), we consider the fraction of transversions from puRines (R) to pYrimidines (Y), f(R->Y), which denotes the ratio of R->Y point mutations to all point mutations on the DNA sense strand in a given sample. The fraction of transversions from pyrimidines to purines, f(Y->R), is defined in an analogous manner. PTMB is defined as the difference between the two fractions, i.e.

$$\text{PTMB} = \frac{N(R \to Y) - N(Y \to R)}{\text{mutational load}} = f(R \to Y) - f(Y \to R), \tag{2}$$

where N(R->Y) and N(Y->R) denote the number of R->Y and Y->R single nucleotide polymorphisms (SNPs) on the DNA sense strand, respectively, and ‘mutational load’ is the total number of SNPs in a given sample. PTMB was calculated using nonsynonymous SNPs unless explicitly denoted otherwise. A sample was marked as PTMB-biased if its PTMB level was greater than zero, and as significantly UC-dysregulated if its UCD-score was greater than the median rank-normalized UCD-score of the corresponding healthy tissue ( Table S1 ). We analyzed the association between UCD and PTMB using two different approaches: (1) we compared PTMB in UC dysregulated samples (UC-Dys; top 45% UCD scores) versus UC intact samples (UC-WT; bottom 45% UCD scores) at the pancancer level using a Wilcoxon rank sum test.
We considered PTMB derived from (i) nonsynonymous SNPs, (ii) all SNPs (including both synonymous and nonsynonymous), and (iii) non-exon SNPs (including introns, UTR, intergenic region, splice sites and transcription start sites). For non-exon SNPs, we considered 175 TCGA samples having more than 20 non-exon mutations in each sample. We confirmed that the association between UCD and PTMB is not a mere consequence of UCD samples carrying more purines on the original sense strand (or UC-intact samples carrying more pyrimidines on the original sense strand) (Wilcoxon rank sum p > 0.3).

We examined the UCD/PTMB association while controlling for each of the 30 mutational signatures ( https://cancer.sanger.ac.uk/cosmic/signatures ) from the Catalogue of Somatic Mutations in Cancer (COSMIC), one at a time, using an extension of the Wilcoxon rank sum test for multi-class data implemented in the R library ‘nestedRanksTest’ (Thompson et al., 2014). For each signature, we divided the samples into two classes depending on the exposure of the mutational signature in the given sample (using the median contribution level as threshold). We then performed a multi-class Wilcoxon rank sum test for the association of UCD-score with PTMB, taking into account different cancer types (FDR-corrected p < 0.05 for all mutational signatures). To make sure UCD itself is not associated with the known mutational signatures, we evaluated the association between UCD-score and the exposure to the 30 mutational signatures in each cancer type. When only the mutational signatures enriched with transversion mutations were considered in each cancer type (COSMIC mutational signatures 3, 4, 5, 6, 8, 9, 10, 13, 14, 17, 18, 20, 28, 29; https://cancer.sanger.ac.uk/cosmic/signatures ), we found that UCD significantly correlates with mutational signatures 5 and 6 in pancreatic cancer (Spearman R = 0.88 for both, FDR < 0.007 and 0.01, respectively); these constitute a very small fraction of the space of possible associations.

(2) We analyzed the correlation across cancer types between the median UCD-scores and median PTMB levels of each cancer type. To evaluate the potential confounding effect of smoking (whose mutational signature is enriched with transversion mutations), we obtained the smoking annotation of TCGA samples from Alexandrov et al. (2016), and repeated the same analysis while removing (i) two lung cancer types (LUAD and LUSC), (ii) smokers in the two lung cancer types and (iii) all smoker samples. We checked whether there is a significant increase in the correlation between the UCD and PTMB scores after removal (e.g., Figure S5 C) compared to that before removal ( Figure 5 B). When all smokers are removed from the analysis, a sufficient number of samples (n > 10) remains for 9 out of 16 cancer types ( Figure S5 E); nevertheless, we still observe a strong, though marginally significant, correlation between UCD-scores and PTMB across cancer types.
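The PTMB definition in Equation 2 can be illustrated with a short sketch. The function name `ptmb` and the input format, a list of (reference, alternate) bases already oriented to the DNA sense strand, are assumptions made for illustration and do not describe the authors' pipeline.

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}      # R
PYRIMIDINES = {"C", "T"}  # Y


def ptmb(snvs: Iterable[Tuple[str, str]]) -> float:
    """PTMB = f(R->Y) - f(Y->R) for one sample (Equation 2).

    snvs: (ref, alt) base pairs on the DNA sense strand. The denominator
    is the total number of point mutations (the mutational load).
    """
    n_ry = n_yr = total = 0
    for ref, alt in snvs:
        total += 1
        if ref in PURINES and alt in PYRIMIDINES:
            n_ry += 1
        elif ref in PYRIMIDINES and alt in PURINES:
            n_yr += 1
    if total == 0:
        raise ValueError("sample has no point mutations")
    return (n_ry - n_yr) / total
```

A sample with two R->Y changes, one Y->R change and one transition has PTMB = (2 − 1) / 4 = 0.25, and would be marked PTMB-biased because the value is greater than zero.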

Selective advantage of purine-to-pyrimidine mutation in UC dysregulated tumor

We assessed the strength of selection (dN/dS), the ratio between the rate of nonsynonymous substitutions (dN) and the rate of synonymous substitutions (dS), for different types of mutations. Generally, dN and dS are computed in two steps: (i) assessing the number of nonsynonymous substitutions (N) per nonsynonymous site (pN = N/n; n is the number of N sites in the compared region) and the number of synonymous substitutions (S) per synonymous site (pS = S/s; s is the number of S sites in the compared region), and (ii) applying methods that transform pN to dN and pS to dS, accounting for the possibility that a given site is mutated more than once (e.g., Nei and Gojobori [1986]). However, because during tumor evolution the probability that a single site is mutated more than once is low (Beerenwinkel et al., 2015; Kryazhimskiy and Plotkin, 2008), we approximate the rates dN and dS by pN and pS, respectively. Specifically, to assess the selection advantage (SA) of R->Y mutation relative to all other types of mutations (i.e., R->R, Y->R and Y->Y), we used the following formula:

$$SA = \frac{(dN/dS)_{R \to Y}}{(dN/dS)_{\text{others}}} = \frac{(pN/pS)_{R \to Y}}{(pN/pS)_{\text{others}}} = \frac{N_{R \to Y}/S_{R \to Y}}{N_{\text{others}}/S_{\text{others}}}, \tag{4}$$

where (dN/dS)_{R->Y} and (dN/dS)_{others} denote the selection of R->Y mutations and of all other types of mutations, respectively. We then compared the selection advantage of R->Y mutation specific to UCD by calculating the ratio between the selection advantages of UC dysregulated versus UC intact samples. We considered TCGA samples that have at least 20 mutations (to focus on the samples with sufficient nonsynonymous (N > 15) and synonymous (N > 5) SNPs), leading to 1,313 samples in 16 cancer types.
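A worked sketch of Equation 4 may help make the ratio-of-ratios structure concrete. The function names and the example counts are hypothetical; the code takes the final form of Equation 4 (counts of nonsynonymous and synonymous mutations) at face value.

```python
def selection_advantage(n_ry: int, s_ry: int, n_other: int, s_other: int) -> float:
    """SA of R->Y mutations relative to all other mutation types (Equation 4).

    n_* are nonsynonymous counts and s_* are synonymous counts, for R->Y
    mutations and for all other mutation types respectively.
    """
    return (n_ry / s_ry) / (n_other / s_other)


def ucd_specific_advantage(sa_dysregulated: float, sa_intact: float) -> float:
    """Ratio of SA in UC dysregulated samples to SA in UC intact samples."""
    return sa_dysregulated / sa_intact


# Hypothetical counts, for illustration only.
sa_dys = selection_advantage(n_ry=30, s_ry=10, n_other=40, s_other=20)  # 3.0 / 2.0
sa_wt = selection_advantage(n_ry=20, s_ry=10, n_other=40, s_other=20)   # 2.0 / 2.0
relative = ucd_specific_advantage(sa_dys, sa_wt)
```

With these made-up counts, R->Y mutations would appear 1.5 times more favored in the dysregulated group than in the intact group. Zero synonymous counts would need separate handling, which is consistent with the minimum-count requirement stated above.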

Detecting somatic mutations from whole exome-seq and RNAseq data

To capture variants in the coding region, we downloaded exome-seq data of 18 individual breast cancer and matched normal samples from the TCGA portal. For each read-alignment (BAM) file of normal and cancer, we called variants using the GATK (V. 3.6) ‘HaplotypeCaller’ (McKenna et al., 2010) utility with the same hg38 assembly that TCGA used for exome-seq mapping, applying the ‘-ERC GVCF’ mode to produce a comprehensive record of genotype likelihoods for every position in the genome regardless of whether a variant was detected at that site or not. The goal of using the GVCF mode was to capture a confidence score for every site represented in a paired normal and cancer cohort for calling somatic mutations in cancer. Next, we combined the paired GVCFs from each paired cohort using GATK’s ‘GenotypeGVCFs’ utility, yielding genotype likelihood scores for every variant in the cancer and the paired normal sample. Next, we used GATK’s ‘VariantRecalibrator’ utility with the dbSNP VCF file (v146: ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b146_GRCh38p2/VCF ) and the annotations QD, MQ, MQRankSum, ReadPosRankSum, FS and SOR, followed by GATK’s ‘ApplyRecalibration’ utility in ‘SNP’ mode. Next, using GATK’s ‘VariantFiltration’ utility, we selected the variants with VQSLOD ≥ 4.0. Finally, somatic mutations were defined as the loci whose genotype (1/1, 0/1, or 0/0 with ‘PL’ (Phred-scaled likelihood of the genotype) score = 0, i.e., highest confidence) in cancer is distinct from that in the paired normal. The final somatic mutations were mapped to exonic sites of a transcript with the ‘bcftools’ tool (V. 1.3) (Li et al., 2009) using a BED file of the coding region in the hg38 assembly.

To call variants in RNA, we downloaded BAM files of RNA-Seq data for the same normal and cancer cohorts as above. First, we used GATK’s ‘SplitNCigarReads’ utility to split the reads into exon segments and hard-clip any sequence overhanging into the intronic regions. Next, we used GATK’s ‘HaplotypeCaller’ utility with the same hg38 assembly that TCGA used for RNA-Seq mapping. To reduce false positive and false negative calls, we used the ‘dontUseSoftClippedBases’ argument with ‘HaplotypeCaller’, with the minimum phred-scaled confidence threshold for calling variants set to 20. We then filtered the variants using the ‘VariantFiltration’ utility based on Fisher Strand values (FS > 30) and Qual By Depth values (QD < 2.0). Finally, we used each of the output VCF files to annotate the coding regions of the transcripts to which the variants were mapped, using ‘bcftools’ with a BED file of the coding region in the hg38 assembly (see Data and Software Availability ). Based on this data, we compared PTMB in UC dysregulated (UCD-score in the top 45%) versus UC intact (UCD-score in the bottom 45%) samples using a Wilcoxon rank sum test. The same procedure was applied to identify the frequency of transversion mutations in cell lines with UC dysregulation, followed by a Fisher’s exact test of R->Y versus Y->R mutations, with their expected frequencies from CCLE (Barretina et al., 2012) mutation data as background (file ‘CCLE_hybrid_capture1650_hg19_NoCommonSNPs_NoNeutralVariants_CDS_2012.05.07.maf’, n = 905). The same pipeline was used to analyze the exome-seq and RNAseq data from cell line experiments with the human reference genome hg19.
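The somatic-mutation definition used for the exome data (a confident genotype in cancer that differs from the paired normal) can be sketched as below. This is an interpretation, not the authors' code: the `Call` type and function names are hypothetical, sites are assumed biallelic with PL values ordered 0/0, 0/1, 1/1 as in the VCF convention, and requiring PL = 0 for both the normal and the cancer call is an assumption.

```python
from typing import NamedTuple, Tuple


class Call(NamedTuple):
    genotype: str             # "0/0", "0/1" or "1/1"
    pl: Tuple[int, int, int]  # Phred-scaled likelihoods for 0/0, 0/1, 1/1


GT_INDEX = {"0/0": 0, "0/1": 1, "1/1": 2}


def is_confident(call: Call) -> bool:
    """True if the called genotype carries PL = 0 (the most likely genotype)."""
    return call.genotype in GT_INDEX and call.pl[GT_INDEX[call.genotype]] == 0


def is_somatic(normal: Call, tumor: Call) -> bool:
    """A locus is somatic if both calls are confident and the genotypes differ."""
    return (is_confident(normal) and is_confident(tumor)
            and normal.genotype != tumor.genotype)
```

For example, a site that is 0/0 in the normal sample and 0/1 in the tumor, each with PL = 0 for its called genotype, would be reported as somatic, while a site with identical genotypes would not.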

Detecting somatic mutations from proteomics data

To map the DNA variants to protein sequence, we downloaded peptide-spectrum match (PSM) data for 40 breast cancer samples, of which only 4 samples overlapped with the samples analyzed for DNA mutation calls above. For each transcript in the somatic variant VCF file, we constructed the complete coding sequence of the RNA using GATK’s ‘FastaAlternateReferenceMaker’ utility. On this variant-incorporated coding sequence, we identified the codon affected by the variant site and translated it in silico into an amino acid; if the translated amino acid differs from the reference amino acid, we call the change ‘non-synonymous’, and otherwise ‘synonymous’ (see Data and Software Availability ). Based on this data, we compared the overall R->Y mutation-mapped amino acid changes in UC dysregulated (UCD-score in the top 45%) versus UC intact (UCD-score in the bottom 45%) samples using a Wilcoxon rank sum test.
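The in silico translation step can be sketched in a few lines. This is an illustration under assumptions: `classify_variant` is a hypothetical helper, the coding sequence is assumed to be in frame and on the sense strand, positions are 0-based, and the standard genetic code is used.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_variant(coding_seq: str, pos: int, alt: str) -> str:
    """Label a single-base substitution at 0-based position pos.

    Returns 'synonymous' if the affected codon still encodes the same
    amino acid after the substitution, otherwise 'non-synonymous'.
    """
    offset = pos % 3
    start = pos - offset
    ref_codon = coding_seq[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "non-synonymous"
```

For the toy sequence `ATGGCC` (Met-Ala), changing the last base to T gives GCT, which still encodes alanine, whereas changing the fourth base to C gives CCC (proline).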

Dynamic progression of PTMB after inducing UC-dysregulation

First, UCD-score and PTMB levels were calculated from the gene expression and mutation profiles of a mouse xenograft study in which the evolutionary history of a mouse xenograft model of HRAS-mutated MCF10A was followed at eight different time points (Chen et al., 2015). Since UCD levels increase initially (the value at time point 1 is significantly lower than the UCD values at the other time points) and then stay at similar levels throughout, we checked whether the UCD-score at time point 1 lies far off from the distribution of UCD-scores at all other time points (p < 0.016) and performed a correlation analysis between time course and PTMB levels. Second, we knocked down ASS1 in U2OS osteosarcoma cells using shASS1 and overexpressed citrin using an overexpression vector in LOX melanoma cell lines. We sequenced the genome at two time points (1 week and 2 weeks following UCD induction). Dragen was used to carry out variant calling for these samples. Almost all of the called variants (> 99%) are biallelic. Dragen filters the variants based on hard filter criteria. For SNPs, the filter criterion was QD < 2.0 || MQ < 30.0 || FS > 60.0 || MQRankSum < −12.5 || ReadPosRankSum < −8.0. For indels, the filter criterion was QD < 2.0 || ReadPosRankSum < −20.0 || FS > 200.0. After filtering based on the above criteria, VQSR was performed on the variants. We then compared the PTMB levels of the two time points, where empirical p values were calculated by bootstrapping 1,000 times.
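The hard filter criteria quoted above translate directly into boolean predicates. The sketch below is only an illustration: the function names are hypothetical, the annotations are assumed to be available as a dictionary of floats for each variant, and every listed annotation is assumed to be present.

```python
from typing import Dict


def fails_snp_hard_filter(ann: Dict[str, float]) -> bool:
    """True if a SNP matches the filter expression, i.e. is filtered out.

    QD < 2.0 || MQ < 30.0 || FS > 60.0 || MQRankSum < -12.5
    || ReadPosRankSum < -8.0
    """
    return (ann["QD"] < 2.0
            or ann["MQ"] < 30.0
            or ann["FS"] > 60.0
            or ann["MQRankSum"] < -12.5
            or ann["ReadPosRankSum"] < -8.0)


def fails_indel_hard_filter(ann: Dict[str, float]) -> bool:
    """True if an indel matches QD < 2.0 || ReadPosRankSum < -20.0 || FS > 200.0."""
    return (ann["QD"] < 2.0
            or ann["ReadPosRankSum"] < -20.0
            or ann["FS"] > 200.0)
```

Because the expression is a disjunction, a variant is removed as soon as any single annotation crosses its threshold. Real variant records can lack some annotations, and how a caller treats those cases is not covered here.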

Potential mechanism of PTMB

To understand the potential mechanism by which pyrimidine-rich nucleotide imbalance induces PTMB, we investigated whether PTMB correlates with gene expression when UC enzymes are dysregulated, both through causal cell line experiments and by mining TCGA patient data. To this end, we induced UCD in 4 different cancer cell lines by perturbing UC enzymes (in the same manner as described in Figure 4 B). After the induction of UCD, the perturbed cell lines showed a significant correlation between PTMB and gene expression ( Figure 5 C), whereas the UC-unperturbed cell lines did not. Furthermore, we checked whether PTMB correlates with the corresponding expression levels across TCGA UC-dysregulated samples at both the DNA and RNA level; here we binned the genes based on their expression levels, and the PTMB and median expression of each bin were used for the correlation analysis ( Figures S5 F and S5G).

In summary, our analysis indicates that the pyrimidine-rich nucleotide imbalance (i) preferentially induces mutations on the DNA sense strand, (ii) biases mutation patterns from purine to pyrimidine, resulting in the PTMB signature, and (iii) produces a mutational bias that correlates with transcription levels. This cannot be explained via transcription-coupled repair, where the expression level is expected to anti-correlate with mutation rate. The closest explanation found in the literature is transcription-coupled damage, where the sense strand, which is left unprotected during transcription, accumulates mutations (Haradhvala et al., 2016). While the precise mechanism underlying PTMB requires further elucidation, a hypothetical model can be constructed based on current knowledge.

Genes undergoing high expression have been shown to be prone to mutagenic events (Jinks-Robertson and Bhagwat, 2014). This characteristic (so-called transcription-associated mutagenesis) likely stems from the fact that actively transcribed regions are more susceptible to hydrolytic decay or reactions with endogenous chemical species, such as oxygen radicals, due to the single-stranded nature of the transcription bubble. The sense strand of DNA is particularly at risk for damage accumulation, as it is unprotected by the nascent RNA transcript that hybridizes to the anti-sense (template) strand, by the transcriptional apparatus, or by transcription-coupled repair, which preferentially removes blocking lesions from the transcribed strand. While this feature is consistent with the mutagenic bias we observe (a) in regions of high gene expression and (b) on the sense strand, it does not by itself explain PTMB in the context of an imbalanced nucleotide pool.

To explain how PTMB arises on this background, we further hypothesize that the increased level of DNA damage observed in highly transcribed regions is processed for repair in an error-prone manner affected by the nucleotide imbalance. That is, presuming that most of the DNA damage is generated through endogenous processes, such as spontaneous hydrolysis or oxidative reactions, the resulting modifications (i.e., uracil, abasic sites, 8-oxoguanine, cyclopurines, etc.) would call into action either base excision or nucleotide excision repair mechanisms (Berquist and Wilson, 2012). Both repair systems rely on a re-synthesis step after damage removal, which, due to an imbalance in the nucleotide pool, would be more prone to nucleotide mis-incorporation. The consequent mispairs might evade additional repair responses, namely mismatch repair, due to the robust replicative nature of cancer cells, although mismatch repair also relies on a re-synthesis step that would presumably be prone to incorporation errors as well. Thus, in short, increased transcription would lead to increased DNA damage, which would be processed incorrectly due to error-prone DNA repair events stemming from nucleotide pool imbalance, eventually leading to mutagenesis within actively transcribed genes on the sense strand following chromosome duplication.

Genome-scale metabolic network modeling

We used genome-scale metabolic modeling to study the stoichiometric balance of nitrogen metabolism between urea production and pyrimidine synthesis. For a metabolic network with m metabolites and n reactions, the stoichiometric constraints can be represented by a stoichiometric matrix S,

$$\sum_{j} S_{ij} v_{j} = 0, \tag{5}$$

where the entry S_{ij} represents the stoichiometric coefficient of metabolite i in reaction j, and v_{j} is the j-th entry of the metabolic flux vector v over all reactions in the model. The model assumes a steady metabolic state as represented in Equation 5 , constraining the production rate of each metabolite to be equal to its consumption rate. In addition to the mass balance, a constraint-based model limits the space of possible fluxes in the metabolic network’s reactions through a set of (in)equalities imposed by thermodynamic constraints, substrate availability, and the maximum reaction rates supported by the catalyzing enzymes and transporting proteins,

$$\alpha_{j} \le v_{j} \le \beta_{j}, \tag{6}$$

where α_{j} and β_{j} define the lower and upper bounds of the metabolic fluxes for different types of metabolic fluxes. (i) The exchange fluxes model the metabolite exchange of a cell with the surrounding environment via transport reactions, enabling a pre-defined set of metabolites to be either taken up or secreted from the growth media. (ii) Enzymatic directionality and flux capacity constraints define lower and upper bounds on the fluxes as represented in Equation 6 . We used a human metabolic network model (Duarte et al., 2007) with the biomass function introduced in Folger et al. (2011) under Roswell Park Memorial Institute (RPMI) medium.

To study the metabolic alterations occurring in UC dysregulated cancer cells (having increased growth and biomass production rates, and increased CAD activity versus healthy cells), we performed a flux-balance-based analysis (Orth et al., 2010). We computed the nitrogen utilization by subtracting the total amount of nitrogen excreted from the amount of nitrogen taken up, while taking into account the nitrogen stoichiometry of all nitrogen-containing metabolites. We gradually increased the demand constraints for biomass production rates and for the flux via the three enzymatic reactions of CAD (Carbamoyl-phosphate synthetase 2 (CPS2), Aspartate transcarbamylase (ATC), and Dihydroorotase) up to their maximal feasible values in the model.
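To make Equations 5 and 6 and the nitrogen bookkeeping concrete, here is a small sketch on a toy network. It does not perform the optimization step of flux balance analysis; it only checks whether a given flux vector is feasible and computes nitrogen utilization from exchange fluxes. The function names, the toy matrix, and the sign convention (negative exchange flux means uptake) are assumptions made for illustration.

```python
from typing import Dict, Sequence


def satisfies_constraints(S: Sequence[Sequence[float]], v: Sequence[float],
                          lower: Sequence[float], upper: Sequence[float],
                          tol: float = 1e-9) -> bool:
    """Check steady state (Equation 5) and flux bounds (Equation 6)."""
    balanced = all(abs(sum(s_ij * v_j for s_ij, v_j in zip(row, v))) <= tol
                   for row in S)
    bounded = all(lo - tol <= v_j <= hi + tol
                  for lo, v_j, hi in zip(lower, v, upper))
    return balanced and bounded


def nitrogen_utilization(exchange_flux: Dict[str, float],
                         n_atoms: Dict[str, int]) -> float:
    """Nitrogen taken up minus nitrogen excreted.

    exchange_flux: negative values are uptake and positive values are
    secretion (assumed convention). n_atoms: nitrogen atoms per molecule.
    """
    uptake = sum(-flux * n_atoms[m] for m, flux in exchange_flux.items() if flux < 0)
    excreted = sum(flux * n_atoms[m] for m, flux in exchange_flux.items() if flux > 0)
    return uptake - excreted


# Toy chain: uptake -> A, A -> B, B -> secretion (rows: metabolites A and B).
S_TOY = [[1, -1, 0],
         [0, 1, -1]]
```

In the toy chain, any flux vector with three equal entries inside the bounds is feasible. For the nitrogen balance, taking up 3 units of glutamine (2 nitrogen atoms) while secreting 1 unit of urea (2 nitrogen atoms) leaves 4 units of nitrogen utilized by the cell.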

Hydrophobic amino acid changes expected by purine to pyrimidine mutations

Using the codon table, we considered the 4 mutation types with respect to R/Y (i.e., R->Y, R->R, Y->Y, Y->R) and the 4 types of amino acid (AA) changes with respect to hydrophobicity (N->H, H->H, N->N, H->N), where H denotes a hydrophobic AA and N denotes a non-hydrophobic AA. We counted each of the cases (4 by 4 contingency table) at the first, second, and third loci of each codon. The resulting (4 by 12) table is presented as Table S8 . We calculated the enrichment of R->Y mutations in N->H AA changes using a Fisher’s exact test.
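The enumeration described above can be sketched as follows. Several choices here are assumptions rather than the authors' method: the set of residues treated as hydrophobic (`HYDROPHOBIC`) is not given in the text, substitutions to or from stop codons are skipped, and the one-sided Fisher's exact test is written out by hand from the hypergeometric distribution. All names are hypothetical.

```python
from itertools import product
from math import comb
from typing import Dict, Tuple

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
PURINES = set("AG")
HYDROPHOBIC = set("AVILMFW")  # assumed set; the source does not list it


def base_class(base: str) -> str:
    return "R" if base in PURINES else "Y"


def aa_class(aa: str) -> str:
    return "H" if aa in HYDROPHOBIC else "N"


def contingency() -> Dict[Tuple[int, str, str], int]:
    """Count (codon position, mutation type, AA change) over all single-base
    substitutions between sense codons."""
    counts: Dict[Tuple[int, str, str], int] = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        for pos, alt in product(range(3), BASES):
            if alt == codon[pos]:
                continue
            new_aa = CODON_TABLE[codon[:pos] + alt + codon[pos + 1:]]
            if new_aa == "*":
                continue
            key = (pos + 1,
                   f"{base_class(codon[pos])}->{base_class(alt)}",
                   f"{aa_class(aa)}->{aa_class(new_aa)}")
            counts[key] = counts.get(key, 0) + 1
    return counts


def fisher_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test, P(X >= a), for the table [[a, b], [c, d]]."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)
```

To test the enrichment, the counts would be collapsed into a 2 by 2 table (R->Y versus other mutation types, N->H versus other AA changes) and passed to `fisher_greater`. The result depends on the hydrophobic set chosen, so it need not reproduce Table S8.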

Production and purification of membrane HLA molecules

Cell line pellets were collected from 2 × 10^8 cells. The solution was mixed by gentle rotation in the cold for one hour with lysis buffer containing 0.25% sodium deoxycholate, 0.2 mM iodoacetamide, 1 mM EDTA, 1:300 Protease Inhibitors Cocktail (Sigma-Aldrich, P8340), 1 mM PMSF and 1% octyl-β-D-glucopyranoside in PBS. Samples were then incubated at 4°C for 1 hour. The lysates were cleared by centrifugation at 48,000 g for 60 minutes at 4°C, and then were passed through a pre-clearing column containing Protein-A Sepharose beads.

HLA-I molecules were immunoaffinity purified from the cleared lysate with the pan-HLA-I antibody (W6/32 antibody purified from HB95 hybridoma cells) covalently bound to Protein-A Sepharose beads. The affinity column was washed first with 10 column volumes of 400 mM NaCl, 20 mM Tris–HCl and then with 10 volumes of 20 mM Tris–HCl, pH 8.0. The HLA peptides and HLA molecules were then eluted with 1% trifluoroacetic acid, followed by separation of the peptides from the proteins by binding the eluted fraction to disposable reversed-phase C18 columns (Harvard Apparatus). Elution of the peptides was done with 30% acetonitrile in 0.1% trifluoroacetic acid (Milner et al., 2013). The eluted peptides were then cleaned using C18 stage tips as described previously (Rappsilber et al., 2003).

Identification of eluted HLA peptides
The HLA peptides were dried by vacuum centrifugation, solubilized with 0.1% formic acid, and resolved by capillary reversed-phase chromatography on 0.075 × 300 mm laser-pulled capillaries, self-packed with C18 reversed-phase 3.5 μm beads (Reprosil-C18-Aqua, Dr. Maisch GmbH, Ammerbuch-Entringen, Germany) (Ishihama et al., 2002). Chromatography was performed with the UltiMate 3000 RSLCnano-capillary UHPLC system (Thermo Fisher Scientific), which was coupled by electrospray to tandem mass spectrometry on a Q-Exactive-Plus (Thermo Fisher Scientific). The HLA peptides were eluted with a linear gradient over 2 hours from 5 to 28% acetonitrile with 0.1% formic acid at a flow rate of 0.15 μl/min. Data were acquired using a data-dependent “top 10” method, fragmenting the peptides by higher-energy collisional dissociation. Full-scan MS spectra were acquired at a resolution of 70,000 at 200 m/z with a target value of 3 × 10^6 ions. Ions accumulated to an AGC target value of 10^5 with a maximum injection time of generally 100 ms. The peptide match option was set to Preferred. Normalized collision energy was set to 25% and MS/MS resolution was 17,500 at 200 m/z. Fragmented m/z values were dynamically excluded from further selection for 20 s. The MS data were analyzed using MaxQuant (Cox and Mann, 2008) version 1.5.3.8, with a 5% false discovery rate (FDR). Peptides were searched against the UniProt human database (July 2015) and customized reference databases that contained the mutated sequences identified in the sample by WES. N-terminal acetylation (42.010565 Da) and methionine oxidation (15.994915 Da) were set as variable modifications. Enzyme specificity was set as unspecific and the peptide FDR was set to 0.05. The match between runs option was enabled to allow matching of identifications across the samples belonging to the same patient.

HLA typing was determined from the WES data by POLYSOLVER version 1.0 (Shukla et al., 2015), and the HLA allele to which the identified peptides match was determined using NetMHCpan v4.0 (Hoof et al., 2009; Nielsen and Andreatta, 2016) (see Data and Software Availability). The abundance of the peptides was quantified by the MS/MS intensity values, after normalization with the summed intensity of both UC-perturbed and control cell lines. We compared the abundance and hydrophobicity (Janin, 1979) of the peptides between UCD cell lines and controls using the Wilcoxon rank sum test.
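For readers unfamiliar with the Wilcoxon rank sum test used in the comparison above, a minimal standard-library sketch follows. It uses average ranks for ties and a normal approximation without tie or continuity correction, which is an assumption of this example (statistical packages typically offer exact or corrected variants). The input values are hypothetical.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (U statistic of x, two-sided p value by normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    return u1, erfc(abs(z) / sqrt(2))


# Hypothetical per-peptide hydrophobicity scores for two groups.
ucd_scores = [0.61, 0.72, 0.55, 0.80, 0.67]
control_scores = [0.42, 0.58, 0.49, 0.51, 0.63]
u_stat, p_value = rank_sum_test(ucd_scores, control_scores)
```

The same function applies unchanged to normalized peptide intensities.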

Cell cultures
Patient fibroblast studies were performed on anonymized cells devoid of all identifiers. HepG2, SKOV3 and U2OS cell lines were purchased from the ATCC. LOX-IMVI was purchased from NCI60 (RRID:CVCL1381). OTC- and CPS1-deficient cell lines were purchased from the Coriell Institute for Medical Research. Cells were cultured using standard procedures in a 37°C humidified incubator with 5% CO2 in Dulbecco’s Modified Eagle’s Medium (DMEM) (Gibco) or RPMI 1640 medium (Gibco) supplemented with 10%–20% heat-inactivated fetal bovine serum, 10% pen-strep and 2 mM glutamine. The MC38 cell line, derived from mouse colon adenocarcinoma, was kindly provided by Dr. Eran Elinav, Department of Immunology, The Weizmann Institute of Science. MCF10A cells and their medium were kindly provided by Professor Yossi Yarden, Department of Biological Regulation, The Weizmann Institute of Science. All cells were tested routinely for Mycoplasma using the Mycoplasma EZ-PCR test kit (#20-700-20, Biological Industries, Kibbutz Beit Ha’emek).

Crystal violet staining
Cells were seeded in 6- or 12-well plates at 50,000-200,000 cells/well in triplicate. Time 0 was defined as the time at which the cells became adherent, about 10 hours after plating. For each time point, cells were washed with 1× PBS and fixed in 4% PFA (in PBS). Cells were then stained with 0.2% (MCF10A cells) or 0.5% (other cell lines) Crystal Violet (Catalog #: C0775, Sigma-Aldrich) for 20 minutes (1 ml per well) and washed with water. Cells were then incubated with 10% acetic acid for 20 minutes with shaking. The extract was then diluted 1:1-1:4 in water, and absorbance was measured at 595 nm 24-72 hours following time 0.

5FU and Rapamycin treatment
Survival analysis: LOX-IMVI, SKOV3, and HepG2 (clones F10 and G03/4) perturbed cancer cells were seeded in 6-well plates at 50,000-200,000 cells/well. The following day, cells were treated with 10 μM 5FU (F6627, Sigma-Aldrich). 5FU was renewed in the medium every day for 3 days. The survival rate of cells was quantified using Crystal Violet as described above.
Nucleotide synthesis: LOX-IMVI, SKOV3, and HepG2 (clone G03/4) perturbed cancer cells were seeded in 10 cm plates. The following day, cells were treated with 100 μM (SKOV3) to 200 μM (LOX-IMVI and HepG2) Rapamycin (R0395, Sigma-Aldrich) or 10-20 μM 5FU (F6627, Sigma-Aldrich). 5FU and Rapamycin were renewed in the medium every day for 3 days. Cells treated with Rapamycin or control (DMSO vehicle) were extracted and processed for western blot analysis as described below. Cells incubated with 5FU were extracted with methanol, and nucleotide synthesis levels were analyzed as described below.

Western blotting
Cells were lysed in RIPA (Sigma-Aldrich) with 1% protease inhibitor cocktail (Calbiochem) and 1% phosphatase inhibitor cocktail (P5726, Sigma-Aldrich). Following centrifugation, the supernatant was collected and protein content was evaluated by the Bradford assay or the BCA Protein Assay Kit (ThermoFisher Scientific, cat # 23225). 100 μg from each sample under reducing conditions were loaded into each lane and separated by electrophoresis on a 10% SDS polyacrylamide gel. Following electrophoresis, proteins were transferred to Cellulose Nitrate membranes (Tamar, Jerusalem, Israel). Nonspecific binding was blocked by incubation with TBST (10 mM Tris–HCl (pH 8.0), 150 mM NaCl, 0.1% Tween 20) containing 5% skim milk or 3% BSA (Sigma, A7906) for 1 h at room temperature. Membranes were subsequently incubated with antibodies against: p97 (1:10,000, PA5-22257, Thermo Scientific), GAPDH (1:1000, 14C10, #2118, Cell Signaling), CAD (1:1000, ab40800, Abcam and 1:500, Cell Signaling 11933), phospho-CAD (Ser1859) (1:1000, #12662, Cell Signaling), ASL (1:1000, ab97370, Abcam), Actin (1:1000, A5441, Sigma-Aldrich), OTC (1:1000, ab203859, Abcam), ASS1 (1:500, sc-99178, Santa Cruz), ORNT1 (1:500, NBP2-20387, Novusbio), Phospho-p70 S6 Kinase (Ser371) (1:1000, 9208, Cell Signaling), p70 S6 Kinase (1:500, Cell Signaling 9202). Antibodies were detected using peroxidase-conjugated AffiniPure goat anti-rabbit IgG or goat anti-mouse IgG (Jackson ImmunoResearch, West Grove, PA) and enhanced chemiluminescence western blotting detection reagents (EZ-Gel, Biological Industries). Gels were quantified by Gel Doc XR+ (BioRad) and analyzed by ImageLab 5.1 software (BioRad). The relative intensity of each band was calculated by dividing the specific band intensity by the value obtained from the loading control.

Immunohistochemistry
Four-micrometer paraffin-embedded tissue sections were de-paraffinized and rehydrated. Endogenous peroxidase was blocked with three percent H2O2 in methanol. For sections stained for ASL, ORNT1 (SLC25A15), ASS1, OTC and PCNA, antigen retrieval was performed in citric acid (pH 6) for 10 minutes or Tris-EDTA (pH 9), using a low boiling program in the microwave to break protein cross-links and unmask antigens. After pre-incubation with 20% normal horse serum and 0.2% Triton X-100 for 1 hour at RT and biotin blocking with the Avidin/Biotin Blocking Kit (SP-2001, Vector Laboratories, CA, USA), sections were incubated with the primary antibodies as follows: ASL (1:50 dilution, Abcam, ab97370, CA, USA); ORNT1 (1:200 dilution, NBP2-20387, Novus Biologicals, CO, USA); ASS1 (1:50 dilution, Abcam, ab124465, CA, USA); OTC (1:100 dilution, LSBio, LS-C31865, WA, USA); PCNA (1:100 dilution, Santa Cruz, CA, USA). All antibodies were diluted in PBS containing 2% normal horse serum and 0.2% Triton. Sections were incubated overnight at RT followed by 48 h at 4°C. Sections were washed three times in PBS and incubated with secondary biotinylated IgG at RT for 1.5 hours, washed three times in PBS, and incubated with avidin-biotin complex (Elite-ABC kit, Vector Lab) at RT for an additional 90 min, followed by the DAB (Sigma) reaction. Stained sections were examined and photographed with a bright-field microscope (E600, Tokyo, Japan) equipped with Plan Fluor objectives (10x) connected to a CCD camera (DS-Fi2, Nikon). Digital images were collected and analyzed using Image Pro+ software. Images were assembled using Adobe Photoshop (Adobe Systems, San Jose, CA).

Virus infection
Primary fibroblasts were infected with HCMV and harvested at different times after infection for ribosome footprints (deep sequencing of ribosome-protected mRNA fragments) as previously described (Tirosh et al., 2015). Briefly, we infected human foreskin fibroblasts (HFF) with the Merlin HCMV strain and harvested cells at 5, 12, 24 and 72 hours post infection. Cells were pre-treated with cycloheximide, and ribosome-protected fragments were then generated and sequenced. Bowtie v0.12.7 (allowing up to 2 mismatches) was used to perform the alignments. Reads with unique alignments were used to compute footprint densities in units of reads per kilobase per million (RPKM).

Cancer cells or MCF10A cells (kindly provided by the Yossi Yarden lab, Department of Biological Regulation, The Weizmann Institute of Science) were infected with pLKO-based lentiviral vectors with or without short hairpin RNA (shRNA) against human OTC, SLC25A15, ASS1, or GFP (Dharmacon). Infected cells were selected with 2-4 μg ml−1 puromycin.
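The RPKM unit used for the footprint densities in this section follows the standard definition: read count scaled by feature length in kilobases and by the total number of mapped reads in millions. A minimal sketch, with hypothetical numbers:

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical example: 500 uniquely aligned footprints on a 2,000 bp
# coding region in a library of 10 million uniquely mapped reads.
density = rpkm(500, 2000, 10_000_000)
```

Which reads count toward the total (for example, only uniquely aligned reads) is a choice of the analysis; the example assumes uniquely aligned reads throughout.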

RNA processing and quantitative PCR
RNA was extracted from cells using the RNeasy Mini Kit (QIAGEN, #74104). cDNA was synthesized from 1 μg RNA using the qScript cDNA Synthesis Kit (Quanta #95749). Detection of cDNAs was performed using either SYBR green PCR master mix (Thermo Fisher Scientific #4385612) or TaqMan Fast Advanced Master Mix (Thermo Fisher Scientific #4444557), with the required primers. Primer sequences are summarized in the STAR Methods and Key Resources Table.

Transient transfection
LOX-IMVI melanoma cells were seeded in 6-well plates at 70,000 cells/well, or in 12-well plates at 100,000 cells/well. The following day, cells were transfected with either 700 pmol or 350 pmol siRNA siGenome SMARTpool targeted to human SLC25A13 mRNA (M-007472-01, Dharmacon), respectively. Hepatocellular and ovarian carcinoma cells were seeded in 6-well plates at 10^6 or 70,000 cells/well, respectively, and transfected with 2-3 μg of the OTC (EXa3688-LV207, GeneCopoeia) or ORNT1 (EXu0560-LV207, GeneCopoeia) plasmids. Transfection was done with Lipofectamine® 2000 Reagent (#11668027, ThermoFisher Scientific) in the presence of Opti-MEM® I Reduced Serum Medium (#11058021, ThermoFisher Scientific). Four hours after transfection, the medium was replaced, and the experiments were performed 48-108 hours post transfection.

Overexpression
LOX-IMVI melanoma cells were transduced with a pLEX307-based lentiviral vector with or without the human SLC25A13 transcript, encoding citrin. Transduced cells were selected with 2 μg/ml puromycin.

Metabolomics analysis
Urea and uracil
Each cell line was seeded at 3-5 × 10^6 cells per 10 cm plate and, when confluent, washed with ice-cold saline, lysed with a mixture of 50% methanol in water containing 2 μg/mL ribitol as an internal standard, and quickly scraped, followed by three freeze-thaw cycles in liquid nitrogen. The insoluble material was pelleted in a cooled centrifuge (4°C) and the supernatant was collected for subsequent GC-MS analysis. Samples were dried under air flow at 42°C using a Techne Dry-Block Heater with sample concentrator (Bibby Scientific), and the dried samples were treated with 40 μl of a methoxyamine hydrochloride solution (20 mg ml−1 in pyridine) at 37°C for 90 min while shaking, followed by incubation with 70 μl N,O-bis(trimethylsilyl)trifluoroacetamide (Sigma) at 37°C for an additional 30 min.
Isotopic labeling
Hepatocellular and ovarian carcinoma cells were seeded in 10 cm plates and, once the cell confluence reached 80%, cells were incubated with 4 mM L-glutamine (alpha-15N, 98%, Cambridge Isotope Laboratories, Inc.) for 24 hours. Subsequently, cells were processed as described above. GC-MS analysis used a gas chromatograph (7820AN, Agilent Technologies) interfaced with a mass spectrometer (5975, Agilent Technologies). An HP-5ms capillary column, 30 m × 250 μm × 0.25 μm (19091S-433, Agilent Technologies), was used. Helium carrier gas was maintained at a constant flow rate of 1.0 mL min−1. The GC column temperature was programmed from 70 to 150°C via a ramp of 4°C min−1, 250-215°C via a ramp of 9°C min−1, 215-300°C via a ramp of 25°C min−1 and maintained at 300°C for an additional 5 min. The MS used electron impact ionization and was operated in full-scan mode from m/z = 30-500. The inlet and MS transfer line temperatures were maintained at 280°C, and the ion source temperature was 250°C. Sample injection (1-3 μl) was in splitless mode.

Nucleotides analysis
Materials
Ammonium acetate (Fisher Scientific) and ammonium bicarbonate (Fluka) of LC-MS grade were used. Sodium salts of AMP, CMP, GMP, TMP and UMP were obtained from Sigma-Aldrich. Acetonitrile of LC grade was supplied by Merck. Water with resistivity 18.2 MΩ was obtained using a Direct 3-Q UV system (Millipore).
Extract preparation
The obtained samples were concentrated in a speedvac to eliminate methanol, then lyophilized to dryness, re-suspended in 200 μl of water and purified on polymeric weak anion columns (Strata-XL-AW 100 μm, 30 mg ml−1, Phenomenex) as follows. Each column was conditioned by passing 1 mL of methanol, then 1 mL of formic acid/methanol/water (2/25/73), and equilibrated with 1 mL of water. The samples were loaded, and each column was washed with 1 mL of water and 1 mL of 50% methanol. The purified samples were eluted with 1 mL of ammonia/methanol/water (2/25/73) followed by 1 mL of ammonia/methanol/water (2/50/50) and then collected, concentrated in a speedvac to remove methanol, and lyophilized. Before LC-MS analysis, the obtained residues were re-dissolved in 100 μl of water and centrifuged for 5 min at 21,000 g to remove insoluble material.
LC-MS analysis
The LC-MS/MS instrument, consisting of an Acquity I-class UPLC system (Waters) and a Xevo TQ-S triple quadrupole mass spectrometer (Waters) equipped with an electrospray ion source and operated in positive ion mode, was used for analysis of nucleoside monophosphates. MassLynx and TargetLynx software (version 4.1, Waters) were applied for the acquisition and analysis of data.
Chromatographic separation was done on a 100 mm × 2.1 mm internal diameter, 1.8-μm UPLC HSS T3 column equipped with a 50 mm × 2.1 mm internal diameter, 1.8-μm UPLC HSS T3 pre-column (both Waters Acquity), with mobile phases A (10 mM ammonium acetate and 5 mM ammonium hydrocarbonate buffer, pH 7.65 adjusted with 10% acetic acid) and B (acetonitrile) at a flow rate of 0.3 mL min−1 and a column temperature of 25°C. The gradient was as follows: for 0-3 min the column was held at 0.2% B, then 3-3.5 min a linear increase to 100% B, 3.5-4.0 min held at 100% B, 4.0-4.5 min back to 0.2% B, and equilibration at 0.2% B for 2.5 min. Samples kept at 8°C were automatically injected in a volume of 3 μl. For mass spectrometry, argon was used as the collision gas with a flow of 0.10 mL min−1. The capillary voltage was set to 2.50 kV, source temperature 150°C, desolvation temperature 400°C, cone gas flow 150 l hr−1, desolvation gas flow 800 l hr−1.

Nucleotide concentration was calculated using a standard curve of the relevant nucleotide concentration in each sample. Standard curves included increasing concentrations of all measured nucleotides, ranging from 0-10 μg/ml, that were positioned at the beginning and at the end of each run. All the calculated values for the different nucleotides in each sample fell within the standard curve range. Analytes were detected in positive mode using multiple-reaction monitoring listed in Lee et al. (2014).
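Quantification against a standard curve, as described for the nucleotides above, can be sketched as follows. The example assumes a linear response fitted by ordinary least squares; the text does not state the curve model, and the standard concentrations and signals below are hypothetical.

```python
def fit_line(concentrations, signals):
    """Ordinary least-squares fit of signal = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, signals))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_signal(signal, slope, intercept, low, high):
    """Invert the standard curve and reject values outside its range."""
    concentration = (signal - intercept) / slope
    if not low <= concentration <= high:
        raise ValueError("value falls outside the standard curve range")
    return concentration


# Hypothetical standards spanning 0-10 ug/ml and their peak areas.
standards = [0.0, 2.5, 5.0, 10.0]
peak_areas = [10.0, 260.0, 510.0, 1010.0]
slope, intercept = fit_line(standards, peak_areas)
sample_conc = concentration_from_signal(410.0, slope, intercept, 0.0, 10.0)
```

Raising an error for out-of-range values mirrors the check that all calculated values fell within the standard curve range.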

Patient samples
All prostate patients’ urine samples were obtained upon informed consent and with evaluation and approval from the corresponding ethics committee (CEIC code OHEUN11-12 and OHEUN14-14) (Royo et al., 2016). Patients included in the study were diagnosed with prostate adenocarcinoma, and the criterion for inclusion was being scheduled for surgery as anticancer treatment. Samples were collected between 2012 and 2016.
Urea analysis: To estimate age-stratified urea background levels, we pooled data from X = 1,363,691 patients in the de-identified Clalit Health Services electronic health record (code 0194-17-COM2), and y = 100 pediatric cancer patients on their day of admission to the Pediatric Hemato-Oncology Department at Souraski Medical Center (code 0016-17). The median urea level was computed per sample per year, values were grouped by age, and the distribution was summarized in a boxplot. Cases were analyzed in a similar fashion. P values were estimated using a Mann-Whitney (MW) test performed after additional stratification by gender.
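The grouping step in the urea analysis can be sketched as below. This is one possible reading of "median per sample per year", taken here to mean one median per patient per calendar year, with the medians then collected by age; the record layout and values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def yearly_medians_by_age(records):
    """records: iterable of (patient_id, age_years, year, urea_value).

    Returns a dict mapping age -> list of per-patient, per-year medians,
    i.e. the values that would feed one box of an age-stratified boxplot.
    """
    per_patient_year = defaultdict(list)
    for patient_id, age, year, urea in records:
        per_patient_year[(patient_id, age, year)].append(urea)
    by_age = defaultdict(list)
    for (_, age, _), values in per_patient_year.items():
        by_age[age].append(median(values))
    return dict(by_age)


# Hypothetical de-identified measurements.
records = [
    ("p1", 5, 2015, 20.0), ("p1", 5, 2015, 30.0), ("p1", 5, 2015, 40.0),
    ("p2", 5, 2015, 10.0),
    ("p3", 6, 2016, 25.0), ("p3", 6, 2016, 35.0),
]
grouped = yearly_medians_by_age(records)
```

Stratifying further by gender would only require adding that field to the grouping key before comparing cases with the background distribution.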

Extraction of polar metabolites from urine and plasma
To extract polar metabolites from urine (20-100 μL) and plasma (100 μL) samples, 1 and 0.9 mL methanol (with labeled amino acids as internal standard) were added, respectively, to the Eppendorf tube containing the biological sample. The resulting mixture was then vortexed, sonicated for 15 min, vortexed again, and centrifuged at 14000 rpm for 10 min. The liquid phase was transferred into a new tube and lyophilized. The pellets were then dissolved in 150 μL DDW-methanol (1:1), centrifuged twice to remove possible precipitants, and injected into the LC-MS system.
LC-MS polar metabolites analysis
Metabolic profiling of the polar phase was done as described by Zheng et al. (2015), with minor modifications described below. Briefly, analysis was performed using an Acquity I class UPLC System combined with a mass spectrometer (Thermo Exactive Plus Orbitrap) operated in negative ionization mode. The LC separation was done using the SeQuant Zic-pHilic (150 mm × 2.1 mm) with the SeQuant guard column (20 mm × 2.1 mm) (Merck). Mobile phase A was acetonitrile and mobile phase B was 20 mM ammonium carbonate plus 0.1% ammonium hydroxide in water. The flow rate was kept at 200 μl min−1 and the gradient was as follows: 0-2 min 75% of B, 17 min 12.5% of B, 17.1 min 25% of B, 19 min 25% of B, 19.1 min 75% of B, 19 min 75% of B.
Polar metabolites data analysis
Data processing was done using TraceFinder software (Thermo Fisher), in which detected compounds were identified by retention time and fragments and verified using an in-house mass spectra library. Urine metabolites were normalized by creatinine peak area.
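Creatinine normalization of urine metabolites amounts to dividing each metabolite peak area by the creatinine peak area of the same sample. A minimal sketch, with hypothetical metabolite names and peak areas:

```python
def normalize_to_creatinine(peak_areas, creatinine_key="creatinine"):
    """Divide every metabolite peak area by the sample's creatinine peak area.

    peak_areas maps metabolite name -> peak area for one urine sample.
    The creatinine entry itself is dropped from the result.
    """
    creatinine = peak_areas[creatinine_key]
    if creatinine <= 0:
        raise ValueError("creatinine peak area must be positive")
    return {
        name: area / creatinine
        for name, area in peak_areas.items()
        if name != creatinine_key
    }


# Hypothetical peak areas for one urine sample.
sample = {"creatinine": 2.0e6, "urea": 5.0e6, "uracil": 1.0e5}
normalized = normalize_to_creatinine(sample)
```

The ratios are unitless and are only comparable between samples processed with the same method.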