Selection of long-lived strains and life-extending interventions

Several mutations leading to exceptional longevity of C. elegans have been identified1,14,15,16,17,18,19 and studied extensively for their remarkable elevations of both lifespan and stress resistance1,2. We focused on the most long-lived isogenic C. elegans strains, carrying mutations in a long-lived wild-type (Bristol-N2 DRM) background: daf-2(e1370) [strain SR806], age-1(mg44) [SR808, at the first and second generations of homozygosity], and the longest-lived daf-2(e1391); daf-12(m20) double mutant [strain DR1694]. The average lifespans in the series range from twofold to nearly tenfold longer than that of the wild type.

To assess whether the same transcriptomic signature reflects both genetic and later-onset epigenetic life extension in adult nematodes, we chose five target genes for RNAi inactivation; daf-4, che-3, cyc-1, cco-1, and eat-4. Mutation or RNAi disruption of most of them had been reported to prolong life-spans: +40–120% by daf-4(e1364)14; +37% by che-3(e1124) and +100% by che-3(p801)15; +87%20, +60–100%21 and +110%22 by cyc-1 RNAi; +61%20 and +57–80% by cco-1 RNAi21; whereas the effect of eat-4 RNAi has not been quantified in the context of aging yet.

We confirmed the longevity of worm strains subjected to these five RNAi interventions at 20 °C (see Table 1 and Fig. 1). The extensions of mean lifespan ranged from 5–16%, often well below those previously reported, which was likely due to the different mode of action (mutation or RNAi treatment), or variation in the study protocols, including RNAi-efficacy, exposure times and/or culture temperatures. Neuronal resistance to RNAi may also contribute, and could also impinge on non-neuronal tissues through an inter-tissue feedback loop23,24,25. Current and previous lifespan data are summarized in Table 2. The most substantial relative effect obtained by RNAi corresponds to an increase in mean lifespan from 20.1 days to 23.3 days (+16.2%) by RNAi of che-3.

Table 1 Summary of survivals for the mutant strains and the confirmatory RNAi interventions. Full size table

Figure 1 Confirmatory survivals for five groups of N2 wild-type worms treated with life-prolonging RNAi or bacteria carrying only the empty feeding vector (FV) without an RNAi insert. Markers and lines are colored according to the mean lifespan of each group (see Table 1 for the summary of survivals). Full size image

Table 2 Summary of survivals for the mutant strains and the confirmatory RNAi interventions. Full size table

Age-dependent transcriptomes of long-lived C. elegans strains

The age-dependent RNA-seq experimental dataset consists of the four mutant groups (daf-2(e1370), age-1(mg44) [at the first and second generations of homozygosity], and daf-2(e1391); daf-12(m20) double mutant), the three examples of life-extending RNAi (daf-4, che-3 and cyc-1, representing knockdown of diverse pathways) and two independent control runs represented by C. elegans wild-type (Bristol-N2, strain DRM) from Table 1; 60 transcriptomes in total (see Methods for RNA-seq data processing details).

We started by performing an exploratory analysis of all the gene-expression data with the help of principal component analysis (PCA). The first principal component (PC1), along which the variance of the data is maximal, is the only component significantly correlated with age (r = 0.75, p < 10−10, accounting for r2 = 56% of total variance). PC1 simultaneously arranges the mutants, Fig. 2(a), and the RNAi treated strains, Fig. 2(b), according to their respective values of chronological age. The total amplitude of change from youngest to oldest ages is approximately the same for all 9 groups despite their wide range of longevities.

Figure 2 Principal components analysis (PCA) of the experimental RNA-seq datasets for (a) four long-lived mutants and C. elegans wild-type (Bristol-N2, strain DRM), and (b) three life-prolonging RNAi-treated groups and C. elegans FV controls (fed bacteria harboring empty feeding-vector plasmid). The marker type denotes strains and RNAi groups (see Table 1 for lifespans). The markers in (a,b) are colored according to age rescaled to the average lifespan in the groups: 0 and 1 correspond to the L4/adult molt and the mean adult lifespan respectively (see the color bar). Full size image

Our gene expression data suggest that the aging signature, i.e., the set of genes transcriptionally associated with age, is robust and remains consistent under a reasonably broad range of experimental conditions and genetic or epigenetic interventions. No other principal component score has a statistically significant correlation with age after correction for multiple comparisons. It is therefore plausible to assume that the state of the organism concerning development and aging can be characterized by a single number, such as PC1 score, indicating normalized biological age. This, however, by no means implies that the system state dynamics along the first principal component alone can explain all transcript-level variation caused by altered biology, mutations or gene silencing. Therefore, the positions of the strain-representing markers in Fig. 2(a,b) can fluctuate along PC1 or in orthogonal directions.

Meta-analysis of aging dynamics in C. elegans transcriptomes

Small-scale experiments, including ours, yield a massive number of transcript levels, measured in a relatively small number of samples. The accurate PCA inference of the aging signature and the biomarker of age is challenging due to the lack of consistency of PCA in high dimensions26,27. The procedure is only appropriate for exploratory purposes since it is prone to over-fitting and false-positive errors even if all of the samples are collected in the same laboratory under the same conditions. A direct comparison of gene expression data obtained in different laboratories is further complicated by divergence in experimental procedures, leading to uncontrollable batch effects requiring extensive normalization28,29,30.

To address this hurdle, we proposed that a robust transcriptomic biomarker of age could also be obtained from a sufficiently large collection of publicly deposited “shallow” datasets from small, independent experiments (a dozen samples each, on average) since all the transcriptomes would have to share the same essential biology of aging. In contrast, the specific experimental conditions and uncontrollable batch differences should be mostly uncorrelated, and thus would comprise “noise” in a combined dataset of sufficient overall size.

To investigate such a possibility, we compiled a comprehensive transcriptomic collection for C. elegans by combining almost all publicly available gene-expression data for aging cohorts into a single database. The resulting “MetaWorm” dataset contains, in total, more than 400 different transcriptomic experiments with N = 3724 nematode samples characterized by G = 4861 of the most commonly expressed/detected genes in the samples (see Methods for more detail on the composition of the dataset and its normalization).

The gene-expression variance in the combined MetaWorm dataset is dominated by batch effects, and hence we do not expect PCA to reveal aging in association with the first principal component in an entirely unsupervised way. Instead, we attempted to identify the aging signature by testing many individual genes for differential expression during aging. Our MetaWorm dataset is sufficiently large to generate the cross-validation ensemble of single-gene association tests using exhaustive random resampling. We further reduced the number of candidate genes by thresholding the transcripts based on the frequency of significant associations in the resampling; we estimate that the chosen cross-validation threshold corresponds to p < 10−6 uncorrected, or p < 0.005 after Bonferroni correction. The final list of genes robustly associated with aging in the MetaWorm study consists of 327 genes (7% of all genes in MetaWorm): 260 up- and 67 down-regulated with age. We suggest using this gene set as the transcriptomic signature of aging. It is noteworthy that approximately 4000 out of 4861 genes never showed a significant association with aging during the resampling (see Electronic Supplementary Materials for the full list of genes implicated in aging in our calculations).

The transcriptomic signature of age may not be exhaustive, and yet by design, it was reproducible across independent experiments and hence should be useful for future C. elegans aging studies. In our experimental RNA-seq dataset, for example, 902 genes are significantly associated with age rescaled by lifespan (for the same threshold as for MetaWorm, p < 0.005 after Bonferroni correction) out of 4861 genes most commonly detected in the MetaWorm samples. Even though the selection using Bonferroni correction is conservative, the list of significant gene-associations in our dataset is larger than for MetaWorm. The “extra” genes in our list may reflect an association with any laboratory-specific external parameter changing monotonically in time; meta-analysis of data obtained under different external conditions in different laboratories would strongly suppress such age-associations, thus leaving a smaller number of significant hits. There may also be strong correlations among genes governed by a transcription factor whose abundance varies with time within any one laboratory, but which may be lost from the MetaWorm database due to inter-laboratory variation in its induction. The prominence of transcription factors among the genes that are age-dependent would inevitably lead to a gene-set with high internal cross-correlation, and a far higher-than-expected fraction of age-associated genes. The MetaWorm list of 327 candidate is congruent to the list of 902 genes from our experimental dataset, since the Mann-Whitney U test shows that 327 MetaWorm candidate genes are significantly enriched with the genes having the most significant correlation with age rescaled by lifespan among 902 most significant ones in our RNA-seq data. The corresponding area-under-curve (AUC) statistic for the receiver operating characteristic (ROC) curve is AUC = 0.610 ± 0.015, at the significance level p < 10−30. This implies that the MetaWorm set of “aging signature” genes very likely includes the same genes that determined PC1 in our RNAseq data, among many more co-varying (and hence partially redundant) genes with age-dependent expression.

A correlation with age does not necessarily imply a causal relation to aging, yet genes correlated with age are usually the primary target in aging studies. As a first approach to inference of the regulators of aging, we checked whether the transcriptomic signature of aging is enriched for the targets of known gene-expression regulators (see Table 3). We used four databases for the enrichment analyses: a curated database for transcription factors and RNA-binding proteins from published high-throughput expression studies in C. elegans, WormExp31; a high-quality protein-DNA interaction network32; and two databases of miRNA-target interactions: the in silico predicted TargetScan33 and the experimentally validated MirTarBase34. Enrichment analysis of the list detected ten hits already experimentally characterised as regulators of aging: DAF-1635, ELT-236, ELT-637, PMK-138,39, PQM-140, NHR-1, NHR-10, NHR-8641, let-742,43, and miR-6044.

Table 3 Transcription factors, RNA-binding proteins and miRNAs whose targets are significantly enriched in the list of potential transcriptomic biomarkers of age (327 genes). Full size table

Universal transcriptomic biomarker of age

The robust nature of the gene-expression signature of age across widely varying experimental conditions suggests that at any given time, the organism’s aging state can be characterized by a single number representing biological age. A typical approach for biological age modeling relies on linear regression of measured parameters, gene expression in our case, on chronological age. Naturally, due to the high internal cross-correlation among the gene expression levels and a limited number of samples, the multivariate regression problem is ill-defined, and any number of convenient biomarkers of age could be obtained by applying additional constraints to the regression problem. In this work we impose a requirement of sparsity on the transcriptomic biomarker of age, i.e., the number of genes used in the biomarker model should be minimized while preserving its predictive power. This is possible to do by performing a cross-validated lasso regression of gene expression in the MetaWorm dataset on age rescaled by lifespan in the transcriptomic signature of aging comprising 327 genes. To ensure that the obtained transcriptomic biological age model is not over-fitted and hence retains its predictive power, we have not used our experimental RNA-seq data during training. The final version of the sparse transcriptomic biological age predictor comprises the contributions of only 71 genes (see Electronic Supplementary Materials for the list of regression weights).

Simultaneous temporal scaling of survival curves and aging trajectories

The biological age predictor can now be used to transform our multi-dimensional experimental RNA-seq data representing every sample as a single number, the biological age. In Fig. 3(a,b) we plotted the aging trajectories (the dependence of the biological age on chronological age) and the survival plots. We choose to plot the age-dependent quantities not as a function of age, but as a function of age rescaled by lifespan, in contrast to Fig. 4(a,b) where the same data are plotted as a function of age without rescaling. The scaling transformation works exceptionally well and simultaneously brings together the survival curves and the aging trajectories of animals with drastically different average adult lifespans: from 17 days for wild-type N2-DRM control worms to 160 days for age-1(mg44) mutants (the Pearson correlation of the biological age with age rescaled by lifespan is r = 0.86 (p = 2 · 10−18), cf. r = 0.54 (p = 8 · 10−6) for the correlation of the biological age with chronological age (see Fig. 4(a)).

Figure 3 The temporal scaling of survival curves and aging trajectories. (a) The aging trajectories as a function of age rescaled by lifespan for the in-house collection of gene-expression data for four long-lived mutants and C. elegans wild-type (Bristol-N2, stock DRM), and three groups of N2 wild-type worms treated with life-prolonging RNAi or bacteria carrying only the empty feeding vector (FV) without an RNAi insert. The dashed line is a guide-to-eye. (b) The survival curves from (a) as a function of age rescaled by lifespan. In both panels, all markers are colored according to the lifespans of the strains (see Table 1 for the lifespans): red for small and green for large lifespans. Overall, the scaling spans almost tenfold variation in median adult lifespan, ranging from 17 to 160 days. Full size image

Figure 4 The survival curves and the aging trajectories as a function of age with no rescaling. (a) The aging trajectories as a function of age for the in-house collection of gene-expression data for four long-lived mutants and C. elegans wild-type (Bristol-N2, stock DRM), and three groups of N2 wild-type worms treated with life-prolonging RNAi or bacteria carrying only the empty feeding vector (FV) without an RNAi insert. (b) The corresponding survival curves from (a) as a function of age. In both panels, all markers are colored according to the lifespans of the strains (see Table 1 for the lifespans): red for small and green for large lifespans. Overall, the adult lifespan ranges from 17 to 160 days. Full size image

It is worth noting that the biological age naively inferred from the small dataset as a first principal component score from Fig. 2(a,b) is not sufficiently accurate to reveal the temporal scaling of the aging trajectories in the same experiment.

Mortality deceleration

The scaling property of the aging trajectories and the survival functions can be naturally explained using the “aging-at-criticality” model, providing a coarse-grained description of the biological age variable and gene expression dynamics with the help of a simple stochastic Langevin equation and allowing an analytic solution for mortality and the survival fraction12.

Early in life, up to approximately the average lifespan, mortality increases exponentially with age, M(t) ≈ M 0 exp (αt), where M 0 is the initial mortality rate (IMR). The Gompertz exponent α is determined by the regulatory network stiffness and is inversely proportional to the mortality rate doubling time (see Methods for a summary of the model). We predicted, however, that the exponential rise in mortality rates would cease at late ages, approaching a plateau determined at the value of α12.

High-quality mortality records13 were used to test the theoretical prediction. In that study, nematodes were subjected to various life-shortening stresses and had their lifespans reduced in such a way that any two survival curves could be transformed one into another by a simple rescaling of age. We computed the approximate values of the mortality rate doubling exponent using the data in mid-life and the mortality plateau estimates later in life for all the reported conditions (see Methods for details of the calculations). The results, summarized in Fig. 5, demonstrate a remarkably tight correlation between the parameters, in good semi-quantitative agreement with the theoretical calculation, across a life-span range of almost two orders of magnitude.

Figure 5 The plateau mortality \(M(t\gg \bar{t})\) versus the Gompertz exponent α calculated from the experimental C. elegans survival curves13. The predicted behavior is shown by the dotted line. The values of the average lifespan \(\bar{t}\) are depicted by the pseudo-color: red for small and green for large values. Full size image

Using the signature of aging to identify novel life-extending pharmacological interventions

The results presented so far confirm our conjecture of an association between aging and critical dynamics of the underlying regulatory network. Aging appears to be a consequence of intrinsic instability manifesting itself as lack of dynamic control over the expression of genes comprising the signature of aging. We therefore predicted that interventions exerting perturbations opposing the aging change in the animals would reduce the rate of aging and extend lifespan.

To illustrate the predictive value of our gene-expression signature of age, we attempted to identify novel life-extending pharmacological interventions by comparing the signature of aging from this study with gene expression profile changes in response to pharmacologic perturbations from the Connectivity Map (CMAP) database45,46. This is no trivial task since most of the available transcriptomic studies represent the results of experiments characterizing the effects of drug compounds in human cancer cell strains. We transformed the list of genes associated with aging in C. elegans into the form recognizable by CMAP and obtained a list of prospective medicines with gene expression signatures opposing the aging direction and thus presumably capable of reversing the progression of aging in these animals (see Methods for the necessary details). A similar approach for drug repurposing against aging has been already demonstrated, e.g., using human brain tissue transcriptomics as the input47 and in48 We observed, however, that the list of the predicted compounds turned out to be sensitive to minor variations in our worm-to-human gene conversion pipeline and the difference between the latest CMAP versions49.

From the top-10 list of predicted compounds (see Table 4) we selected 5 drugs, spanning the entire range of p-values: anisomycin, camptothecin, alsterpaullone, azacytidine and metamizole. Because camptothecin did not pass our initial screen for a sparing effect on aggregation (see Methods), it was not tested for lifespan effects. The other four compounds were assessed in the C. elegans lifespan assay using two concentrations (1 μM and 10 μM) at 20 °C; see Table 5 for a summary of the experimental results. All four compounds turned out to be more effective at the lower concentration, which suggests toxicity at the higher dose, probably due to off-target effects. In Fig. 6, we show the survival curves in the respective cohorts. Remarkably, temporal rescaling of survival curves accounts for all variation in survival of the stocks treated with these drugs at both concentrations (compare Fig. 6 to 7 and 8).

Table 4 A list of top-10 compounds predicted by CMAP to reverse the progression along the aging direction. Full size table

Table 5 Summary of survivals for the pharmacological interventions. Full size table

Figure 6 Pharmacological interventions: survivals and their temporal scaling. (a) The survival curves for Bristol-N2 DRM stocks subjected to pharmacological treatments with alsterpaullone, anisomycin, azacytidine, and metamizole at 1 μM. (b) The survival curves from (a) as a function of age rescaled by lifespan. In both panels, all markers are colored according to the measured lifespans of the strains (see Table 5) from red to green for the smaller and for the larger values of lifespans, respectively. Full size image

Figure 7 Pharmacological interventions: survivals and their temporal scaling. (a,c) The survival curves for Bristol-N2 DRM stocks subjected to pharmacological treatments with alsterpaullone, anisomycin, azacytidine and metamizole at 10 μM dose. (b,d) The survival curves from (a,c) respectively, as a function of age rescaled by lifespan. In both panels, all markers are colored according to the lifespans of the strains (see Table 5 for the lifespans): red for small and green for large lifespans. Full size image