Ian T Baldwin, Senior Editor; Max Planck Institute for Chemical Ecology, Germany

Jessica C Thompson, Senior and Reviewing Editor; Emory University, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Personality links with lifespan in chimpanzees" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Ian Baldwin as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The authors report a study that links personality with lifespan of a large sample of captive chimpanzees. They rely on a questionnaire method for assessing personality along six dimensions, and they use those data to find a relationship between personality and longevity. They find that for males longevity is related to agreeableness and for females to openness, and they discuss these results in terms of phylogenetic assumptions about human and great ape evolution.

Essential revisions:

Whereas the results are interesting, the reviewers and Reviewing Editor all have several major concerns with this study that cast doubt on the robustness of the theoretical framing, the methods, and as a consequence, the conclusions of the paper. A major revision would be required in order for publication in eLife, but as most of these involve incorporation of key discussion, literature, and modifications to the analytical approach, we feel these are feasible to accomplish within a two-month time frame. If you choose to revise and resubmit along the lines below, this will be a much stronger paper, and can be considered for publication in eLife. If these revisions are undertaken, it will have the potential to contribute not only an important dataset, but also to address some key theoretical problems in this area of research more broadly.

Below are the major points that must be addressed for the manuscript to be considered in revised form. Large sections from the original reviews have been pasted into these comments, as they contain many helpful suggestions about theoretical framing, literature, and approach. Although each point is described at length, the substance of the revision should not simply be a longer version of the existing manuscript. Instead, it should be only a slightly longer version that revises much of the background and discussion in light of the valuable insights from the reviewers below.

1) Title. Given some of the uncertainties in the data and conclusions that are raised below, the title should be modified to ensure it faithfully reflects these issues.

2) Phylogenetic assumptions. The central framing of the problem with respect to human evolution (Introduction, fourth paragraph) makes a standard phylogenetic assumption: that behavioral attributes found in both chimpanzees and humans should also have been shared by the last common ancestor. There is much debate over the utility of this approach for behavioral attributes, and over whether it suffers unduly from homoplasy – especially when there appear to be so many differences between chimpanzees and bonobos, with both equally related to humans (work by Sayers is relevant here). The authors should include discussion and literature that explains why they make the claim so strongly here, and why they claim to pinpoint the evolution of specific male and female personality attributes to such a specific period of time (since the split with hominins). This is especially the case when their research question (as currently stated) prioritizes similarities between humans and chimpanzees (rather than bonobos). The discussion that leads to this point also appears to somewhat answer the question in advance, because it is clear that the authors are already arguing that a link between personality and fitness is ancestral in primates (and therefore will also be linked in chimpanzees, humans, presumably bonobos, and their last common ancestor). This argument seeps back in later with the discussion of gorillas (Discussion, third paragraph), where they are referred to rather oddly (considering the phylogenetic argument up to this point and the actual genetic relatedness of gorillas and chimps versus chimps and bonobos or chimps and humans) as "close chimpanzee cousins". Thus, the phylogenetic argument should be very carefully constructed: what traits do they think are derived in chimpanzees, what traits ancestral to both chimpanzees and humans, what is the rationale, etc.?

3) Use of literature. The paper opens with a strong background discussion, but reads like a bank of examples rather than a coherent lead-up to a clear set of hypotheses (more on this below). Each cited study shows a suggestive tendency for proxies of fitness to link with some aspect of personality, but there is clear diversity within primates regarding how each of these relationships actually plays out (as well as in what fitness measures are used). This background captures some of the ambiguity and gaps in current knowledge, but it does not confront them head-on. The paper would increase its impact if it clearly describes the areas where there is more surety than others, and specifically where increased work is necessary and why. There is also a near-complete lack of citations to the non-primate animal personality literature. It is generally disconcerting that the primate personality and non-primate personality literatures don't often cite each other, but it is especially puzzling here since researchers working with short-lived, easily manipulated species like birds, fish, or insects, have much better means of studying links between behavioral differences and fitness outcomes, and can add a solid empirical basis to the theoretical framework needed in this study (see below).

4) Questionnaire method. There are some serious concerns about the questionnaire approach to assessing personality, but we recognize this is an approach that is used by some researchers and that the justification simply must be more robustly presented. Perhaps what is most problematic is that in this manuscript, the authors present this method as the method, thereby ignoring an enormous field of (ecological) studies that instead use an ethological approach and code actual behavior, or conduct experiments to test for consistency of behavior across time and context. Additionally, this field has proposed actual informed hypotheses about how personality influences life-history traits. The fact that the authors do not even refer to any of the work done by, to name only a few, Dingemanse, Réale and Sih, while talking about animal personality in light of evolution, is extremely problematic. Such studies also exist for chimpanzees (e.g. Uher and Asendorpf 2008; Koski 2011; Massen et al., 2013), and this too should be acknowledged. Personality is defined as "inter-individual differences that are consistent over time and context" (something that the authors do not mention), and whereas they report here on inter-individual differences, they do not report anything about these differences being consistent or repeatable. It is common (and good) practice to use a test-retest design to check for such consistency, yet in primatology, and especially when testing apes, this seems to be deemed unnecessary. Yet, as mentioned, this is how animal personality is defined and thus is very important. These authors (and others) tend to use the high inter-rater reliability they find as an argument against this. However, this is not the same as temporal consistency, and as mentioned before, this inter-rater reliability doesn't result from independent raters. Zoo keepers (and researchers alike) talk about their animals, and thus inadvertently but unavoidably, influence each other’s perception of, and consequently the ratings of, these animals. Further, all the references the authors use to validate the use of the rating method in the Materials and methods section (Weiss et al., 2009; King and Figueredo, 1997; Weiss, King and Hopkins, 2007; Herrelko, Vick and Buchanan-Smith, 2012; Vazire et al., 2007; Weiss, King and Figueredo, 2000; Wilson et al., 2017; Latzman et al., 2015; King, Weiss and Farmer, 2005; Weiss et al., 2012; Pederson, King and Landau, 2005; Latzman et al., 2015; Blatchley and Hopkins, 2010) are by people involved in this study, and thus not independent. In short, serious discussion needs to be undertaken to ensure the reader is aware that the questionnaire approach has its detractors, and of the basis of those critiques. This has the potential to make this a much stronger paper because by providing a balanced view it can simultaneously present new data and pre-empt the problem of citation divergence (whereby some research groups cite only from select literature and others from a different set, so that integration of these two literatures becomes compromised, to the detriment of the overall scientific aims).

5) Hypothesis testing. The hypotheses are not clearly set up from the start, they are not embedded in any relevant theory (more on this below), and they are post hoc in nature. It appears to be a study where a large sample was input into some analyses to see what patterns emerged, and then those patterns were explained after the fact. A better approach would be to structure the lead-up so that it is clear what would be expected under what circumstances (phylogenetic, environmental, life history, etc.) and then test those hypotheses. In addition to modifying the setup to create a more rigorous set of well-supported expectations, the question should be reworded to be more specific about chimpanzees and humans with respect to what traits should be linked with longevity. There is a good start to this discussion in the Introduction, and that could form the basis for a revised setup to the problem. There appear to be some expectations (Discussion, second paragraph), and these are discussed in an interesting way later, but the manuscript would be much stronger if these were clearly defined at the start and then systematically tested.

6) Theoretical framing. All predictions or interpretations are entirely based on previous empirical results, rather than derived from first principles. While this might be the norm in psychology, eLife is a biology journal, and evolutionary theory provides us with a framework from which to derive predictions about biological traits such as longevity and stable behavioral variation (that is presumably mediated by stable variation in neurobiology, metabolism, etc.). Thus, when examining links between consistent behavioral differences and longevity, an evolutionary biologist immediately thinks of life-history strategy as a possible underlying cause of both. Life history theory is especially relevant here as all sources of extrinsic mortality have been removed in this captive sample, and the chimpanzees are presumably dying because of intrinsic mortality; an individual's degree of investment in maintenance and repair, i.e. the things that reduce intrinsic mortality, is of course shaped by their life-history strategy, as are, arguably, consistent behavioral differences between individuals. Indeed, there is a large literature in other animals, under the heading of the 'pace-of-life syndrome', examining links between personality, longevity, and measures to reduce intrinsic mortality such as investment in immune function (e.g. Réale et al., 2010; Smith and Blumstein, 2008). This literature provides the kind of a priori predictions the current manuscript is lacking, such as certain personality dimensions being linked to longevity due to being part of a faster or slower life-history strategy. For example, achieving high dominance requires substantial investment in physical strength and muscle, associated with high testosterone levels and risk-taking behavior, which trade off with investment in immune function, etc., and are thus associated with a faster life-history strategy and higher extrinsic and intrinsic mortality (as is typical for most primate males compared to females); hence one would predict dominance to be negatively associated with longevity, not positively, similar to the general sex difference in dominance and longevity (see e.g. Kruger and Nesse, 2005, Human Nature, An evolutionary life history framework for understanding sex differences in human mortality rates). Conversely, an association between agreeableness and longevity is exactly what you would predict if agreeable individuals invest less in behavioral dominance and more in cooperation, which would be associated with a slower life-history strategy. Throughout the paper the authors speculate about possible causal links between personality traits and longevity (through 'controlling health', or 'health benefits conferred by intelligence'), which will need to be re-examined in light of theory that predicts both to be explained by a third variable (life-history strategy). For relevant arguments in humans, see e.g. several articles by Pepper and Nettle (2014 Human Nature, 2014 Applied evolutionary anthropology, 2017 Behavioral and Brain Sciences) that argue how a life history theory perspective can help explain variation in health behavior and thus SES-gradients in health. Of course, there are other evolutionary theories of personality (see e.g. Buss, 2009, How can evolutionary psychology successfully explain personality and individual differences?) but life-history theory provides the most direct link to longevity.

7) Use of a captive sample. The authors make strong claims about evolution and natural selection, yet test animals in a (non-natural) captive situation. As a consequence, selection pressures that have shaped evolution are being cancelled out and the effects of personality on longevity that the authors report are not informative for understanding the evolution of chimpanzees. For example, in this study there is no effect of extraversion (or boldness) on longevity, but it is obvious that such a trait may have an effect with actual predators around. Similarly, in the wild, where food is a limiting factor, dominance (which may not actually be a personality trait as it is not consistent if new opportunities arise) will have a major effect. As another example, the authors argue that "observed effects in captive chimpanzees will be more comparable to effects found in similar human studies than would effects observed in wild chimpanzees". However, they then go on to offer an evolutionary explanation that seeks to describe their results in terms of ancestral behavior and the environment of selection (Abstract): "natural selection, after the divergence of hominins, favored the protective effects of high quality social bonds for males and exploratory behavior for females." The relationships they observe in fact seem equally explicable as factors that promote longevity specifically in captive situations. Although the authors do well to note this possibility, they appear to dismiss it in favor of their preferred alternative. Where they do find a lack of concordance with their expectations, the authors quickly engage in a useful discussion about the effects of captivity, but seem to discard this argument when they discuss their positive results. These alternative explanations must be carefully explored, and test implications set out (with substantive literature support) in order to seriously treat (and not just dismiss) the very real possibility that the observed pattern has no bearing at all on natural selection. One reviewer notes that the captive sample can have its advantages, and these can be stressed. For example, the captive sample eliminates most extrinsic mortality, so that what remains is essentially how much individuals invest in maintenance and repair, which could well be related to their personality through life-history strategy (slow strategy = invest more = lower intrinsic mortality = 'nicer' personality). This still suffers from the problem that extrinsic mortality matters a lot in wild populations (and thus natural selection), but acknowledging these shortcomings, this study could be a good test of the idea that life-history strategy has consequences for both behavioral style and intrinsic mortality risk.

8) Analytical approach. a) It appears that the power analyses were conducted on the entire dataset (rather than pilot data), and thus constitute 'observed power'; this is unfortunately completely flawed and unnecessary. As demonstrated by Hoenig and Heisey (2001, Am Stat, The abuse of power: The pervasive fallacy of power calculations for data analysis) there is nothing to be gained from such a retrospective power analysis, and indeed such analyses may be entirely misleading. Power analysis only makes sense prospectively, using pilot data, and indeed eLife's transparent reporting form asks 'whether an appropriate sample size was computed when the study was being designed'. As this was not the case here, the power analyses should be removed.
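The Hoenig and Heisey argument can be illustrated with a minimal sketch. It assumes a two-sided z-test at alpha = 0.05, which is a simplification and not the authors' actual analysis, and the function name `observed_power` is hypothetical. Under these assumptions, 'observed power' is a one-to-one function of the P-value and so carries no additional information; a result exactly at the significance threshold corresponds to an observed power of about 0.5.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def observed_power(p_value, alpha=0.05):
    """Post hoc power of a two-sided z-test, treating the observed
    effect as if it were the true effect (illustrative assumption)."""
    # Recover the absolute z statistic implied by the two-sided P-value.
    z_obs = _STD_NORMAL.inv_cdf(1 - p_value / 2)
    # Critical value for the chosen significance level.
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Probability of rejecting in either tail if the true effect equals z_obs.
    upper_tail = 1 - _STD_NORMAL.cdf(z_crit - z_obs)
    lower_tail = _STD_NORMAL.cdf(-z_crit - z_obs)
    return upper_tail + lower_tail


# Observed power is fully determined by the P-value.
for p in (0.01, 0.05, 0.077, 0.20):
    print(f"P = {p:.3f} -> observed power = {observed_power(p):.2f}")
```

Because the mapping is monotonic, a non-significant result always yields low observed power, which is why it cannot be used to interpret a null finding.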

b) I have to disagree with the dismissal of an age-confound on agreeableness based on a non-significant P-value of 0.077. P-value thresholds are arbitrary conventions, and when there is an age pattern – the correlation of -0.08 is about as strong as the one for neuroticism at 0.09 – it should be controlled for, especially when one of the main findings is about an association between agreeableness and longevity. And while I appreciate that the authors fit several possible age models to the personality dimensions that did have significant correlations with age, I also disagree with selecting a single best model based on AICc (as the authors know, information criteria are better used to weight models and average predictions rather than select a single model [unless it receives all the weight]). Furthermore, polynomials are not ideal, and I would suggest using a spline term for age (using GAM) instead, which obviates the need to compare linear vs. non-linear fits. Incidentally, the fact that the best fit for the age effect on most personality dimensions was non-linear refutes the use of simple correlations. I would thus strongly suggest using GAM residuals for each personality dimension. As an aside, I was confused as to why date of birth rather than biological age was used.
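The suggestion to weight models rather than select one can be sketched as follows. This is a minimal illustration only: the function names and the AICc values are hypothetical and are not taken from the manuscript, and the spline (GAM) fit itself is not shown. Each candidate model receives an Akaike weight proportional to exp(-0.5 × ΔAICc), and predictions are averaged using those weights.

```python
import math


def aicc(log_likelihood, n_params, n_obs):
    """Small-sample corrected AIC (requires n_obs > n_params + 1)."""
    aic = 2 * n_params - 2 * log_likelihood
    correction = (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return aic + correction


def akaike_weights(scores):
    """Convert a list of AICc scores into weights that sum to 1."""
    best = min(scores)
    # Relative likelihood of each model given its distance from the best score.
    relative = [math.exp(-0.5 * (score - best)) for score in scores]
    total = sum(relative)
    return [value / total for value in relative]


def model_average(predictions, weights):
    """Weighted average of one prediction per candidate model."""
    return sum(p * w for p, w in zip(predictions, weights))


# Hypothetical AICc scores for linear, quadratic and cubic age models.
scores = [412.3, 410.9, 411.6]
weights = akaike_weights(scores)

# Hypothetical predicted trait scores at one age from the three models.
averaged = model_average([4.1, 4.4, 4.3], weights)
print([round(w, 3) for w in weights], round(averaged, 3))
```

When the scores are this close, no model receives all the weight, which is the situation in which selecting a single 'best' model discards information.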

9) Data accessibility. Two of the reviewers also expressed concern that the entire dataset may not be de-identified and available in published form (by assigning an ID to individual chimpanzees and facilities). The editorial staff also had a discussion about the submission's compliance with the open-access policy of eLife. It is not clear to what extent the results can be precisely reproduced, given a lack of access to the full dataset that was used in the analysis. For example, how can personality links with lifespan be replicated without mortality data for the same individuals for which the personality attributes are known? Knowing social relationships, group size, etc. is also important because certain personality traits may be much more important in some specific settings than others.