The data on humanity’s survival time could be subject to survivorship bias. If early Homo sapiens requires a long period of time to develop the intellectual machinery needed to make scientific observations, then such observations could not include short evolutionary histories, regardless of the extinction rate. The amount of information we could derive from a long track record of survival would therefore be limited due to this observation selection effect. Such a track record could indicate a low extinction rate, or be the byproduct of lucky ancestors surviving high extinction rates long enough to beget progeny capable of making scientific observations. One might therefore object that the bounds on the extinction rate we have estimated are too low [12, 23]. Here, we examine and respond to this concern.

Models to quantify potential sample bias

To model observation selection bias, let us assume that after Homo sapiens first arises another step must be reached. This could represent the origin of language, writing, science, or any relevant factor that would transition early humans into the reference class of those capable of making observations (we call this step ‘observerhood’). Let this step be a random variable denoted S, with cumulative distribution function \(F_S(t)\). As we are examining natural risks, we assume that S and T are independent. The probability that humanity survives long enough to reach observerhood status (via intelligence, language, writing, science, etc.) can be found with the following integral:

$$P(T > S)={\int }_{0}^{\infty }\,{f}_{T}(t){F}_{S}(t)dt$$ (1)

where \(f_T(t) = \mu e^{-\mu t}\) is the probability density of extinction at time t. We evaluate an adjusted likelihood function \({ {\mathcal L} }^{\ast }(\mu |T > t)\), i.e. the likelihood of an extinction rate μ given that humanity has survived to time t, conditional on the existence of observers (T > S). This results in the adjusted likelihood function:

$${ {\mathcal L} }^{\ast }(\mu |T > t)=P(T > t|T > S,\mu )$$ (2)

$$=\,\frac{1}{c}{\int }_{t}^{\infty }\,{f}_{T}(s){F}_{S}(s)ds$$ (3)

where c = P(T > S) is a normalising constant. We evaluate a model with four variations for the observerhood step: a model in which observerhood occurs as a single event that has a constant rate over time, a model with an increasing rate over time, a model with multiple steps, and a model where observerhood simply requires a fixed amount of time.
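As a rough numerical sketch of Eqs. (1) to (3), the adjusted likelihood can be estimated for any observerhood distribution by simple quadrature. The function name `adjusted_likelihood`, the midpoint rule, the substitution u = exp(−μs), and the placeholder exponential CDF used in the demo are all our own illustrative choices rather than anything prescribed by the text.

```python
import math

def adjusted_likelihood(mu, t, cdf_s, n=20000):
    """Midpoint-rule estimate of L*(mu | T > t) from Eqs. (1)-(3).

    Substituting u = exp(-mu * s) turns the integral of f_T(s) * F_S(s)
    over s in [t, inf) into the integral of F_S(-ln(u) / mu) over
    u in (0, exp(-mu * t)], which is a finite interval.
    """
    def integral_up_to(a):
        h = a / n
        return h * sum(cdf_s(-math.log((i + 0.5) * h) / mu) for i in range(n))

    c = integral_up_to(1.0)                       # Eq. (1): P(T > S)
    return integral_up_to(math.exp(-mu * t)) / c  # Eq. (3)

# Placeholder observerhood CDF for the demo (an assumption): exponential waiting time.
THETA = 1e-5

def demo_cdf(s):
    return 1.0 - math.exp(-THETA * s)

demo_value = adjusted_likelihood(mu=1e-5, t=200_000, cdf_s=demo_cdf)
# With instantaneous observerhood (F_S = 1 everywhere) there is no selection effect.
no_bias_value = adjusted_likelihood(mu=1e-5, t=200_000, cdf_s=lambda s: 1.0)
```

When F_S is identically 1 the estimate reduces to exp(−μt), the unadjusted likelihood. The sketch assumes exp(−μt) does not underflow to zero, so it is only suitable for moderate values of μt.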

If desired, we could more crisply define this observerhood property as the ability for a species to collect reliable data on its own track record of survival (e.g. via fossil dating) and analyse it. When correcting for observation selection effects, we are simply conditioning on the fact that our species has developed the ability to conduct this analysis. The observerhood property need not invoke consciousness or be the property of a biological species—a machine estimating a parameter would need to account for observer selection bias if its ability to make such estimates were correlated with the parameter in question.

Model 1: single step, constant rate

Our first model assumes that observerhood has a constant rate of occurrence θ, so that S is exponentially distributed with cumulative distribution function \(F_S(t) = 1 - e^{-\theta t}\). This model describes a process in which the transition from early humans into observers occurs by chance as a single step. This could represent the hypothesis that hierarchical language emerged in humans as the byproduct of a chance mutation [24]. With this model, the probability that observers arrive before extinction is \(P(T > S) = \theta (\theta + \mu)^{-1}\). Our likelihood function can be analytically derived:

$${ {\mathcal L} }^{\ast }(\mu |T > t)=(\frac{\theta +\mu }{\theta }){\int }_{t}^{\infty }\,\mu {e}^{-\mu s}(1-{e}^{-\theta s})ds$$ (4)

$$=\,(\frac{\theta +\mu }{\theta }){e}^{-\mu t}-(\frac{\mu }{\theta }){e}^{-(\mu +\theta )t}$$ (5)
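As a sketch of the intermediate step between Eqs. (4) and (5), the integral splits into two elementary exponential integrals. The two limits noted at the end are our own check and follow directly from Eq. (5).

$$\int_{t}^{\infty}\mu e^{-\mu s}\,ds = e^{-\mu t},\qquad \int_{t}^{\infty}\mu e^{-(\mu+\theta)s}\,ds = \frac{\mu}{\mu+\theta}\,e^{-(\mu+\theta)t}$$

$$\Rightarrow\quad { {\mathcal L} }^{\ast }(\mu |T > t) = \frac{\theta+\mu}{\theta}\left[e^{-\mu t} - \frac{\mu}{\mu+\theta}\,e^{-(\mu+\theta)t}\right] = \frac{\theta+\mu}{\theta}\,e^{-\mu t} - \frac{\mu}{\theta}\,e^{-(\mu+\theta)t}$$

As θ → ∞ the second term vanishes and the prefactor tends to 1, giving \(e^{-\mu t}\). As θ → 0, expanding \(e^{-\theta t} \approx 1 - \theta t\) gives the finite limit \((1 + \mu t)\,e^{-\mu t}\).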

Model 2: single step, increasing rate

Our second model similarly assumes that a single step is needed but that the rate of observerhood increases over time. This model could represent increasing population size or population density, which could in turn drive cultural evolution and increase the probability of such a step [25]. We represent this with a Weibull distribution with cumulative distribution function \({F}_{S}(t)=1-{e}^{-{(\theta t)}^{k}}\) where k > 1 indicates increasing rate over time (when k = 1, this is the same as the exponential in Model 1). We use numerical integration to evaluate the likelihood function.
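One way the numerical integration for Model 2 might be carried out is sketched below. The quadrature scheme (a midpoint rule after substituting u = exp(−μs)), the function names, and the parameter values are our own assumptions for illustration; the text does not say which integration method was used. The closed form of Eq. (5) is included only to cross-check the k = 1 case.

```python
import math

def weibull_cdf(s, theta, k):
    """Model 2 observerhood CDF: F_S(s) = 1 - exp(-(theta * s)^k)."""
    return -math.expm1(-((theta * s) ** k))

def likelihood_model2(mu, t, theta, k, n=20000):
    """Adjusted likelihood L*(mu | T > t) for Model 2 by midpoint quadrature.

    The substitution u = exp(-mu * s) maps s in [t, inf) onto
    u in (0, exp(-mu * t)], so both integrals are over finite intervals.
    """
    def integral_up_to(a):
        h = a / n
        return h * sum(
            weibull_cdf(-math.log((i + 0.5) * h) / mu, theta, k) for i in range(n)
        )
    return integral_up_to(math.exp(-mu * t)) / integral_up_to(1.0)

def likelihood_model1(mu, t, theta):
    """Closed form of Eq. (5), used here only as a k = 1 cross-check."""
    return ((theta + mu) / theta * math.exp(-mu * t)
            - mu / theta * math.exp(-(mu + theta) * t))

T_OBS = 200_000  # years of survival
# Arbitrary illustrative parameter values.
numeric_k1 = likelihood_model2(6.9e-5, T_OBS, theta=5e-5, k=1)
closed_k1 = likelihood_model1(6.9e-5, T_OBS, theta=5e-5)
numeric_k3 = likelihood_model2(6.9e-5, T_OBS, theta=5e-5, k=3)
```

With k = 1 the numerical value should agree with Eq. (5) to within quadrature error, which gives a quick sanity check before moving to k > 1.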

Model 3: multiple steps, constant rate

Our third model assumes that there are multiple steps that need to occur in a sequence in order to get observers. This could represent more incremental development of tools, culture, or language. We assume that each step is exponentially distributed with rate θ, so that the timing of the final kth step follows an Erlang distribution with cumulative distribution function:

$${F}_{S}(t)=1-\sum _{n=0}^{k-1}\,\frac{1}{n!}{e}^{-\theta t}{(\theta t)}^{n}.$$ (6)

Note that when k = 1, the distribution is the same as the exponential in Model 1. We use numerical integration to evaluate the likelihood function.
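The same quadrature idea can be sketched for Model 3 using the Erlang distribution of Eq. (6). As before, the midpoint rule, the function names, and the parameter values are illustrative assumptions. The k = 2 closed form is a hand derivation added purely as a cross-check and does not appear in the text.

```python
import math

def erlang_cdf(s, theta, k):
    """Eq. (6): F_S(s) = 1 - sum_{n=0}^{k-1} exp(-theta*s) * (theta*s)^n / n!."""
    x = theta * s
    term, total = 1.0, 1.0          # n = 0 term
    for n in range(1, k):
        term *= x / n               # builds (theta*s)^n / n! incrementally
        total += term
    return 1.0 - math.exp(-x) * total

def likelihood_model3(mu, t, theta, k, n=20000):
    """Adjusted likelihood for Model 3 via midpoint quadrature in u = exp(-mu * s)."""
    def integral_up_to(a):
        h = a / n
        return h * sum(
            erlang_cdf(-math.log((i + 0.5) * h) / mu, theta, k) for i in range(n)
        )
    return integral_up_to(math.exp(-mu * t)) / integral_up_to(1.0)

def likelihood_k2_closed_form(mu, t, theta):
    """Hand-derived k = 2 case (our own check, not taken from the text)."""
    lam = mu + theta
    numerator = math.exp(-mu * t) - mu * math.exp(-lam * t) * (
        1 / lam + theta * t / lam + theta / lam**2
    )
    return numerator / (theta / lam) ** 2   # normaliser P(T > S) = (theta/lam)^2

# Arbitrary illustrative parameter values.
numeric_k2 = likelihood_model3(1e-5, 200_000, theta=1e-5, k=2)
closed_k2 = likelihood_k2_closed_form(1e-5, 200_000, theta=1e-5)
```

For very small θs the expression in `erlang_cdf` loses some precision to cancellation, which is acceptable for a sketch but would deserve more care in production code.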

Model 4: fixed time requirement

Our final model assumes that it takes a fixed amount of time τ to reach observerhood. This is an extreme model that allows for no chance, but could represent a gradual and deterministic accumulation of traits. The probability that observerhood has been reached before time t is therefore \(F_S(t) = {1}_{[t > \tau ]}\), the indicator function that takes the value 1 when t > τ and 0 otherwise. The probability that humanity survives past time τ is \(1 - F_T(\tau) = e^{-\mu \tau}\). Our likelihood function of μ is:

$${ {\mathcal L} }^{\ast }(\mu |T > t)=\frac{1}{{e}^{-\mu \tau }}{\int }_{t}^{\infty }\,\mu {e}^{-\mu s}{1}_{[s > \tau ]}ds$$ (7)

$$=\,{e}^{-\mu (t-\tau )}.$$ (8)

This likelihood expression can also be derived using the memoryless property of the exponential. It is worth noting that the fixed time model is a limiting case for both the increasing rate model and the multiple steps model. Taking the limit of Model 2 as k → ∞ results in a fixed time model with \(\tau = \theta^{-1}\). Similarly, Model 3 converges to a fixed time model as the number of steps increases and the expected time of each step decreases (having infinitely many steps in the limit, each of which is infinitely short).
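The memoryless route to Eq. (8) can be sketched as follows, assuming t ≥ τ (for t < τ the conditional probability is simply 1). Under the fixed time model, conditioning on T > S is the same as conditioning on T > τ, so:

$$P(T > t\,|\,T > \tau ,\mu ) = \frac{P(T > t\,|\,\mu )}{P(T > \tau \,|\,\mu )} = \frac{e^{-\mu t}}{e^{-\mu \tau }} = e^{-\mu (t-\tau )} = P(T > t-\tau \,|\,\mu )$$

In other words, once the first τ years are taken as given, only the remaining t − τ years of the track record carry information about μ.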

Results of sample bias models

We evaluate the likelihood of extinction rates between \(10^{-8}\) and \(10^{-2}\), given a human survival time of 200 kyr and a wide range of different rates at which observers could originate (Fig. 2). The first thing to note about the first three models is that when the observerhood rates are sufficiently rapid, the likelihood function converges to the unbiased version in the previous section. This can be verified by taking limits: for all of the models as θ → ∞ (or τ → 0 in the case of the fixed time model), \({ {\mathcal L} }^{\ast }(\mu |T > t)\to {e}^{-\mu t}\). If observerhood is expected to occur quickly, then we can take a 200 kyr track record of survival at face value and estimate the extinction rate without observation selection bias.

Figure 2: Models of observer selection bias. Surface plots show likelihood for combinations of μ and θ (where k = 3 for Models 2 and 3) or τ in Model 4. Upper right-hand plots show how likelihood shifts when θ → 0 in Model 1, and for a variety of k values in Models 2 and 3. For the first three models, the unbiased model is recovered for large θ, and results start to become biased as the expected observerhood time approaches humanity’s track record of survival. However, even as θ → 0, the bias is limited, and the likelihood of rates exceeding \(10^{-4}\) remains effectively zero. This is only violated in the final fixed time model, or in Models 2 and 3 when k is sufficiently large.

However, as the observerhood rates decrease to the point where the expected observerhood time comes within an order of magnitude of 200 kyr, observer selection bias emerges. Rates that were previously ruled out by our track record of survival are assigned higher likelihoods, since a portion of the track record is a necessity for observers (Fig. 2). For example in Model 1, when \(\theta = 5 \times 10^{-5}\) (corresponding to an expected observerhood time of 20 kyr), the relative likelihood of \(\mu = 6.9 \times 10^{-5}\) is increased by a factor of 2.3 (from \(10^{-6}\) to \(2.3 \times 10^{-6}\)). To get a likelihood of \(10^{-6}\) (corresponding to the most conservative upper bound), the rate must be set at \(7.3 \times 10^{-5}\) (see all adjusted bounds in Table 2). Interestingly though, this effect is limited. Even as observerhood rates slow to the point where expected observerhood time greatly exceeds 200 kyr (for example exceeding 20 billion years), the revised upper bounds remain within a factor of 2 of the original bounds. The stricter the bound, the weaker the potential bias: for example the \(10^{-6}\) likelihood bound is only changed by a factor of about 1.2 in the limit as θ → 0. Although there would be some sample bias, there is a hard ceiling on how much our track record of survival can be distorted by observation selection effects.
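The adjusted bounds can be approximated by solving the Model 1 likelihood of Eq. (5) for the rate at which it equals a chosen level. The sketch below uses bisection, which is our own choice of method, and a rearranged but algebraically equivalent form of Eq. (5) that stays accurate for very small θ. The function names and the value θ = 10⁻¹² standing in for the limit θ → 0 are illustrative assumptions.

```python
import math

T_OBS = 200_000.0  # years of survival assumed in the text

def likelihood_model1(mu, theta, t=T_OBS):
    """Eq. (5), rearranged as exp(-mu*t) * (1 + (mu/theta) * (1 - exp(-theta*t)))
    so that it stays accurate when theta is tiny."""
    return math.exp(-mu * t) * (1.0 - (mu / theta) * math.expm1(-theta * t))

def upper_bound(theta, level=1e-6, lo=1e-8, hi=1e-2):
    """Bisect for the rate mu at which the adjusted likelihood equals `level`.

    The likelihood decreases monotonically in mu, so bisection is safe.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if likelihood_model1(mid, theta) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

unbiased = math.log(1e6) / T_OBS             # solves exp(-mu * t) = 1e-6
bound_20kyr = upper_bound(theta=1 / 20_000)  # expected observerhood time of 20 kyr
bound_slow = upper_bound(theta=1e-12)        # stand-in for the limit theta -> 0

print(f"unbiased bound:       {unbiased:.2e}")
print(f"20 kyr observerhood:  {bound_20kyr:.2e}")
print(f"theta -> 0, ratio:    {bound_slow / unbiased:.2f}")
```

Under these assumptions the 20 kyr case gives a bound between 7.3 × 10⁻⁵ and 7.4 × 10⁻⁵, and the ratio of the slow-observerhood bound to the unbiased one comes out near 1.2, in line with the ceiling described above.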

Table 2: Upper bounds of μ with Model 1 bias.

The reason slow rates of observerhood have a limited impact on our estimates is that, if the extinction rate were exceptionally high, the lucky humans that do successfully survive to observerhood will have achieved such a status unusually quickly, and therefore will still observe a very short track record of survival. A long track record of survival is therefore still sufficient to rule out high extinction rates paired with low observerhood rates. We can demonstrate this by examining the typical time it takes for lucky survivors to reach observerhood, assuming a high extinction rate and a low observerhood rate. For example, in the single step constant rate model when \(\theta = 10^{-6}\) (corresponding to an expected observerhood time of 1 Myr) and \(\mu = 10^{-3}\) (corresponding to a typical extinction time of 1000 years), the expected observerhood time conditional on surviving these high extinction rates is about 1000 years. A typical observer will thus still have a very short track record of survival. Models with increasing rates or multiple steps exhibit the same property, although the bias is larger depending on parameter k. For both Models 2 and 3 with \(\theta = 10^{-6}\), \(\mu = 10^{-3}\), and k = 2 (parameters normally corresponding to an expected observerhood time of 830 kyr for Model 2 and 2 Myr for Model 3), the high extinction rates will still result in a typical observer emerging unusually early and having only about a 2000 year track record of survival. This can also be seen in Fig. 2 where, for Models 1, 2, and 3, extinction rates exceeding \(10^{-4}\) are still assigned low likelihood regardless of θ.
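The conditional observerhood times quoted above can be checked numerically by weighting each observerhood density by the probability exp(−μs) that extinction has not yet occurred. The following is a minimal sketch; the midpoint rule, the truncation point 50/μ, and all function names are our own assumptions.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def conditional_mean_observerhood(pdf_s, mu):
    """E[S | S < T] when T is exponential with rate mu and independent of S.

    The weight pdf_s(s) * exp(-mu * s) is the density of S times the chance
    that extinction has not happened by time s. The integrals are truncated
    at 50 / mu, beyond which exp(-mu * s) is negligible.
    """
    s_max = 50.0 / mu

    def weight(s):
        return pdf_s(s) * math.exp(-mu * s)

    return midpoint(lambda s: s * weight(s), 0.0, s_max) / midpoint(weight, 0.0, s_max)

theta, mu, k = 1e-6, 1e-3, 2

def exponential_pdf(s):   # Model 1
    return theta * math.exp(-theta * s)

def weibull_pdf(s):       # Model 2
    return k * theta * (theta * s) ** (k - 1) * math.exp(-((theta * s) ** k))

def erlang_pdf(s):        # Model 3
    return theta**k * s ** (k - 1) * math.exp(-theta * s) / math.factorial(k - 1)

mean_model1 = conditional_mean_observerhood(exponential_pdf, mu)
mean_model2 = conditional_mean_observerhood(weibull_pdf, mu)
mean_model3 = conditional_mean_observerhood(erlang_pdf, mu)
```

For Model 1 the result can also be read off analytically: conditional on S < T, the observerhood time is exponential with rate θ + μ, so its mean is 1/(θ + μ), just under 1000 years for these parameters.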

However, severe observer selection bias can occur in Models 2 and 3 as k becomes larger, shaping the observerhood distribution such that early observerhood is vanishingly unlikely and late observerhood almost guaranteed. In the most extreme case this is represented by the fixed time model, where the probability of observerhood jumps from 0 to 1 when t = τ (the fixed time model is also the limiting case when k → ∞). If that fixed amount of time is long enough (say, exceeding 190 or 195 kyr), a 200 kyr track record of survival is no longer sufficient to rule out extinction rates greater than \(10^{-4}\). This result occurs as the fixed time model prohibits any possibility of observerhood occurring unusually quickly. Any lineage of Homo sapiens lucky enough to survive long enough to obtain observer status must necessarily have a survival time greater than τ, which means that being an observer with a survival time of τ conveys zero information about the extinction rate.
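A short sketch of Eq. (8) makes the effect of a long fixed time concrete. The function name and the particular τ values are illustrative choices; the clamp at zero reflects our reading that the likelihood is 1 whenever t < τ.

```python
import math

def likelihood_model4(mu, t, tau):
    """Eq. (8): L*(mu | T > t) = exp(-mu * (t - tau)) for t >= tau.

    For t < tau every observer has necessarily survived past t, so the
    conditional probability is 1.
    """
    return math.exp(-mu * max(t - tau, 0.0))

T_OBS = 200_000  # years of survival
for tau in (0, 100_000, 190_000, 195_000):
    print(tau, likelihood_model4(1e-4, T_OBS, tau))
```

With τ = 195 kyr the likelihood of μ = 10⁻⁴ is exp(−0.5) ≈ 0.61, compared with exp(−20) ≈ 2 × 10⁻⁹ when τ = 0, so the track record no longer discriminates against that rate.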

For numerous reasons, we find the fixed time model to be implausible. Virtually all biological and cultural processes involve some degree of contingency, and there is no fundamental reason to think that gaining the ability to make scientific observations would be any different. To illustrate a comparison, let us consider a world in which the extinction rate is \(10^{-4}\) (averaging one extinction every 10,000 years), but observerhood status takes a fixed 200 kyr. Under this model, humanity successfully surviving long enough to reach observer status is an event with roughly a 1 in 500 million chance (\(e^{-20} \approx 2 \times 10^{-9}\)). Given observation selection bias, we cannot rule out the possibility of rare events that are required for our observations. But we could ask why a 1 in 500 million chance event could not also include the possibility that modern human observers would emerge unusually rapidly. Language, writing, and modern science are perhaps highly unlikely to develop within ten thousand years of the first modern humans, but it seems exceptionally overconfident to put the odds at fewer than 1 in 500 million.

A similar line of reasoning can be applied to determine whether the increasing rate and multiple step models with high k are reasonable. We test this by asking what parameters would be needed to expect a 200 kyr track record of survival with an extinction rate at our conservative upper bound of \(\mu = 6.9 \times 10^{-5}\). For the increasing rate model, observerhood is expected after 203 kyr with \(\theta = 10^{-7}\) and k = 14, and for the multiple step model, observerhood is expected after 190 kyr with \(\theta = 10^{-7}\) and k = 16. Although these models do not assign strictly zero probability to early observerhood times, the probabilities are still vanishingly small. With an increasing rate and these parameters, observerhood has less than a one in a trillion chance of occurring within 10,000 years (\(3.4 \times 10^{-14}\)), and about a 1% chance of occurring within 100,000 years. With multiple steps and these parameters, observerhood has less than a one in a trillion chance of occurring within 10,000 years (\(5.6 \times 10^{-17}\)), and less than a 0.02% chance of occurring within 100,000 years. In a similar fashion to the fixed time model, we feel that these models exhibit unrealistic levels of confidence in late observerhood times.
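The increasing rate figures can be explored with a short numerical sketch. It rests on two assumptions of ours that the text does not state explicitly: that the quoted times and probabilities are conditional on observers arising before extinction (S < T), and that the extinction rate is the unrounded bound μ = ln(10⁶)/200 kyr ≈ 6.9 × 10⁻⁵. Only Model 2 is shown, and the midpoint rule and the truncation at 3 Myr are illustrative choices.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

mu = math.log(1e6) / 200_000   # assumed unrounded value of the 6.9e-5 bound
theta, k = 1e-7, 14            # Model 2 parameters quoted in the text

def weight(s):
    """Weibull density f_S(s) times the survival probability exp(-mu * s)."""
    x = theta * s
    return k * theta * x ** (k - 1) * math.exp(-(x ** k)) * math.exp(-mu * s)

S_MAX = 3_000_000.0            # truncation point; the weight is negligible beyond it
norm = midpoint(weight, 0.0, S_MAX)

mean_cond = midpoint(lambda s: s * weight(s), 0.0, S_MAX) / norm
p_within_10kyr = midpoint(weight, 0.0, 10_000.0) / norm
p_within_100kyr = midpoint(weight, 0.0, 100_000.0) / norm

print(f"conditional mean observerhood time: {mean_cond / 1000:.0f} kyr")
print(f"P(observerhood within 10 kyr):      {p_within_10kyr:.1e}")
print(f"P(observerhood within 100 kyr):     {p_within_100kyr:.3f}")
```

Under these assumptions the sketch gives a conditional mean close to 203 kyr, a probability on the order of 10⁻¹⁴ within 10,000 years, and roughly a 1% probability within 100,000 years.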

Although the plausibility of the fixed time (or nearly fixed time) models is hard to test directly, the wide variance in the emergence of modern human behaviour across geography offers one source of data that can test their plausibility. The Upper Palaeolithic transition occurred about 45 kya in Europe and Western Asia, marked by the widespread emergence of modern human behaviour [25] (e.g. symbolic artwork, geometric blades, ornamentation). But strong evidence exists for the sporadic appearance of this modern human behaviour much earlier in parts of Africa [26, 27], including evidence of artwork and advanced tools as early as 164 kya [28]. Although numerous factors could have prevented the Upper Palaeolithic transition from occurring quickly, the fact that some human communities made this transition more than 100 kyr earlier than the rest of humanity indicates that a much earlier development trajectory is not entirely out of the question.

In summary, observer selection effects are unlikely to introduce major bias to our track record of survival as long as we allow for the possibility of early observers. Deceptively long track records of survival can occur if the probability of early observers is exceptionally low, but we find these models implausible. The wide variance in modern human behaviour is one source of data that suggests our track record is unlikely to be severely biased. We can also turn to other sources of indirect data to test for observer selection bias.