This article documents my current thoughts on how to make the most out of my experiment with earning to give. It draws together a number of texts by other authors that have influenced my thinking and adds some more ideas of my own for a bundle of heuristics that I currently use to make donation decisions. I hope the article will make my thinking easier to critique, will help people make prioritization decisions, will inspire further research into the phenomena that puzzle me, and will allow people to link the right books or papers to me.

Content note: This article is written for an audience of people familiar with cause prioritization and other important effective altruism concepts and who have considered arguments for the importance of long-term impact. To others it may seem confusing unless they get their bearings first through reading such books as Doing Good Better and Superintelligence.

I’m aware of many arguments for the great importance of long-term strategic planning, but when “long-term” refers to millennia and more, it becomes unintuitive that we should be able to influence it today with anything but minuscule probability. I’m hoping to avoid basing my work on Pascal’s Mugging–type scenarios, where enormous upsides are used to compensate for their minuscule probability to still yield a high expected value. So I need to assess when, in what cases, and for how long we can influence the long term significantly and with macroscopic probabilities.

Then I need heuristics to compare, very tentatively, interventions that cannot rely on feedback cycles at all or can rely only on feedback cycles around unreliable proxy measures.

Finally, I would like to be able to weigh assured short-term impact (in the sense of “a few centuries at best”) against this long-term impact and develop better heuristics for understanding today which strategies are more likely or unlikely to have long-term impact.

Reservations

The heuristics I’m using are not yet part of some larger model that would make it apparent when there are gaps in the reasoning. So there probably are huge gaps in the reasoning. In particular, I came up with or heard about these heuristics during a time when I was getting increasingly excited about Wild Animal Suffering Research, so I may have inadvertently selected them for relevance to that intervention or even for being favorable to it. I’m completely ignoring considerations related to infinity, the simulation hypothesis, and faster-than-light communication or travel for now. The simulation hypothesis in particular may be an argument for why even slight risk aversion or a slight quantization of probability can lead to a greater focus on the short term.

The importance, neglectedness, tractability (INT) framework (or scale, crowdedness, solvability) provides good guidance for comparing between problems (also referred to as causes). It can be augmented with heuristics that make it easier to assess the three factors.

One addition is a heuristic or consideration for assessing scale (and plausibly tractability) that occurred to me and that seems to be very important provided that we are able to make our influence felt over sufficiently long timescales.

I care primarily about reducing suffering (in the broad sense of suffering-focused ethics, not specifically negative utilitarianism), so I want my trajectories to point downward. Suffering of humans from poverty-related diseases has long been declining. If I were to get active in that area and actually succeed quite incredibly, the result might look like the chart below.

Here the sum of the two darker areas is the counterfactual suffering over time and the lighter area is the difference that I’ve made compared to the counterfactual without me.

But consider the case where I pick out something that I think has an increasing trajectory:

When we zoom out, this starts to look really big:

In this simplified model, the impact of interventions that can and conceivably will solve the issue they’re addressing can be curtailed by intersecting with y = 0 if our influence is sufficiently durable.
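The curtailment idea can be made concrete in a toy simulation. Everything here is illustrative – the trajectories, slopes, and horizon are made-up numbers – but it shows how clipping trajectories at zero caps the cumulative impact of work on a shrinking problem while leaving the impact of work on a growing problem uncapped:

```python
# Toy model with made-up numbers: cumulative impact of nudging the slope
# of a suffering trajectory. Trajectories are clipped at zero because
# suffering can't go negative -- this is the "intersecting with y = 0".

def cumulative_impact(base_slope, nudge, start=100.0, horizon=200):
    """Sum, over all periods, the suffering averted by changing the
    trajectory's slope from base_slope to base_slope + nudge."""
    total = 0.0
    for t in range(horizon):
        counterfactual = max(0.0, start + base_slope * t)
        nudged = max(0.0, start + (base_slope + nudge) * t)
        total += counterfactual - nudged
    return total

# The same-size nudge applied to a downward vs. an upward trajectory:
downward = cumulative_impact(base_slope=-1.0, nudge=-0.5)  # shrinking problem
upward = cumulative_impact(base_slope=+1.0, nudge=-0.5)    # growing problem
```

Under these assumptions the nudge to the upward trajectory averts several times more suffering, because the downward trajectory’s impact is curtailed once both curves hit zero.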

This is a heuristic of scale and tractability. It is straightforward that the scale of a problem that is nigh infinite over all time is greater than that of a problem that is finite.

In practice, we will probably see an asymptotic approach toward the x-axis as the cost-effectiveness of more work in the area drops further and further. This is where tractability comes in. As you approach zero disutility within some reference class, marginal cost-effectiveness usually decreases because the remaining sources of disutility are increasingly tenacious. If our influence is still felt at that point, it’ll rapidly become less important – something that does not happen (and may even reverse) in the case of interventions that try to dampen the exacerbation of a problem. So if I have the chance to affect some trajectory by 1° in a suffering-reducing direction, I’d rather affect an upward trajectory than a downward one, which will afterwards at most flatten out.

Some EAs have complained that they set out to make things better and now they’re just trying to prevent them from getting much worse. But perhaps that’s just the more impactful strategy.

I’ve often seen urgency considered, but not as part of the INT framework. I think it is a heuristic of tractability. When there is an event that is important to prevent, and the event is of such a nature that preventing it is all-important but there’s nothing left to do if and once it happens, then all the leverage we have over the value of the whole future is condensed into the time leading up to the potential event.

Survival analysis is relevant to this problem because there are two cases: one where the event happens and after which nothing more can be done (e.g., some suffering-intense singleton emerges and persists throughout the future – a suffering risk, or s-risk) and one where the event doesn’t happen but may, as a result, still happen just a little later.

There is some survival function that describes our chances of survival beyond some point in time. It may be some form of exponential decay function. These have a mean, the expected survival duration, which may serve as a better point for comparisons of urgency than the earliest catastrophic event.

If we put off preventing catastrophes for so long that, by the time we do start working on them, the duration of the future we can expect to influence exceeds the duration of the future we can expect to survive, then it seems to me that we have waited too long. Conversely, if the expected end of our influence comes earlier than our demise, it might be better to focus first on more urgent causes or on causes that don’t have such a strong event character. But first we’ll need to learn more about the likely shape of the decay function of our influence.
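As a minimal sketch of this comparison, assume – purely for illustration – that both our survival and the persistence of our influence decay exponentially; all numbers below are invented:

```python
import math

# Sketch with invented numbers: exponential survival functions for
# civilization and for the durability of our influence. For
# S(t) = exp(-t / mean), "mean" is also the expected duration.

def survival(t, mean):
    """Probability that survival (or influence) lasts beyond time t."""
    return math.exp(-t / mean)

expected_survival = 300.0   # assumed: expected years until the catastrophe
expected_influence = 500.0  # assumed: expected years our influence persists

# The heuristic from the text: prevention work is not premature only while
# the future we can expect to influence is at least as long as the future
# we can expect to survive.
influence_outlasts_us = expected_influence > expected_survival
median_survival = expected_survival * math.log(2)  # half-life, ~208 years here
```

The mean of such a decay function, rather than the earliest possible catastrophe, is what the text suggests comparing.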

Further Research

It would also be interesting to investigate whether we can convert the survival function for an s-risk-type of catastrophe into an expected suffering distribution, which we could then compare against expected suffering distributions from suffering risks that don’t have event character. The process may be akin to multiplying the failure rate with the suffering in the case of the catastrophe, but since the failure rate is not a probability distribution, I don’t think it’s quite that easy.
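One reason it isn’t quite that easy: the failure (hazard) rate is indeed not a density, but the product of the hazard rate and the survival function is one. A sketch under a constant hazard rate, with all numbers invented:

```python
import math

# Sketch with invented numbers. The hazard rate lam is not a probability
# density, but f(t) = lam * exp(-lam * t) -- hazard times survival -- is,
# and it is f(t) that gets multiplied with the conditional suffering D(t).

lam = 0.01                 # assumed hazard rate of the catastrophe per year
def D(t):
    return 1e6             # assumed suffering if the catastrophe strikes at t

# Expected suffering = integral of f(t) * D(t) dt, here by a left Riemann sum.
dt = 0.1
expected_suffering = sum(
    lam * math.exp(-lam * i * dt) * D(i * dt) * dt
    for i in range(20_000)  # integrate far into the tail (2,000 years)
)
# With flat D this approaches D * (total event probability), i.e. ~1e6.
```

With a time-varying D(t), the same sum would weight each possible catastrophe time by how likely the event is to strike exactly then.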

Some of my considerations for comparing interventions are less well established than the scale, tractability, neglectedness framework for problems. Michael Dickens proposed a tentative framework for assessing the effectiveness of interventions, but it is somewhat dependent on the presence of feedback loops. Apart from that, however, it makes the important contribution of considering the strength of the evidence in addition to what the evidence says about the expected marginal impact.

I will propose a few heuristics other than historical impact for evaluating interventions. In each case it would be valuable not only to count the heuristic as satisfied or not but also to multiply in one’s confidence in that verdict. Michael Dickens, for example, generated likely background distributions of intervention cost-effectiveness to adjust his confidence in any particular estimate accordingly.

Confusingly, I’m listing feedback loops as one of the heuristics. But I don’t mean that we should necessarily draw on historical evidence when evaluating the cost-effectiveness of an intervention (we should do so whenever possible, of course) but that the presence of sufficiently short feedback loops will be very valuable for the people running the intervention. So if there’s a chance that they can draw on them, it’s a big plus for the intervention: it enables iterative improvements and reduces the risk of aimlessly updating on mere noise.

We can almost never measure the outcomes we care about most directly but almost always have to make do with proxies, so the reliability of these proxies is an important consideration.

There are two failure modes here: the desired change may be reversed at some point in the future, or it may have eventually happened anyway. For example, vegan and vegetarian societies have not been stable in the past, and valuable research may get done in academia only a little later than at an EA-funded institute. Another failure mode for many interventions is that some form of singleton may take control of all of the future, so that we can influence the future only by proxy, through influencing which singleton emerges.

I can see three possible futures at the moment: (1) the emergence of a singleton, a wildcard for now, (2) complete annihilation of sentient or just intelligent life, or (3) space colonization. I’ll mention in the following why I’m conflating some other scenarios into these three. (This is heavily informed by Lukas Gloor’s “Cause Prioritization for Downside-Focused Value Systems.”)

I’ve heard opinions that implied that a permanent singleton is possible and others that leaned toward thinking that value preservation under self-improvement is so hard that value drift is inevitable. Perhaps this path may enable us to create a wonderful populated utopia or attempts at it might lead to “near misses [that] end particularly badly.” Improving our strategy for preparing for this future seems very valuable to me.

A seeming singleton that turns out to be unstable after a while may be as deleterious for our ability to influence the future through any channel other than the singleton itself as an actual (permanent) singleton would be. Artificial general intelligence (AGI) may turn any sufficiently technologically advanced state that is not a singleton into a distractor state, in which case many of my thoughts in this text are moot. So this case is crucial to consider.

In the cases of human extinction or extinction of all life, it would be interesting to estimate the expected time it’ll take for another at least similarly intelligent species to emerge. Turchin and Denkenberger have attempted this, yielding a result on the order of 100 million years. Such a long delay may significantly reduce the maximal disvalue (and also value, for less suffering-focused value systems) of space colonization, because resources will be more thinly spread throughout the greatly expanded universe and thus more time-consuming to reach, among other factors. However, space colonization may still happen relatively quickly if there are other species within our reachable universe that are also just some millennia away from starting to colonize space.

But Lukas Gloor thinks “that large-scale catastrophes related to biorisk or nuclear war are quite likely (~80–90%) to merely delay space colonization in expectation,” with AGI misalignment and nanotechnology posing risks of a different category. So a wide range of seeming extinction risks may allow our civilization to recover eventually, at which point the future will still hold these three options.

Even if this repeats many times, it is likely that eventually one of these civilizations will either go completely extinct, form a singleton, or colonize space. The third option would eliminate many classes of extinction risks, the ones that are only as severe as they are because we’re physically clustered on only one planet. Extinction risks that only depend on communication will remain for much longer.

Risks from artificial intelligence will probably depend only on communication, so for space colonization to happen, it will have to turn out that artificial intelligence can be controlled permanently. This surely makes this scenario less likely than the previous two. (Or, more precisely, though such precise language may imply a level of confidence I don’t have: less than one third of the probability mass of the future may fall on this scenario.) Perhaps, though, we have a greater chance to influence this outcome, or influencing it may require different skill sets than influencing artificial intelligence.

But while humans may look out for one another, we have a worse track record when it comes to those who are slightly outside our culture (or involved in it only in a unidirectional way, where they can’t make demands of their own in “trades” with us). Michael Dello-Iacovo considers that insects may be spread to other planets (or human habitats in space) as a food source or even by accident. Eventually, farmed animals may follow, and they may return to being wild animals even when animal farming becomes obsolete through such technologies as cultured meat. People may even spread wild animals for aesthetic reasons. Finally, the expected capacity for disutility of any one bacterium may be small, but large numbers of them play vital roles in some proposed strategies for terraforming planets.

So the graphs in “Upward Trajectories” are inaccurate in the important way that suffering is likely to expand spherically as some cubic polynomial of the radius from our solar system – not linearly.

If we want to influence a future of space colonization positively, then it is all-important to make sure that our influence survives to the start of the space colonization era. (I will qualify this further in “Reaching the Next Solar System” below.) Then, longer survival of the influence becomes rapidly less important: Communities will likely cluster because of the lightspeed limit on communication latency, and if one of these communities loses our influence or would’ve reaped its benefits even without us, it’ll carry this disvalue or opportunity cost outward only into an increasingly narrow cone of uncolonized space.

I specify that I mean strategic robustness because I’ll mention something that I’ll refer to as ethical robustness later, but this strategic robustness is just what others mean when they just say “robustness.” Strategic robustness unfortunately overlaps with what I called durability, but I try to disentangle it by using robust to refer to strategies that are beneficial across many different possible futures while I use durable to refer to strategies that have a chance to survive until we colonize space, if it should come to that. Insofar as the second is included in the first, it is a small (but important) part of the concept, so I hope not to double-count too much.

In The Bulk of the Impact Iceberg, I argue that any interventions that observably produce the impact we care about rest on a foundation of endless amounts of preparatory work that is often forgotten by the time the impact gets generated by the final brick intervention. Some of this preparatory work may not get done automatically. This suggests that highly effective final brick interventions are few compared to equally or more effective preparatory interventions.

A particularly interesting question here is whether there are heuristics for determining where the most important bottlenecks in these trees of preparatory work are, because that is where we should invest most heavily. Studying advertising may be helpful for noticing and testing such heuristics. (More on that in the article linked above.)

I particularly highlight (and use as one such heuristic for now) research as a highly effective intervention because, if it is done right and studies something sufficiently plausible, it is valuable whether it succeeds or seemingly fails.

Feedback loops generate valuable information to improve an intervention incrementally. In this context of robustness, however, I’m referring to generated information that doesn’t only benefit the very intervention that generates it but is more widely beneficial. People working on human rights in North Korea, for example, may not be working on the most large-scale problem there is, and tractability and neglectedness may not make up for that, but they may gain skills in influencing politics and insights into how to prevent nigh-singletons like North Korea in the future.

Anything that we have already determined to be instrumentally convergent – i.e. something that a wide range of agents are likely to do no matter their ultimate goals – is convergent for reasons of its robustness. So all else equal, an intervention toward a convergent goal is a robust intervention. Gathering knowledge, intelligence, and power are examples of such robust capacity building.

Things that destroy option value are bad (all else equal):

- If you work on some intervention for a decade and then find that it is not the best investment of your time, it’s best if the work still helped you build transferable skills and contacts, which you’ll continue to profit from.
- If you never defect against other value systems, even when it would be useful to you, you don’t lose the option to continue cooperating with them or to intensify your cooperation.
- If you remain neutral or low-profile, you retain the ability to cooperate with a wide range of agents and can avoid controversy and opposition.
- If you avoid making powerful enemies, you can minimize risks to yourself or your organization. The individual or the organization is often a point of particular fragility in an intervention so long as relatively few people are working on it or they are fairly concentrated. (The failure of an organization, be it for completely circumstantial reasons, may also discourage others from founding another similar organization.)

Relatedly, something that I like to call “control” is useful because it implies higher option value: If you can change course quickly and at a low cost you have more control than if you could do the same but at a somewhat higher cost or more slowly.

Finally, robustness is nil if the risk of very bad counterfactually significant outcomes is too high.

Valuable work that is funded by money or time that would’ve otherwise benefited almost equally valuable work is not terribly valuable. Resource constraints are rarely so clear-cut, so we may never know whether it was great, good, eh, or bad for Rob Wiblin to switch to 80,000 Hours. But if you can achieve the same thing whether you’re funded by EAs or by a VC, then your choice of funding source will likely make a big difference to your counterfactual impact.

Even a problem whose solution is tractable may have an intervention addressing it that is hard to support, e.g., because you’d have to have a very particular skill set or the intervention is so far only a hypothetical one.

No two people have completely identical values, and often values are significantly different. Battling this out through force or wits has rather flat expected utility depending on the actors’ relative strengths, while cooperation lets them find Pareto-efficient points:

Depending on the shape of the frontier, this can be an extremely important consideration even for a risk-neutral agent with a single value system: to use the example from the article linked above, a particular deep ecologist may care equally about 8e6 species totalling 2e18 individuals and the same number of species totalling radically more individuals. If most of these individuals suffer badly enough throughout their lives, a suffering reducer may greatly prefer the relatively smaller number. A trade would be very valuable for the suffering reducer – if the two are roughly equally powerful, many times more valuable than a battle in expectation.

This consideration is also decisive for me because I empathize with many value systems, so I want to do the things that are good or neutral for all of them – what I refer to as ethical robustness.

Even if the frontier is such that the gains from trade are minor, a zero-sum game may look more like a battle than a race, so that time is irreversibly lost – time that could be used to avert suffering risks or preserve important suffering-reducing knowledge across existential catastrophes. More speculatively, insofar as our decisions are correlated with those of sufficiently similar agents who yet have different moral goals, we can determine the results of the different instantiations of the decision algorithm we share with them and thus get them to cooperate with our moral system too.

Agents with strong commitments to cooperation will still face hurdles to achieving that cooperation because they may not be able to communicate efficiently. They can apply heuristics such as splitting their investments into shared projects equally (see the “Coordination Issues” section in the linked article) or focusing on the issues that they know few others can focus on, thus capitalizing on their comparative advantage.

Altruists at government institutions or many foundations may face stronger requirements to make cases for their investment decisions that they can point to later, when something goes wrong or a project fails, to avoid being blamed personally. Private donors or the Open Philanthropy Project can thus cooperate with this group by supporting the types of projects that it can’t support.

The first space colonization mission is about to be launched but the team of meteorologists is split over whether the weather is suitable for the rocket launch. Eventually, the highest-ranking meteorologist decides that if their colleagues can’t agree then the weather situation is too uncertain to permit the launch.

A week later, the rocket launches, and it sets a precedent for others. Innovation accelerates, prices drop, and eventually humans expand beyond the solar system far out into the Milky Way, launching new missions in all directions by the minute. And all the while they take with them bacteria, insects, and many larger animals, many r-strategists among them. Unbeknownst to everyone, though, the launch would’ve been successful even the first time.

What influence does the one-week delay have one millennium later? Intuitively, I would think that it has close to no impact, but a very simple model is not enough to confirm that impression.

The chart assumes that I’m the meteorologist and that I’ve turned a world whose suffering would’ve looked like the blue line, t³, into the red-line world, (t-1)³. The yellow line is the difference in suffering between the worlds: 3t² - 3t + 1. That’s a ton, and rather than diminishing over time like my intuition intuited, it increases rapidly.

What decreases rapidly, however, is the size of the difference relative to the suffering that increases even more rapidly – the green dotted line: (3t² - 3t + 1) / (t - 1)³. So is it just this morally unimportant relative difference that caused my intuition to view our influence as something like an exponential decay function when really it will increase throughout the next millennia? Dan Ariely certainly thinks that we erroneously reason about money in relative rather than absolute terms, fretting about cents and dollars when buying sponges but lightly throwing around thousands of dollars when renovating a house or speculating on cryptocurrencies.
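A quick check of the algebra and of the two opposing trends – the absolute difference growing while the relative difference shrinks:

```python
# The blue world suffers t**3, the red (delayed) world (t - 1)**3.
suffering_blue = lambda t: t**3
suffering_red = lambda t: (t - 1)**3

relative_differences = []
for t in [10, 100, 1000]:
    absolute = suffering_blue(t) - suffering_red(t)
    assert absolute == 3 * t**2 - 3 * t + 1  # the yellow line
    relative_differences.append(absolute / suffering_red(t))  # the green line

# The absolute difference grows without bound, but the relative difference
# shrinks roughly like 3/t.
assert relative_differences[0] > relative_differences[1] > relative_differences[2]
```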

Or maybe the intuition is informed by cyclical developments.

Some overwhelming outside force may limit our potential absolutely at some point in time and force us back, and from that point onward, the suffering trajectory would only be conditional on the overwhelming force rather than our influence from the past.

Seasons, winter in particular, are an obvious example of cyclical forces like that, but in the context of space colonization, I can’t think of a reason for one just yet – especially once several solar systems have been colonized, with decades of communication delay between them and probably no exchange of commodities given how much longer those would have to travel.

But what about correlated forces rather than mysterious overwhelming outside forces?

Perhaps that week of delay saw one week of typical progress in charting space, searching for exoplanets, and advancing technology, so that the delay created one more week of “progress vacuum,” which eventually backfired by speeding up progress beyond the slope of the trajectory of the counterfactual world.

But the graph above is probably an exaggeration. When you reduce the supply by slowing some production process down, the demand reacts to that in complex ways due to various threshold effects, and the resulting cumulative elasticity factor is just that, a factor – a linear polynomial. If our influence is a quadratic polynomial, as surmised in the first diagram, then multiplying it by a small linear polynomial is not going to have a significant negative impact in the long term.

But might we face more mighty thresholds?

Now that’s one mighty threshold right there! Once we’re spread throughout space, it becomes harder to think of thresholds like that, because such a threshold would have to have an expected impact at least commensurate with a quadratic polynomial of time or radius. But so long as the radius is constant and we’re still down to earth, a conference may be all that’s needed: a research group may present some seminal paper at a conference – delay them by one week, and they’ll hurry up and have one more typo in the presentation, but they will still present their results to the world at the same moment.

In conclusion, I think, based on just these rough considerations, that the cyclical model will continue to lose relevance but that the correlated speed-up (or supply/demand elasticity) model and the threshold model – the first probably just a smoothed out version of the second – will continue to be highly relevant for as long as we’re still in one solar system or even on one planet.

Intentionally delaying things that others care about is a bit uncooperative. The intervention that I’m most excited about at the moment, making space colonization more humane for nonhumans, would instead aim to reduce the suffering footprint of space colonization without interfering with its rate – so, for example, 0.9t³ instead of the delay approach’s (t-1)³. The difference from the counterfactual is now a cubic polynomial itself!

For short time scales, the delay approach is still ahead:

But then the cubic polynomial of the improvement approach of course quickly catches up:
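Taking the text’s example curves at face value (time units are whatever the charts use), a quick search finds when the improvement approach’s gain, 0.1t³, overtakes the delay approach’s gain, 3t² - 3t + 1:

```python
# Gains relative to the counterfactual world t**3:
delay_gain = lambda t: 3 * t**2 - 3 * t + 1   # t**3 - (t - 1)**3
improvement_gain = lambda t: 0.1 * t**3       # t**3 - 0.9 * t**3

# First time step at which the cubic gain exceeds the quadratic one:
crossover = next(t for t in range(1, 1000) if improvement_gain(t) > delay_gain(t))
```

Under these assumed curves the crossover comes at t = 29, so for roughly thirty time steps the delay approach is still ahead.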

Now how can my intuition that impact decays still be true when we’re dealing with a polynomial that is yet another degree higher?

Thresholds would have to look different from papers presented at conferences to limit our impact in this scenario. A progress vacuum is also not as straightforward. But if we put in a lot of work to advance a suffering-minimizing technology for making space habitable and so make it the go-to technology by the time the exodus starts (see “Humane Space Colonization” below), we’re establishing it through some degree of path dependence. If it turns out that it wasn’t the most efficient technology, the path dependence may not be enough to lock it in permanently – just as it is imaginable that Colemak might eventually replace Qwerty. Or fundamental assumptions of the technology we locked in that way may cease to apply – more comparable to how Neuralink might eventually replace Qwerty. But in either case, our impact may only degrade to delay levels, not to zero.

But if our impact still decays and if the period until the cubic polynomial overtakes the quadratic one is suitable in length, then we might still be better off delaying. Perhaps there’s potential for a moral trade here: We’ll help you with your technological progress but in return you have to commit to long-term funding of our work to make it more humane.

Further Research

What are the strongest risks to our impact under the improvement approach? Can we build some sort of technological obsolescence model on the idea that assumptions follow a tree structure: technologies closer to the root make fewer assumptions – they are harder to invent, but there are fewer ways to make them obsolete – whereas the opposite is true for inventions closer to the leaves, perhaps in the shape of an exponential relationship? Such a factor might, in some way I haven’t quite fleshed out, still explain my decay intuition. When r-strategists reproduce, they might “multiply” exponentially within the Malthusian bounds. If we can hope to affect little of the future, this may be highly relevant. Even if not, might we be able to exert greater force on the polynomial than a delay does by influencing the factor in the exponent of such exponential but bounded growth?

Which of these models might best fit any observations that we can already make? To get at least some intuition for what we’re dealing with, I want to draw comparisons to some better-known phenomena.

Factual durability (does the thing itself persist?) vs. counterfactual durability (would the same thing have come about anyway?):

People
Factual: Plausibly more ephemeral (than concepts), because (1) they still die, and (2) their values may drift and their talents and passions change at a higher rate than they can promote them to others.
Counterfactual: Plausibly more ephemeral, because (1) the usual concerns with replaceability.

Companies
Factual: Plausibly more ephemeral, because (1) they fail, probably in most cases, for a host of reasons other than their core idea becoming obsolete, and (2) they are highly concrete and thus fragile. Plausibly more durable, because (1) they can adapt (change their nature) to survive – the idea is not their essence as it is for concepts.
Counterfactual: Plausibly more durable, because (1) they are unlikely to have been founded anyway, under the same name, had they not been founded when they were.

Languages
Factual: Plausibly more durable, because (1) they have higher path dependence than most concepts.
Counterfactual: Plausibly more durable, because (1) they are unlikely to have developed the same way anyway had they not developed when they did.

Cities
Factual: Plausibly more durable, because (1) they have higher path dependence than most concepts, and (2) they are often continuously nourished by something geographic, such as a river, which is probably unusually permanent.
Counterfactual: Plausibly more ephemeral, because (1) at least some places are so well suited for cities that if one hadn’t been founded there when it was, it would’ve been founded there a little later.

A human with stable values can work for highly effective short-term-focused charities, advocate for them, or earn-to-give for them for 50 years, which may serve as a lower bound for the durability of the influence we can plausibly have.

There are few companies that have survived more than a millennium (and those probably wouldn’t have if they didn’t inhabit small, stable geographic niches), and hoping to create one that scales and yet survives even a century is probably unreasonable, since large companies are not very old. A large company may be better positioned to grow old than a small one, but there’s probably not enough room for enough large companies for them to have the same outliers that there are among small companies.

Languages and cities still exert influence several millennia later, but because of the high probability that the oldest cities would’ve been founded anyway within short intervals because of their geography, languages are probably the stronger example. But they only have “extrapolatory power” for concepts that cause strong path dependence.

Language may have emerged some 50–100 thousand years ago, but the proto-languages that can still be somewhat reconstructed were spoken a mere 10,000 or possibly 18,000 years ago.

Further Research I haven’t found evidence for whether linguistic reconstruction errs more on the side of reconstructing noise – that a proto-language could’ve had many different forms with no systematic impact on today’s languages, so that attempts at its reconstruction yield a wide range of different results – or on the side of silence – that the reconstruction is not possible even though today’s languages would’ve been different had the proto-language been different. It may be interesting to weigh the factors that indicate either direction.

So we can perhaps hope for our influence to last between half a century and a couple thousand years.

If we want our influence to reach other solar systems – and we travel there in a waking state – we have to travel these, let’s say, 10 light years at a 100th to 1,000th of the speed of light. That’s some 3–30 times the 356,040 km/h that Helios 2 reached, probably the fastest human-made object in space to date. That doesn’t seem unattainable.
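As a sanity check on these numbers (the speed-of-light and Helios 2 figures are from the text; the arithmetic is mine):

```python
# Rough feasibility check: how do c/1000 and c/100 compare to the fastest
# probe to date, and how long would 10 light years take at those speeds?
C_KM_H = 299_792.458 * 3600        # speed of light in km/h
HELIOS_2_KM_H = 356_040            # figure quoted in the text

for frac in (1 / 1000, 1 / 100):
    speed_km_h = C_KM_H * frac
    ratio = speed_km_h / HELIOS_2_KM_H   # multiple of Helios 2's speed
    travel_years = 10 / frac             # 10 light years at frac * c
    print(f"{frac:.4f} c -> {ratio:.0f}x Helios 2, {travel_years:,.0f} years")
```

So the required speed-up over Helios 2 is a factor of roughly 3–30, while the trip itself still takes 1,000–10,000 years.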

Furthermore, in a sleeping or suspended (in the case of emulations) state the distance doesn’t matter. Such missions will probably be expensive, so they’ll need to be funded by many stakeholders that will have particular interests and goals, such as harnessing the energy of another sun. They’re more likely to fund a mission with suspended passengers who can’t change the goals of the mission when the stakeholders can no longer intervene.

Unfortunately, that doesn’t necessarily mean that the new ships will be launched soon and none of our influence will be lost during the travel. Stakeholders will aim to make a good trade-off between an earlier launch and higher speeds, because waiting for technological innovation in, say, propulsion systems may be worth it and result in an earlier arrival. So the launch may be delayed, and if earlier enterprises get their predictions wrong and innovation happens faster than they thought, then their ships may even be overtaken by later, more value-drifted ships.
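This launch-versus-wait trade-off can be sketched as a toy “wait calculation”: if attainable cruise speed improves at some annual rate, a ship launched in year t arrives at t + d/v(t), so there is an optimal launch year before which ships risk being overtaken by later ones. Every parameter below is a made-up assumption for illustration, not an estimate.

```python
# Toy "wait calculation": launch now at today's speed, or wait for faster ships?
# All parameters are illustrative assumptions.
DISTANCE_LY = 10.0   # target distance in light years
V0 = 0.001           # today's attainable speed, as a fraction of c
GROWTH = 0.02        # assumed annual improvement in attainable speed
V_MAX = 0.1          # assumed ceiling on attainable speed

def arrival_year(launch_year: float) -> float:
    # Speed available in the launch year, capped at the assumed ceiling.
    v = min(V0 * (1 + GROWTH) ** launch_year, V_MAX)
    return launch_year + DISTANCE_LY / v

best = min(range(0, 1000), key=arrival_year)
print(f"Launching in year 0 arrives in year {arrival_year(0):,.0f}")
print(f"Waiting until year {best} arrives in year {arrival_year(best):,.0f}")
```

Under these assumptions, a ship launched immediately arrives millennia after one launched a couple of centuries later – which is exactly the overtaking dynamic described above.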

The jump from colonizing our solar system (where communication is feasible) to colonizing others is huge: the diameter of our solar system is on the order of 0.001 light years – around 9 light hours – while one of the closest planetary systems, around the star Epsilon Eridani, is 10.5 light years away, and another, around the star Gliese 876, a full 15.3 light years. (Infeasibly far for light-speed communication with our solar system.) So the different levels of feasibility of communication mean that a range of existential risks will diminish much earlier than any reduction of the risk from value drift.

Maybe I’ll find the time to create a quantitative model for trading off short-term-focused against long-term-focused activities in view of the scenarios where space colonization either happens before any civilizational collapse occurs or where our influence survives one or more collapses.

Guesstimate has the added benefit that its limit of 5,000 samples introduces a quantization that, incidentally, protects against Pascal’s Mugging–type calculations unless you reload very often. That should make the conclusions we can draw from it more intuitive or intuitively correct.
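This effect is easy to quantify: an event with per-sample probability p appears in none of Guesstimate’s 5,000 samples with probability (1 − p)^5000, so sufficiently improbable payoffs are usually rounded away entirely. A minimal check, with probabilities chosen purely for illustration:

```python
# How likely is a 5,000-sample Monte Carlo run (Guesstimate's limit) to
# miss a mugging-style event entirely? Deterministic calculation.
SAMPLES = 5_000
for p in (1e-3, 1e-4, 1e-6):
    p_missed = (1 - p) ** SAMPLES   # chance the event shows up in no sample
    print(f"p = {p:g}: missed entirely with probability {p_missed:.3f}")
```

A one-in-a-million event is missed in about 99.5% of runs, so its enormous payoff simply never enters the estimate – the quantization at work.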

The model should take at least these factors into account:

- Our colonization of space can probably be modeled as a spherical expansion at some fraction of the speed of light.
- It might take into account that faster-than-light travel might be discovered.
- It should assume small fractions of the speed of light or otherwise take relativistic effects into account.
- It should take the expansion of space into account, which will put a limit on the maximum region of space that can be colonized – less than the Hubble volume.
- It should consider the number and perhaps the differences in density of solar systems.
- Since a large fraction of our time will be spent in transit, it may also be important to investigate whether the transit would likely be spent in a suspended or sleep state that should preserve all properties of the civilization or in an active state where the civilization continues to evolve.
- If we assume that there’s no faster-than-light communication, then civilizations will probably cluster or otherwise have little effect on each other. Solar systems may be a sensible choice for the expanse of such clusters because they also have an energy source.
- Whole brain emulation seems to me (mostly from considerations in Superintelligence) more quickly achievable than large-scale space colonization, so between already-colonized regions of space, communication and travel will become similar concepts. Emulation may also become possible through training models of yourself with Neuralink.
- It should include some sort of rate of decay of influence, perhaps modeled as a risk per unit time, resulting in a decay function such as exponential decay. The following “risks” should contribute to it, but separately, as I will argue below:
  - Decay due to destruction of the civilizational cluster. Crucial so long as we’re huddled together on earth; afterwards probably limited to planets more than to whole civilizational clusters.
  - Decay due to random “value” drift, though not limited to values. Likely to be relevant for whole civilizational clusters, so probably close to impossible to overcome for us today.
  - Decay of our influence due to independent, counterfactual discovery of our contribution. Ditto.
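A minimal numerical sketch of such a model – spherical expansion at a fraction of the speed of light combined with exponential decay of influence – could look like the following. Every parameter is a placeholder of my own choosing, not an estimate from the text.

```python
import math

# Toy model: influence = (colonized solar systems) x (probability our
# influence survived to time t), summed over time. All parameters are
# placeholders for illustration.
V_FRAC = 0.01          # expansion speed as a fraction of c
STAR_DENSITY = 0.004   # solar systems per cubic light year (rough local value)
DECAY_RATE = 1e-4      # assumed hazard rate of influence loss, per year
HORIZON = 100_000      # years to integrate over
STEP = 100             # integration step in years

total = 0.0
for t in range(0, HORIZON, STEP):
    radius_ly = V_FRAC * t                       # colonized radius in light years
    volume = (4 / 3) * math.pi * radius_ly ** 3  # colonized volume
    systems = STAR_DENSITY * volume              # reachable solar systems
    survival = math.exp(-DECAY_RATE * t)         # influence not yet decayed
    total += systems * survival * STEP           # influence-weighted system-years
print(f"~{total:.3g} influence-weighted system-years")
```

Even this crude version shows the key tension: the cubic growth of the colonized volume competes against the exponential decay of influence, so the decay rate dominates the result.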

I think it makes more sense for me to put effort into prioritizing between different broad “classes of interventions” than between individual charities, because if the charities are at all competent, they’re likely more competent at that than I am. And within each cause area, I should just be able to ask the charities which one of them I should support, or else they’d fail my cooperativeness criterion. (Except that I may not notice many defections.)

Precisely this “class of intervention” is something they’re probably somewhat locked in to, whether by bylaws, by comparative advantage through early specialization, or by some feature of the team’s ethical system.

So below I draw distinctions that are sometimes a bit artificial and then try to apply my heuristics in the form of pros and cons of each cause area and class of interventions. When an intervention has more cross-cutting benefits, this scheme breaks down quickly – moral circle expansion, for example, can be beneficial in how it influences futures without a singleton and in how it influences which singleton emerges, but the latter benefits are counted only in the “influencing the singleton” section. I think the usefulness of this section doesn’t go much beyond a mere brainstorming.

And note that it doesn’t make sense to simply tally the pros and cons below, because their weights are vastly different, my confidences in them are vastly different, they’re highly correlated, they’re just questions, or some combination of these.

My heuristics don’t help much to evaluate this one until I have more clarity on how to convert a hazard function (failure rate) into an expected suffering distribution. I continue to see immense variance in this whole future scenario because of the wide range of different worlds we might get locked in to.

Problem:
- Scale or upward trajectories: See WASR below. The right singleton may prevent or limit it. Conversely, the wrong singleton may create enormous suffering in other forms.
- Neglectedness: Still substantial.
- Tractability: Very hard to get right, though Lukas Gloor guesses that the chances are macroscopic.

Intervention:
- Durability: Maximal, ipso facto.
- Urgency: Plausibly most urgent. In particular, strong AGI may come before space colonization.
- Robustness: Even if it doesn’t, the absence of a singleton (through AGI) in the face of advanced technology may even generally be an attractor state. In that case, space colonization is unlikely to proceed fast enough to prevent an AGI from quickly controlling all (up to that point) colonized space. (More on that in the section “Ideas for a Model.”)
- Supportability: I know too little about how suffering-focused various organizations are, and I feel like the space mostly needs people with very specific and rare skill sets.
- Feedback loops: Unlikely or distant at the moment, and there don’t seem to be reliable proxies.
- Robustness: If Robin Hanson is to be believed, then futures are possible in which AGI can be controlled (and no singleton emerges) even if the alignment problem remains unsolved.

Problem:
- Scale or upward trajectory: WAS is likely to explode upon the colonization of space, and perhaps there are interventions to dampen this exacerbation.
- Tractability: Welfare biology may well be at a turning point similar to the one medicine reached in the early 19th century or so.
- Neglectedness: So far, I see only two or three small organizations working on the problem.
- Upward trajectory questionable: Space colonization may happen late enough that a singleton (which will likely not care about spreading biological life in the first place) can meanwhile obviate my reasons to suspect great scale.

Intervention:
- Foundational: WAS Research does research and Animal Ethics encourages it, so they fall within the category of the most tractable type of preparatory work I’ve identified.
- Counterfactually durable: It addresses something that is unlikely to be addressed anyway any time soon. In fact, even their and Animal Ethics’ efforts to get more academics on board are progressing slowly.
- Factually durable: Knowledge, as opposed to values, may survive some existential catastrophes for long enough to be recovered by the new civilization once it’s no longer busy with survival only.
- Not terribly durable: The last pro may be a weak one, because even if the knowledge survives catastrophes and societal value drift, it’ll still take a highly developed civilization to make it likely that it’ll be put to use again.
- Invites defection: It is tempting to defect against Quiverfull-type “maximization of life” value systems, as evident in my uncharitable and flawed comparison. Conscientious cooperators can probably avoid this.
- Robustness varies: Though research promises some robustness, it could be greater if it weren’t focused on something as specific as ecology. Practice in statistics, for example, would be more transferable for the researchers than practice in ecology, and the results, negative or positive, could be informative for a wider range of altruists.

Movement building so far – probably highly effective but also fairly well funded – has aimed to grow the movement or to grow its capacity, the one aiming for greater numbers, the other for greater influence, better coordination, fewer mistakes, etc. But the destruction of effective altruism would be a bad outcome according to even more value systems than an existential catastrophe would be, so it would pay to invest heavily in reducing the probability of a permanent collapse of the movement.

Avenues to achieving this may include actual protection of the people – safety nets have been discussed and tried – and preservation of knowledge and values. Texts may be sealed into “time capsules,” but they need to be the right texts. Instructions that require a high level of technology may be useless for a long time after a catastrophe, and Lukas Gloor also notes that today’s morally persuasive texts were written for a particular audience – us. Read by a different audience in a very different world, they may have different effects from what we hoped for or probably none at all. So such texts may need to be written specifically to be as timeless as possible.

Problem:
- Tractability: The tractability of EA may be higher in a worse world – the world after an existential catastrophe – because there is more to improve and so also more easily improvable things.
- Tractability: Conversely, the tractability of EA may also be lower in a worse world, because many highly effective interventions rely on nonparochial work, which depends on technology and infrastructure for communication, travel, and transport.

Intervention:
- Durability: An intervention – probably research for now – would directly address the question of the durability of EA. Cases for the effectiveness of facilitating EA abound, so I won’t reproduce them here. (The raison d’être of 80,000 Hours, Giving What We Can, the Centre for Effective Altruism, and others.)
- Robustness: The generality of EA lends interventions in this space exquisite robustness in terms of fundamentality, instrumental convergence, and option value.
- Cooperation: EAs are probably one of the groups best positioned to understand and adhere to cooperation.
- Common sense: Movement building is considered a high priority – a high prior for estimating the value of movement preservation.
- Robustness: Not really a con, just not a pro: the research is probably only as good as the instrumental goal it tries to achieve, so it may be double counting to list the robustness of research as a separate pro.
- Feedback loops: Small-scale catastrophes may allow for some feedback mechanism, but they’re hopefully few and far between.
- Supportability: I don’t know of anyone who’s currently working on this.

Problem:
- Tractability: Comparatively excellent, insofar as we can tell from the decent feedback loops.

Intervention:
- Robustness: Excellent (see above). Instrumental convergence and fundamentality are given ipso facto, and I’m also happy with how well organizations like CEA, EAF, and RC have budgeted option value.
- Feedback loops, control, option value: Decent. (But this overlaps with the first point.)
- Cooperation: Seems to be going better than in at least some other spaces.
- Durability: I don’t know how likely EA is to re-emerge if it gets lost – that is, how often it would be reinvented anyway – but otherwise it seems very fragile with regard to both existential catastrophes and mundane value drift.

Problem:
- Scale: Maybe only relevant so long as humans continue to exist in carbon-based form.

Intervention:
- Fairly durable: Compared to values-spreading approaches to antispeciesism, cultured meat seems more promising on this front, but the cons in the same category may be more interesting.
- Fundamentality: Research, again, promises some fundamentality by its nature.
- Counterfactual of funding: The research can be funded through for-profit startups.
- Cooperation: With most value systems at least. Classic utilitarians who bite the bullet of the Repugnant Conclusion may mourn the decimation of cows farmed for beef, who are said to have net positive lives.
- Robustness: Research in biochemistry is again rather specific research. (See wild animal suffering research above.)
- Exposed to many existential risks: The very advanced technology necessary to culture meat is likely to remain first inaccessible and then prohibitively expensive for a long time after a civilization-destroying event. During those centuries or millennia, the knowledge of how to produce it may get lost. Or, especially if the research is funded through for-profits, the public perception of cultured meat may suffer like that of genetic engineering, even without catastrophic events.
- Counterfactual durability: I’ve heard opinions that cultured meat is the sort of thing that would necessarily be invented anyway due to the inefficiency of traditional animal farming.

I used to call this section “antispeciesist values spreading” but Jacy Reese draws the line more widely, so I’ll go with his reference class and name for it. I have trouble applying my heuristics here because of how meta the intervention is. Jacy’s article should be more enlightening.

Problem:
- Scale: All-encompassing.
- Scale: Possibly cut short by a singleton.

Intervention:
- Robustness: Research (as conducted by the Sentience Institute) should be fairly fundamental and widely relevant; the intervention is unlikely to backfire; and it’s so meta that similar considerations apply as for movement building and preservation above.
- Durability: There are probably modes of MCE that are optimized for durability (maybe humane space colonization could fall into this category), but more generally MCE should be susceptible to all the same risks that EA movement building is exposed to.

This is what I’m most excited about at the moment because it promises to be a very strong contender for most effective long-term intervention – but much more research is needed.

The idea is that SpaceX and others may start shooting people into space in a few decades and may start to put them on Mars too. When that time comes, they’ll be looking for technologies that’ll allow people to survive in these environments permanently. They’ll probably have a range of options, and will choose the one that is most feasible, cheap, or expedient. There may even be a path dependence effect whereby the proven safety of this one technology and the advanced state of its development make it hard for other technologies to attract any attention or funding.

This may not be the technology that would’ve minimized animal suffering, though. So to increase the chances that the technology that ends up being used – and that perhaps sees some lock-in – is the one we think is most likely to minimize animal suffering, we need to invest in differential technological development, such that by the time SpaceX and company need the technology, the most feasible, cheap, and expedient option coincides with the one that minimizes animal suffering.

A social enterprise that aims to achieve this could be bootstrapped on the basis of vertical agriculture, greenhouse agriculture, and zero-waste housing technologies and then use its know-how and general capacity to research low-suffering technologies for making space habitable.

Problem:
- Scale or upward trajectories: See Wild Animal Suffering Research.
- Neglectedness: No one is working on it specifically as far as I know. Two or three organizations are working on welfare biology in general.
- Tractability: Unknown.

Intervention:
- Counterfactual of funding: If it can be run as a social enterprise, the funding will have exceptionally low counterfactual value.
- Durability: Highly likely to influence space colonization scenarios in the long term.
- Robustness: Acceptable, because greenhouse agriculture may also be a way to cut down on insect suffering in the short term already.
- Urgency: High, given how much research is probably necessary, but see the con below.
- Cooperation: Most people are not, in my experience, astronomical waste–type consistent classic utilitarians, so most people would probably be neutral toward the project, and many may even welcome it without being explicit, agenty suffering reducers.
- Counterfactually durable: See Wild Animal Suffering Research.
- Feedback loops: Maybe, thanks to deserts and the like on earth.
- Robustness: Possibly relevant for singleton scenarios, but unlikely. Also less robust than the most robust research WASR could do, because the corporate research will often have other, overriding priorities.
- Urgency: Plausibly lower than singleton-affecting work if Michael Dello-Iacovo is right.

This section is only interesting for those who would like to know why I went through the developments that I did since 2015 – e.g., because you may find yourself in a situation similar to mine back then. Everyone else can feel free to skip it.

In 2014, I started out as a fan of GiveWell’s and Animal Charity Evaluators’ top charities. (See my previous post of this sort.) But these two organizations have the strength and limitation that you don’t need to trust them (or any subjective judgment calls of theirs), because they aim to draw only on hard, objective evidence that can be published in a review. (When they do need to make judgment calls, they make them transparent, so that you can substitute your own and see how that changes the result.)

The best evidence comes from trying things and seeing what works – from feedback loops. But in some cases feedback loops can be very long; the feedback can be the absence of a catastrophe, which is often difficult to know without knowledge of the counterfactual without the intervention; or the intervention can be so fundamental that it is easily forgotten by the time the feedback comes in. Highly effective interventions can fall into one or several of these categories. A search for relatively safe and reliable ways of doing good would have to exclude many of them.

But what I care about terminally is to maximize my expected impact in reducing suffering, not to minimize the variance of my estimates of it.

The organizations that provide sufficient evidence (either by implementing a well-proven intervention, like AMF, or by running trials on their own interventions, like GiveDirectly) are few and limited to areas that are fairly easy to study.

Michael Dickens argues that the background distribution of cost-effectiveness over interventions is likely to be log-normal or Pareto. The heavier tail of Pareto distributions (or maybe power-law distributions in general) makes it seem like the more likely choice to me.
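The difference in tails is easy to see by comparing survival functions P(X > x) for the two candidate distributions. The parameters below are arbitrary illustrations, roughly matched in scale; they are my assumptions, not figures from Dickens’s argument:

```python
import math

# Survival functions P(X > x) for a log-normal and a Pareto distribution.
# Parameters are arbitrary, roughly matched in typical scale.
MU, SIGMA = 0.0, 1.0     # log-normal parameters (median = 1)
X_M, ALPHA = 1.0, 1.5    # Pareto scale and shape

def lognormal_sf(x: float) -> float:
    # P(X > x) via the complementary error function.
    return 0.5 * math.erfc((math.log(x) - MU) / (SIGMA * math.sqrt(2)))

def pareto_sf(x: float) -> float:
    return (X_M / x) ** ALPHA

for x in (10.0, 100.0, 1000.0):
    print(f"x = {x:>6}: lognormal {lognormal_sf(x):.2e}  pareto {pareto_sf(x):.2e}")
```

With these parameters the Pareto assigns orders of magnitude more probability to extreme outliers, which is what makes it the more consequential assumption for prioritization: under a Pareto background distribution, finding the outliers matters far more.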

If you have a heuristic that can find some of the most cost-effective interventions (such as the scale, tractability, neglectedness one), but you first draw a sample from the search space, you’ll already lose some of the interventions that the heuristic could’ve recognized.

But, what is worse, the sampling is not random but probably correlated with lower scale and lower neglectedness (though perhaps not with tractability, or in some cases even negatively).

In order to accrue evidence of effectiveness, interventions need to have short feedback cycles. Studies run for a few years or decades, and even natural quasi-experiments rely on such cycles in the sense that events in a plausible reference class must already have happened – and recently enough that there’s data on them and that the study retains some external validity for today.

But the future is long, so the interventions that have the greatest impact are ones that realize this impact throughout as long stretches of the future as possible. In order for such a maximally effective intervention to have strong evidence of its effectiveness, it needs to have short feedback cycles and great positive impact far beyond those cycles. That is something that radically fewer interventions are likely to achieve than just having great positive impact in the long term.

Perhaps short-term effectiveness is highly correlated with long-term effectiveness, but I’m not aware of evidence for that. I’d expect the world to change too much for the training benefits from short feedback loops to remain valuable in the long term. On the contrary, high confidence in principles that used to warrant it but no longer do is anecdotally hard to overcome. If you know of any good reasons to suspect a correlation, please comment.

Assuming a low correlation, what we’ll get from selecting for feedback loops is likely to be interventions that are highly effective in the long term at most at the background rate, so we’ve excluded both clearly ineffective and potentially highly effective interventions. There are many more of the former, if we’re right about the background distribution, so we’ve excluded more of them; but if the interventions span orders of magnitude in cost-effectiveness, then some sort of impact-adjusted rate may make up for that.

What I mentioned above in reference to The Bulk of the Impact Iceberg is another important factor that leads to underinvestment in some types of preparatory work.

Another consideration that I’m highly unsure about is that additional constraints in for-profit investing usually come at a premium, so adding more constraints (short, recent feedback cycles, well-studied by academia, no conflicts of interest, etc.) may also come at a price in impact.

In my post on Expected Utility Auctions I argue that if you have a large set of highly effective interventions of which, say, 1% have strong evidence behind them and get singled out by evaluators for that reason, altruists will all consider this particular 1% but all the remaining 99% will only receive a random allocation of the left-over attention. With so much attention converging on this 1%, it’ll become greatly less neglected than the remaining 99%.

This may be offset by larger funding gaps – i.e., more slowly dropping marginal utility of money – but I don’t know of evidence that this is the case, and where it is the case, it may be due to the very lack of attention I want to address, e.g., because the lack of attention from potential hires makes it harder for a charity implementing such an intervention to scale up. Moreover, if the marginal cost-effectiveness of one intervention drops more slowly than that of another but the second intervention is much more cost-effective on average, it’ll take a while for the second to drop to the level of the first.
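The last point can be illustrated with made-up marginal cost-effectiveness curves: intervention B starts out ten times as cost-effective as A but its returns diminish faster, and it still absorbs a lot of funding before dropping to A’s level. All numbers here are illustrative assumptions:

```python
import math

# Made-up marginal cost-effectiveness curves (utility per dollar) for two
# hypothetical interventions; all numbers are illustrative assumptions.
def marginal_a(spend: float) -> float:
    return 10 * math.exp(-spend / 100e6)    # drops slowly

def marginal_b(spend: float) -> float:
    return 100 * math.exp(-spend / 10e6)    # 10x as effective, drops fast

# Find the spend at which B's marginal value falls to A's starting level.
spend = 0.0
while marginal_b(spend) > marginal_a(0):
    spend += 1e5
print(f"B stays above A's initial level for about ${spend / 1e6:.0f} million")
```

Under these assumptions, B remains the better marginal target for tens of millions of dollars despite its faster diminishing returns – so average cost-effectiveness can dominate for realistic funding levels.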

Moreover, the individual charities implementing an intervention need to have funding gaps sufficiently large to warrant the work that goes into evaluating them, where such evaluation is possible at all. That also excludes some small organizations, though evaluators are now working on pipelines to enable such organizations to bootstrap their way up.

The higher-quality evidence comes at a monetary price, too, so interventions for which the millions or so required to create such evidence have already been paid should tend to be fairly well funded already.

Such evidence also enables grantors that are accountable to many stakeholders – not only GiveWell – to make safe investments. Any donor who is not exposed to such accountability pressures should use their comparative advantage to support the interventions that governments and some foundations would not be able to support.

This piece benefitted from comments and support by Anami Nguyen, Lukas Gloor, Martin Janser, Naoki Peter, and Michal Pokorný. Thank you.