Transcript

Robert Wiblin: Hi listeners, this is the 80,000 Hours Podcast, where each week we have an unusually in-depth conversation about one of the world’s most pressing problems and how you can use your career to solve it. I’m Rob Wiblin, Director of Research at 80,000 Hours.

Today’s interview with Professor Philip Tetlock was recorded last week at Effective Altruism Global San Francisco.

We last interviewed Tetlock back in November 2017 for episode 15. That’s a great episode which I recommend going back and listening to, as it will give you more context, but doing so isn’t necessary to make sense of today’s conversation.

Philip is the Annenberg University Professor at the University of Pennsylvania, a legendary social scientist, and a personal hero of mine.

He has spent the better part of 40 years collecting forecasts about the future from tens of thousands of people, to try to figure out how accurately people can predict the future, and what sorts of thinking styles allow people to do the best job of it.

He was co-principal investigator of The Good Judgment Project, a many-year study of the feasibility of improving the accuracy of probability judgments of high-stakes, real-world events.

His research has resulted in over 200 articles in peer-reviewed journals and two books: Superforecasting: The Art and Science of Prediction, and Expert Political Judgment: How Good Is It? How Can We Know?.

Why go and interview Philip a second time?

Firstly, in 2017 I only got an hour with him, and I simply had a lot more to ask.

Second, I believe Philip’s work is the sine qua non of being rational and having good judgement, and as a result it’s relevant to everyone no matter what they’re doing.

Making accurate predictions is essential for good decision-making in your own life. For instance, if you can’t predict your probability of success in different career paths, you’re going to find it really hard to choose between them.

And that’s just a big-picture example. Correctly assessing the probability of things is important everywhere in life on an hourly basis: deciding whether to call your bank to try to resolve a problem, whether to apologise to someone you had a fight with, or what to put in your suitcase for a trip overseas.

Third, we’re especially interested in what impact advanced AI might have on the world.

For that, it’s really useful to know what capabilities AI will have at different times. Needless to say, that’s exceedingly hard, but the lessons from Tetlock’s work give us as good a shot as possible of producing sensible estimates.

Fourth, improving judgement and foresight seem especially important for longtermists like me who want to make the world better not just today but for the thousands of generations who may be yet to come. It can be hard to figure out what things we can change about the world now that will consistently point it in the right direction over hundreds or thousands of years, but improving humanity’s capability to correctly foresee the effect of our actions seems like a great guess for something that will help.

As a result many people believe that this is among the most promising broad interventions to positively shape the long-term future.

Fifth, people who love Philip’s research also tend to love 80,000 Hours. So if you’ve just been tempted to tune in for the first time for this interview, welcome to the party — go check out our website and hopefully you’ll learn a lot of things you’ll find useful.

Sixth and finally, our biggest donor, the Open Philanthropy Project, has also funded the creation of a tool that helps you better calibrate your probability estimates, and hopefully thereby make better decisions. We’ve put that on our website, and will link to it from the show notes.

Alright, just before we get to that, as I said, this interview was recorded recently at Effective Altruism Global San Francisco where Philip was a speaker.

Effective Altruism Global is the conference for people interested in using evidence and careful analysis to do as much good as possible. If you enjoy this conversation, maybe you should get yourself along to one of these events. The next big one is in London, the weekend of the 18th to 20th of October.

For antipodeans like me there’s a smaller one coming up in Sydney on the weekend of the 28th and 29th of September.

You can find out more about both of those at eaglobal.org.

Alright, that was a bit of ado there, but without any more of it, here’s Philip Tetlock.

Robert Wiblin: Thanks for returning to the podcast, Philip.

Philip Tetlock: Well, thank you.

Robert Wiblin: So we plan to talk about new results in forecasting research, but first, what are you working on right at the moment and why do you think it’s important work?

Philip Tetlock: At this very moment, I am working on what you might think of as the opposite of forecasting. I’m working on backward reasoning in time as opposed to forward reasoning in time.

Robert Wiblin: What does that look like?

Philip Tetlock: Well, it looks like what people in the research literature call counterfactuals. What would have happened if history had taken a different turn at various points.

Robert Wiblin: So this is the tournament involving Civilization V, is that right?

Philip Tetlock: Well, the reason for starting the research in simulated worlds as opposed to the real world is because historical counterfactuals in the real world are unknowable. Historical counterfactuals are a source of almost endless ideological friction and debate. When we started the forecasting tournaments, there was a huge debate, for example, about the role of the Reagan administration and its tactics for dealing with the Soviet Union, and whether Reagan was bringing us closer to a nuclear war or actually bringing us closer to world peace. It was very polarizing, and people disagreed completely on where things were going. And after the fact, when the outcomes were known, everybody claimed to be able to explain what happened. So even though their expectations were very different, everybody wound up in a place where they felt comfortable with their preconceptions. Conservatives felt that Reagan had won the Cold War, and liberals felt that the Cold War would have ended pretty much the way it did without Reagan, with a two-term Carter presidency and a Mondale follow-up.

Robert Wiblin: So in this tournament you’re setting people up in particular situations in Civilization V, this famous computer game, and then I guess changing the scenario a little bit and then getting people to forecast what will happen and seeing how accurately they can forecast what would have happened if the starting conditions had been a little bit different?

Philip Tetlock: That’s right. You’re able to do something in the simulated world you can’t do in the real world. You can go back in time and say, “Well, what if something different had happened at turn 100? How would the various aspects of the world have changed?” Whether you change President Reagan or President Trump, or the magnitude of the recession in 2008, or whether Bernanke was leading the Federal Reserve, you’ve got a long list of things you can change in economics or in politics or in military affairs, and people just deeply disagree about these things, and they can disagree forever, because nobody can go back in a time machine, rerun history, and see what would have happened. So the peculiar thing in the real world is how comfortable we are making pretty strong factual claims that turn out on close inspection to be counterfactual. Every time you claim you know whether someone was a good or a bad president, or whether someone made a good or bad policy decision, you’re implicitly making claims about how the world would have unfolded in an alternative universe to which you have no empirical access; you have only your imagination.

Robert Wiblin: So do we have any existing research on how good people are at counterfactual reasoning, or has the fact that we can’t go down these alternative histories basically meant that people haven’t been able to research this?

Philip Tetlock: Well, what we know about counterfactual reasoning in the real world is that it’s very ideologically self-serving. People pretty much invent counterfactual scenarios that are convenient and prop up their preconceptions. So for conservatives, it’s pretty much self-evident that without Reagan, the Cold War would have continued and might well have gotten much worse, because the Soviets would’ve seen weakness and pushed further. And for liberals it was pretty obvious that the Soviet Union was economically collapsing, that things would have happened pretty much the way they did, and that Reagan managed to waste hundreds of billions of dollars in unnecessary defense expenditures. So you get these polar opposite positions that people can entrench themselves in indefinitely.

Robert Wiblin: Yes, because they’ll never know. So it’s like you have a free hand to be particularly ideological about these cases.

Philip Tetlock: Right. It’s as if you’re doing clinical trials in medicine and you get to make up the data in the control group. You never had to actually run the control group, you just say, “Let’s just make up the data and, lo and behold, all of our treatments are working.”

Robert Wiblin: So to get a large enough sample to figure out how accurately people can assess counterfactual outcomes or how well they can do the comparison I guess, do you have to have hundreds of people playing Civilization V for many years and then making lots of predictions about different scenarios? What’s the scale of the enterprise here?

Philip Tetlock: Well, that would certainly be one way of doing it. I think that the research sponsor, IARPA, is not quite that patient; I think they would like to see a steeper rising learning curve, and probably a smaller number of participants generating the learning. My hunch is that to do really well in this game is going to require a mixture of content knowledge. We need people who are really good at playing Civ5, who have strategic savvy, but also people who are aware that strategic savvy doesn’t readily translate into forecasting skill. We’ve certainly found that there are some people who know a lot about Civ5 who don’t necessarily make very accurate forecasts. And there are other people who don’t know that much about Civ5 but know some pretty simple statistical rules and do reasonably well. But I don’t think we’ve reached the optimal performance frontier by any means; I think we’re lagging. So if people have knowledge of Civ5 and would be willing to donate 20 to 40 hours of time playing in a pretty intense research competition in the middle of August, in return for compensation that is modest but not trivial, then they should feel free to contact me.

Robert Wiblin: Because normally people don’t get compensated at all for playing computer games, so I think that’s a step up.

Philip Tetlock: Right. Well, they’re not playing a computer game, they’re actually doing forecasting. They’re watching artificial intelligence agents that represent civilizations playing a computer game, and then the question is how skillfully can they make sense of that particular run of history?

Robert Wiblin: I see. So the humans don’t actually play the game, they just look at the scenarios and try to predict who’s going to win or what outcomes will happen? And I suppose that speeds things up a lot because the AIs can play much faster than people can.

Philip Tetlock: That’s correct. What you see is what they call a world report: you get to see how the game unfolded, and the question is, how deep an understanding can you extract from seeing that history? How deep an understanding of the causal principles driving the game?

Robert Wiblin: Yeah. I was surprised when you said that there are people who are good at playing Civ5, good at winning, but not terribly good at making forecasts, because it seems like in order to be good at playing the game, don’t you have to be good at making forecasts about what will happen in the game if you take different actions? It seems like they’re almost the same skill.

Philip Tetlock: I think that’s a great question, and I was working with more or less that assumption myself. But it seems that for the counterfactual questions that are being posed in a simulation as complex as Civ5, where the combinatorics are staggering and the number of possible states of civilizations and variables is probably greater than the number of atoms in the universe, even very skilled Civ5 players will have serious blind spots that can be exploited by clever question posers.

Robert Wiblin: It’s interesting, because it’s possible that they have some kind of gut intuition about which action is going to be best, but for other actions that they’re not seriously considering, they don’t have the trained intuition to handle those very well.

Philip Tetlock: We don’t have a detailed enough understanding yet of what exactly is going wrong. We think that the winning model is going to be human beings… in this competition, by the way, machine learning is not allowed. We’re well aware, and IARPA is well aware, that you could just put machine learning to the task: it could run the game millions of times, you could put AlphaZero in there, and Demis Hassabis is going to be the world champion of Civ5 as well as countless other games. So we’re aware of that; that’s why the range of research we’re allowed to do on Civ5 is so restricted. We’re only allowed to look at correlations, essentially; we’re not allowed to do the counterfactual thought experiments ourselves. In a sense they’re putting us in the same position of ignorance that actual intelligence analysts would be in when they’re trying to make sense of the world.

Philip Tetlock: When intelligence analysts are doing a postmortem on policy toward Iraq or Iran or any other part of the world, they can’t go back in history and rerun it; they have to try to figure out what would have happened from the clues that are available. And those clues are a mixture of things: some of them are going to be beliefs about causation, the personalities and capacities of individuals and organizations; others are going to be more statistical, economic time series and things like that. So it’s going to be a real challenge. This is research in progress, so inevitably I have to be tentative and I’m speculating, but I’m guessing we’re looking for hybrid thinkers. We’re looking for thinkers who are comfortable with statistical reasoning, but also have a good deal of strategic savvy, and also recognize that, oh, maybe my strategic savvy isn’t quite as savvy as I think it is, so I have to be careful.

Robert Wiblin: Do you have any view at this point on how likely we are to be able to generalize from Civilization V to the real world?

Philip Tetlock: Well, that’s the really big question because I think it’s safe to assume that the US Intelligence Community is not all that curious about who’s better at forecasting in the Civ5 computer games, I think the hope is that if you can identify methods of enhancing accuracy, enhancing the performance of human teams making sense of Civ5 games, that those methods will transfer to better performance in the real world. Now how exactly you make that inferential leap, that’s a complicated question.

Robert Wiblin: Can I venture to suggest that it actually might not be too bad? Although the Civilization gameboard is a very simplified version of things that are actually going on in the real world, to some extent when we try to do forecasts we have to create a simplified schema in our own minds, one that might well end up resembling something like the board in Civ5, and try to map out the plays that people can make. We can’t model the full complexity of the world; all we can model is something on the level of the simplicity of Civ5. So even if the real world is different, it’s going to be not so dissimilar from how we actually try to make forecasts.

Philip Tetlock: I think that’s fair. I think the Civ5 world has three basic similarities with the real world. One is the complexity of causation. You’ve got many variables influencing many other variables, you have negative and positive feedback loops among variables, and you’ve got interactive causation; complexity of causation is a big similarity. Another similarity is path dependency: once you’ve gone down a certain path, you can’t go back. There are certain categories of effects which compound once you’ve made certain moves; some events are irreversible in their consequences, or extremely difficult to reverse. And then finally, randomness or stochasticity. The artificial intelligence agents have a certain amount of randomness built into their play, and that’s probably an essential property to prevent them from being very easily exploitable, so there may be some game theory sense underlying that. At any rate, we don’t know how much randomness there is woven into the AIs, because we’re not allowed to look at the programming code.

Philip Tetlock: I mean, we’re not… and so these are all-

Robert Wiblin: So even you are not allowed to-

Philip Tetlock: So just as we don’t know how much randomness there is in our world, we also don’t know how much randomness there is in Civ5. All these things make Civ5… when you set the ground rules right, and I think the research sponsors have done a good job of setting up the ground rules in a sensible way, you do put the observers of Civ5 in a position of ignorance somewhat similar to the position of ignorance that real-world analysts are in. The big difference is that in Civ5 there is a knowable ground truth: you can know what happens in the what-if worlds. We just don’t get to see it until they give us the feedback on whether we’re right or wrong.

Robert Wiblin: Yeah. Do the forecasters get to see the entire gameboard or just the fraction that one player would be able to see?

Philip Tetlock: They get to see the entire gameboard.

Robert Wiblin: Okay, interesting. So it’s a bit like they’ve got access to cable news, they’ve got access to satellite data, that kind of thing.

Philip Tetlock: Yes, they do, and the whole thing is complete when they see it. They see the game, it’s all there to be read, and you can see turn zero to 500 or-

Robert Wiblin: Yeah. Let’s move on from the Civ5 competition though hopefully there’ll be some Civ5 addicts out there who can get in touch and potentially participate.

Philip Tetlock: I really hope there are some. It’s a serious cognitive challenge. It’s something that I don’t think anybody has ever done before: trying to improve counterfactual reasoning in a simulation, in the hope that we can make that stick later on in the real world.

Robert Wiblin: Yeah. I think listeners here have a pretty high demand for cognition, so might be well suited to it.

Robert Wiblin: If you know Civ5 really well and are interested in spending many hours seriously testing your subjective-probability forecasting skills you can fill out a sign-up form which you can get to at 80k.link/civ. We’ll also link to it in the show notes and the associated blog post.

Robert Wiblin: When we last spoke about 18 months ago, you were just launching this hybrid forecasting competition, which I think aimed to pair algorithms with human forecasters and see how well they did, and you were looking for people to participate. How has that one gone? Are there any early achievements or findings coming out of that research?

Philip Tetlock: Oh, well, that wasn’t my competition. That was an exercise I was helping IARPA to recruit people for, and I know that there are some interesting results. I’m not sure how much IARPA wants those to be talked about right now, so I should be careful. I would simply say that I think it’s been very difficult for algorithms to get a lot of traction on the kinds of questions that IARPA is interested in asking about. So now we are talking about using machine learning: what are the types of questions in the real world where machine learning becomes useful, and what are the types of questions where it flounders? And the short answer is, of course, the more data-rich the world, the more likely machine learning is to get traction.

Philip Tetlock: So if we’re trying to predict macroeconomic statistics for OECD countries, we have long time series: we can look at how the variables change over time, we can look at how the time series are intercorrelated with each other, we can see what the lags are, we can create complicated econometric models. You can do a lot of interesting things, and machine learning might be able to get reasonable traction vis-a-vis human forecasters there. But if you’re talking about trying to predict the outcome of the Syrian Civil War, or relations with China in the South China Sea, or the state of negotiations with North Korea, or the US-China trade war, or the state of the eurozone and Brexit, on all these things the machine learning people, I think quite rightly, kind of roll their eyes a bit and say-

Robert Wiblin: “We don’t have enough training data to make sense of this.”

Philip Tetlock: We don’t have enough training data to make sense of it, yeah, exactly. The base rates are elusive, the covariation structures are elusive, so it’s hard for us to even get started. Now, of course, those questions are also very hard for humans. They may be impossible for machine learning and very hard for humans, but humans are able to do, I think, significantly better than zero. Which suggests that the jobs of certain categories of human beings may be secure for a bit longer. It may be that if you’re a loan officer in a bank, your usefulness is highly questionable in a machine learning world, whereas-

Robert Wiblin: The CIA operatives are.

Philip Tetlock: The geopolitical analysts, they might have a longer future. Now of course, if loan officers in banks are serving another kind of function, if they’re serving a political function, if it’s about giving money to friends and doing this or that, then the loan officers can rest secure. But if they’re playing a pure profit maximization game, or accuracy maximization game, then-

Robert Wiblin: Maybe not.

Philip Tetlock: Yeah.

Robert Wiblin: Okay. Yeah. We’ll return to the algorithms a little bit later on; I’m keen to learn more about how they’ve done in your research. As I was prepping for this interview, I was looking back over some of your work, and a point that had stuck with me over the years was this observation that people seem to have only three probability settings. People who haven’t been exposed to forecasting and probabilistic reasoning think things either have 0% probability, they definitely won’t happen; or a 50% probability, they might happen, but we don’t know; or they’re 100% likely, definitely going to happen. Is this a general finding, that many people reason in that way, flipping between these three different likelihoods?

Philip Tetlock: Yeah, it was a joke that I first heard from Amos Tversky in the 1980s, that people could only do that. It was a joke, it wasn’t intended to be a description of a serious research finding, but it is a stylized fact that people have a hard time making subtle distinctions in the maybe zone, and they do gravitate toward yes and no and certainty. We’re ambiguity averse, and we have a hard time making subtle distinctions along probability continuums. So I think that’s fair, and I think that the best forecasters are able to resist that: they’re characterized by a capacity to make many more than three degrees of distinction among uncertainties. In a paper by Jeffrey Friedman, Richard Zeckhauser, Barbara Mellers and others, I was part of that team, we find the best forecasters are able to distinguish between 10 and 15 degrees of uncertainty for the types of questions that IARPA is asking about in these tournaments, like whether Brexit is going to occur, or whether Greece is going to leave the eurozone, or what Russia is going to do in Crimea, those sorts of things. Now, that’s really interesting, because a lot of people, when they look at those questions, say, “Well, you can’t make probability judgements at all about that sort of thing, because they’re unique.”

Philip Tetlock: And I think that’s probably one of the most interesting results of the work over the last 10 years. I mean, you take that objection, which you hear repeatedly from extremely smart people that these events are unique and you can’t put probabilities on them, you take that objection and you say, “Okay, let’s take all the events that the smart people say are unique and let’s put them in a set and let’s call that set allegedly unique events. Now let’s see if people can make forecasts within that set of allegedly unique events and if they can, if they can make meaningful probability judgments of these allegedly unique events, maybe the allegedly unique events aren’t so unique after all, maybe there is some recurrence component.” And that is indeed the finding that when you take the set of allegedly unique events, hundreds of allegedly unique events, you find that the best forecasters make pretty well calibrated forecasts fairly reliably over time and don’t regress too much toward the mean.

Robert Wiblin: When I was reading that stylized fact, that people are drawn to 0%, 50% or 100%, I was wondering whether that tendency might explain some weird behavior I observe in people. One is that it seems quite common for people to have a relatively uninformed view about something, but become extremely confident in their quick judgments about it, even though if they really sat down and thought about it, they’d realize there’s so much they don’t know. It’s as if they have some evidence that gets them to 80% or 90% confidence, and then they just push it up to 100% because they can’t be bothered thinking about it anymore. And then you’ve got the people who are very underconfident about their ability to draw distinctions between, say, 40% likely and 60% likely, who get stuck in the maybes: well, it’s unknowable, it might happen, it might not happen. And they miss out on the opportunity to draw most of the distinctions between likelihoods.

Philip Tetlock: Exactly. And I mean, take an issue that is politically polarizing in the United States, such as climate change, and forecasts of how rapidly the climate is changing as a function of greenhouse gases and perhaps other factors. Am I a believer in climate change, or am I a disbeliever, a denialist as it were, if I say to you, “Well, when I think about the UN Intergovernmental Panel on Climate Change forecast for the year 2100, the global surface temperature forecasts, I’m 72% confident that they’re within plus or minus 0.3 degrees centigrade in their projections”? And you kind of look at me and say, “Well, that’s kind of precise and odd,” but I’ve just acknowledged I think there is a 28% chance they could be wrong. Now they could be wrong on the upside or the downside, but let’s say the error bars are symmetric, so there’s a 14% chance that they could be-

Robert Wiblin: Underestimating.

Philip Tetlock: Could be overestimating as well as underestimating. So I’m flirting with the idea that they might be wrong, right? But suppose you are living in a polarized political world in which expressions of political views are symbols of tribal identification. They’re not statements of “this is my good-faith effort to understand the world; I’ve thought about this, I’ve read these reports, I’m not a climate expert, but here’s my best guesstimate.” And by the way, I haven’t gone through all the work of doing that; this is a hypothetical person, and I don’t have the cognitive energy to do this. But if someone had gone to all the cognitive effort of reading all these reports and trying to get up to speed, and concluded, say, 72%, what would the reward be? They wouldn’t really belong in any camp, would they?

Philip Tetlock: The climate proponents would kind of roll their eyes and say, “Get on board. You’re slowing down the momentum for the cause by giving succor and emotional support to the denialists,” and the denialists will say, “Well, you’re kind of suckered by the believers.” You’re not going to please anybody very much. You’re not going to have a community of co-believers with whom you can comfortably talk about climate change in the bar. You’re going to be weird, you’re going to be an outlier.

Robert Wiblin: Might be able to cobble together kind of four economists or something to have a beer with.

Philip Tetlock: Could be something like that, but there’s not a good intellectual home for you. And if you think that the major function of your beliefs is to help you fit into the social world, not to help you make sense of the world itself, then why go to all the bother of participating in forecasting tournaments? I think that’s one of the key reasons why forecasting tournaments are a hard sell. Forecasts do not just serve an accuracy function; people aren’t just interested in accuracy. They’re interested in fitting in, they want to avoid embarrassment, they don’t want their friends to call them names. I don’t want to be called a denialist or a racist or whatever other epithet I might incur by assigning a probability on the wrong side of maybe.

Robert Wiblin: Speaking of climate change, often when I look at the media, kind of every week, it seems like there are new wild predictions being made about how bad climate change could be, which sometimes sound suspect to me. But I’m not a climate scientist, and I don’t really have time in my day-to-day work to look into how scientifically grounded these forecasts are. I guess you might have encountered these forecasts as well, and there are also people who claim that it’s not going to be a problem at all. How do you disentangle a problem like that in real life?

Philip Tetlock: Well, one of the things I’ve learned to do over all this work is never pretend to be a subject matter expert in anything my people are forecasting on. So I’m not an expert on North Korea, I’m not an expert on the euro, I’m not an expert on Colombian narcoterrorism or the Syrian Civil War, and I’m not an expert on the climate either. Now I think there is an issue of people feeling, especially on the climate activist side, that the only way for them to build up political momentum for getting people to make long-term sacrifices is to get them to believe that things are going to hell in a handbasket right now, and that floods and tornadoes and hurricanes and whatnot are unprecedented. I know that there are other people who say, “Oh, that doesn’t look so unprecedented to us.”

Philip Tetlock: And that’s a debate that’s somewhat apart from the larger question of the long-term warming trend as a function of greenhouse gases. How rapidly have hurricanes increased, if they’ve increased at all, over the last 150 years? I don’t know the answer to that; I know that there are people who disagree about it. I’m a process guy, and I don’t want to have too many opinions. It’s not useful for me to have too many opinions; why should anyone care what my opinion on this is? But I do know that there are incentives for people to exaggerate, and that has happened over and over, and you’re much more likely to exaggerate when you’re not in a forecasting tournament, when you’re not playing an accuracy game. When you’re playing a political power maximization, media exposure game, exaggeration is the way to go. If you’re in a forecasting tournament and you play that way, you’re going to get creamed.

Robert Wiblin: Yeah. I guess I wasn’t so much asking about climate change specifically. I suppose I have a rule of thumb that when advocates on a topic are speaking, I’m a lot more cautious about believing anything they say, and that’s true of climate change as it’s true of many different issues. But it means they could be right and it’s just very hard for me to figure out, or they could be making some big mistakes potentially.

Philip Tetlock: I think exaggeration adds to the noise and I think it’s probably shortsighted for advocates, for activists to exaggerate, but I understand the temptation.

Robert Wiblin: Yeah. It seems to me that there’s a growing number of issues, or maybe it’s always been this way, where believing that something is absolutely certain, or absolutely cannot be true, is like a shibboleth for participating in a particular group, and anyone who expresses doubt is liable to be condemned. Anyone who says, well, I’m not quite sure about something, that’s not an acceptable view.

Philip Tetlock: Right. And forecasting tournaments are almost the opposite of that. In forecasting tournaments, if you put a probability of one on something and it doesn’t happen, your credibility gets hit hard. Or a probability of zero and it does happen, your credibility gets hit hard. If you had been more moderate, you’d take a much lower hit. So forecasting tournaments incentivize people to do something that is not altogether cognitively natural, and that is put a priority on accuracy, accuracy, and only accuracy, and you really don’t care about whether you’re-

Robert Wiblin: Political loyalties or forming the right alliances.

Philip Tetlock: However the chips may fall. It’s just about accuracy. And in principle, there are government agencies that are supposed to do that, right? There’s the Congressional Budget Office, which is supposed to be an absolute straight shooter, nonpartisan. The intelligence community is supposed to be nonpartisan, a straight shooter, just the facts. The courts are supposed to be too. There are a lot of bodies that are charged with serving a purely epistemic function, and it’s extremely difficult, and their credibility is often called into question.
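For readers who want the scoring rule behind “your credibility gets hit hard” made explicit: these tournaments score forecasts with the Brier score, the squared gap between the probability you assigned and the outcome (coded 0 or 1), where lower is better. A minimal sketch, with made-up forecasts:

```python
def brier(p, outcome):
    """Brier score for a binary event: (p - outcome)**2. Lower is better."""
    return (p - outcome) ** 2

# You call an event certain (p = 1.0) and it fails to happen...
print(brier(1.0, 0))  # 1.00, the maximum possible penalty
# ...versus more moderate forecasts that are wrong in the same direction.
print(brier(0.7, 0))  # 0.49
print(brier(0.5, 0))  # 0.25, hedging at 'maybe' caps the damage
```

Because the penalty is squared, confident wrong answers cost disproportionately more than moderate wrong answers, which is exactly the incentive for accuracy over bravado that Tetlock describes.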

Robert Wiblin: Just carrying on with this idea of people rounding to 100% or 0%: I’m particularly interested in risk management, in global catastrophic risks and trying to prevent them. I often encounter people who think that a risk is unlikely, and then say, “So I don’t think it’s worth working on at all, or at least I’m not worried about it at all.” It seems in those cases they’re rounding down from a 1% probability to a 0% probability, for whatever reason. And so they miss that if something is really significant, then even if it’s only 1% likely, it could nonetheless deserve a lot of attention. Another thing that people miss when they do this rounding down to 0% is that something that’s 3% likely might deserve three times as much attention as something that’s 1% likely, and something that’s 1% likely might deserve 10 times as much attention as something that’s 0.1% likely. Not having these fine gradients of probability can lead to really big misjudgments on some of the issues that I care a lot about.

Philip Tetlock: Yeah, I think that’s right. And in the original Kahneman-Tversky prospect theory paper, which I think was 1979, they have an interesting line about how the probability weighting function in prospect theory is ill-defined at the extremes. Which means that people are going to do one of two things: they’re either going to ignore very small probabilities or they’re going to dramatically overweight them, and they’re going to oscillate between those two mistakes. They’re going to have a very hard time getting it right. When the low-probability risk is very salient, you know where it’s going, they’ll overweight it; and when it’s not that salient, goodbye.
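As Tetlock says, the 1979 paper leaves the weighting function ill-defined at the extremes. Tversky and Kahneman’s later cumulative prospect theory (1992) does give a concrete functional form, w(p) = p^γ / (p^γ + (1 - p)^γ)^(1/γ), which overweights small probabilities when they are attended to at all. A quick illustrative sketch, using their median estimate for gains, γ ≈ 0.61; the printed numbers just trace the curve, and are not anything from the interview:

```python
def weight(p, gamma=0.61):
    """Tversky & Kahneman (1992) probability weighting function (gains).
    Small probabilities come out overweighted, large ones underweighted."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

for p in [0.001, 0.01, 0.05, 0.5, 0.95, 0.99]:
    print(f"stated p = {p:.3f} -> decision weight {weight(p):.3f}")

# e.g. p = 0.01 maps to roughly 0.055: a salient 1% risk can loom about five
# times larger than linear expected-value weighting says it should, while a
# non-salient small risk is often simply rounded down to zero, the other mistake.
```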

Robert Wiblin: Yeah. It seems like one of the areas where it’s hardest for humans to act rationally, and hardest to coordinate to act rationally, because particularly gripping things like terrorism really are salient, so they get a lot of attention, while other really big risks that people just don’t think about get neglected. For things that are 40% or 60% likely, people learn through experience how often they happen, but for things that are out in the tails, it’s just so hard as a society to appropriately apportion our attention.

Philip Tetlock: Well, one of the great challenges here is extending assessments of accuracy to low probability events, because as the events descend into very low probabilities, you might expect them to occur only once every 300,000 or 400,000 years, and that requires a lot of patience from the research sponsors. So the question is, if you don’t have accuracy criteria for some of these extreme tail risk sorts of events, what metrics do you have, aside from faith, or resort to the precautionary principle, or something of that sort?

Robert Wiblin: Well, you could try to form an inside view: just try to have a good understanding of the world and assess the probability directly, which is difficult, but I think you can do better than random.

Philip Tetlock: Well, here’s one thing you can do: you can create categories of risks that have higher probability, and you can assess those categories. You can decompose the categories, and you can see how logically consistent people are in their judgments of sets and subsets. So if you think your likelihood of dying in a car accident is greater than your likelihood of dying from a car accident or any other cause, we know that there’s something wrong with your probability judgments, even though we have no idea whether you had the probability right or wrong. So there are logical consistency checks on probabilities at the extreme range that can be implemented, and I think it is useful to implement them.
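The car accident example is a monotonicity check: no event can coherently be judged more probable than a superset of itself, so P(A) must be at most P(A or B). A minimal sketch of such a consistency check; the judged probabilities are hypothetical:

```python
# Coherence check: a judged probability for an event must not exceed the
# judged probability of any superset of that event. Numbers are hypothetical.

judgments = {
    frozenset({"car accident"}): 0.15,
    frozenset({"car accident", "any other cause"}): 0.10,  # superset judged LESS likely
}

def incoherent_pairs(judgments):
    """Return (subset, superset) pairs where P(subset) > P(superset)."""
    bad = []
    for a, pa in judgments.items():
        for b, pb in judgments.items():
            if a < b and pa > pb:  # a is a strict subset of b
                bad.append((set(a), set(b), pa, pb))
    return bad

for sub, sup, pa, pb in incoherent_pairs(judgments):
    print(f"Incoherent: P({sub}) = {pa} exceeds P({sup}) = {pb}")
```

Checks like this don’t tell you whether the probabilities are accurate, only whether the judgments could all be true at once, which is exactly the distinction Tetlock draws.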

Robert Wiblin: Let’s turn to something a little more prosaic, which is forecasting in one’s personal life or career. Anyone who engages really seriously with 80,000 Hours’ advice and tries to apply it to their life at some point has to make some potentially very difficult forecasts about how things might pan out for them. You can imagine someone who’s just finishing their undergrad and trying to decide whether to start a PhD, who will probably want to focus on: what are the odds of me actually finishing the PhD versus dropping out? And if I do finish the PhD, what are my odds of getting an academic position that’s actually worthwhile? What’s my probability of using it? Or things like: if I apply for a prestigious job in the civil service, what’s the likelihood that I’ll get it? If I try to make a lot of money to donate by starting a business, what’s the probability that the business will take off? And then how big will it be?

Robert Wiblin: Like all of these things are potentially very important or very decision- relevant, but quite hard to estimate. I was thinking possibly we could try to run through how someone might estimate the odds of successfully becoming an academic when they’re finishing an undergraduate degree. Just cause that’s kind of potentially the career path that you’re most familiar with as a professor yourself. Is that something that you feel game to try?

Philip Tetlock: Sure. But it’s going to come with a big caveat. In these forecasting tournaments, we have people forecasting things over which they have no control, so they’re strictly observers. That’s true for the simulation worlds, and it’s true in the real world too. If forecasters are making forecasts about the eurozone or North Korea or sub-Saharan Africa, bets about epidemics or financial panics or military clashes, whatever it is, they’re making forecasts about things they don’t control; they’re in the role of dispassionate observers. When you talk about making predictions about your own behavior, and the behavior of people with whom you’re in frequent interaction, your spouse and your coworkers and so forth, you’re no longer just a forecaster, you’re a player. So there’s a story I’m fond of, which my coauthor on Superforecasting, Dan Gardner, told me about. I think it was an NHL team in Canada…

Philip Tetlock: The Ottawa Senators maybe, who had fallen behind. It was the run-up to the Stanley Cup, the championship, and they’d fallen behind three to one in a best-of-seven series. And some reporter runs up to the coach after they’d just lost the most recent game and says, “Hey coach, you think you’ve got a chance?” And the coach pauses and actually thinks about it. Fatal career mistake by this coach: he pauses, thinks about it, and says, “Probably, or probably not.”

Philip Tetlock: Coaches are not supposed to talk that way. They’re supposed to say, “Of course we’re going to win,” because they need to infuse their team with enthusiasm. They’re not just making a probabilistic forecast; it serves a different function, because they’re not playing an accuracy game. They’re playing a confidence-infusion game. They’re playing a political mobilization game, or an action mobilization game, like the climate people too. It’s mobilization, it’s not just about accuracy. So there are all these games that people play: how long is your marriage going to survive? Does your athletic team have a chance? Is your career in graduate school going to go into the basement? There are countless questions that could become either self-negating or self-fulfilling prophecies in your life. And it’s a matter of how you as a human being, with your values, make decisions about what you are…

Philip Tetlock: And are not going to believe, who you are or are not. The forecasts are now existential statements about identity and who you are: I’m the kind of person who is really gritty, and I’m going to make this work. I’m going to overcome the odds; I transform the odds. Or, to put it a little differently: I’m going to make history. I’m not forecasting, I’m a maker of history. And Karl Marx had an amusing line to that effect; he said that the purpose of his work was not to understand the world, but to change it.

Robert Wiblin: So I think it probably does pay to be a bit optimistic that the plans you’re making are going to work out. But you’re in this difficult spot where, before you decide what the plan is going to be, you want to do dispassionate forecasting of the merits of the different options. And then once you commit to one, you’re all in, or at least you’re somewhat overconfident about how it’s going to pan out, because that will drive you forward, and I guess convince other people to join you as well.

Philip Tetlock: I think that’s a very useful distinction. And there are some people who do work on that: they say there’s a deliberation mindset, where accuracy matters, and an implementation mindset, when you just do it. Some organizations have a crisp distinction between deliberation and implementation; militaries do, and businesses, and people for that matter. There’s a time to think and a time to act. Now of course the division isn’t that simple, because at some point you have to review and reassess whether you made the right decision. So there has got to be some updating going on, unless you’re a complete fanatic. If you never return to deliberation and are only in implementation mode henceforth, you have slipped over into the domain of fanaticism.

Robert Wiblin: So, we’ll come back and try to do the forecast in just a second; this is a little diversion. But are you familiar with that Freakonomics experiment suggesting that people don’t quit as quickly as they should? They ran a randomization experiment where they would flip a coin and encourage people who got heads to quit whatever thing they were thinking about quitting, and people who got tails not to. And they found that the people who got heads and were told to quit did actually quit more often, and their lives went better, or they reported that their welfare was higher three and six and maybe 12 months out. I guess it might be possible to explain this by the idea that people realize they have to overcommit, or become overconfident about whatever track they’re on, which means they’re likely to stick with it a bit too long if it’s going badly. If things are going below expectations, they’re going to be a little bit blind to that, potentially deliberately.

Robert Wiblin: And so if they get forced out of it by something like a coin flip, then that is beneficial. Although I guess if they’d never been overconfident, maybe they never would have had the chance of making it through something difficult.

Philip Tetlock: Well, Steve Levitt’s a very clever guy. I’ve heard about that result, though I haven’t read the experiment. It’s intriguing.

Robert Wiblin: Yeah. All right, let’s get back to the ‘becoming an academic’ example. We’ll use some probabilities here, but that’s not so much the point as to demonstrate what procedure you might use for estimating the likelihood. So someone’s finishing undergrad. Their ultimate goal is to become an academic who’s doing really valuable research, and there are multiple hurdles they have to get through. They’ve got to get into a good PhD program, probably at a university that has a record of placing people. They’ve got to finish the PhD. Then they probably need to get a postdoc, and then an academic position. And then, having done all that, what are the odds that they’ll have the freedom or the funding to do something that they actually think is useful for the world? Do you have any thoughts on how someone might try to put this together, to estimate whether it’s worth setting out on this path to start with?

Philip Tetlock: Well, it’s a tough racket, and it obviously matters a lot what field you choose. Your prospects are much better in computer science than they are in English or history. So, do I have anything more perceptive than that to say?

Robert Wiblin: So I guess what we typically recommend people do is start by looking at the base rate.

Philip Tetlock: And those are base rates.

Robert Wiblin: Yes. So look at how many PhD graduates there are… We would try to find these numbers for some fields; it’s actually surprisingly difficult to find nicely comparable data across different disciplines. But if you look at the number of PhDs awarded in these different fields each year, and then look at the number of academic jobs, maybe in total, or the positions that become available each year, and look at that ratio, very often it’s a few percent. So you can expect that only a relatively small fraction of PhD grads are getting tenure, or research-focused academic positions. So I guess you’d always recommend starting with the outside view or base rates, or at least usually doing that? Do you agree that’s probably the way to go in this case as well?

Philip Tetlock: Yes, I do. Graduate school is a risky life. Academic life can be very rewarding, but it’s a hard line of work to break into, and the way the academic labor market has become stratified, with adjuncts for example, I think has made it harder. There are fields in which there is still robust demand, in STEM disciplines, but elsewhere it’s increasingly a long shot.

Robert Wiblin: I think it’s actually not entirely obvious that you always want to start with the base rate, or at least not with such a broad reference class, because there’s this possibility, and I think it probably is the case in some fields, that almost all of the probability is going to a relatively small fraction of PhD students or PhD graduates. There are some fields, like economics, which seem very top-heavy, or at least that’s one I’m familiar with: if you’re not at one of the top 10 or 20 economics programs, then your probability of getting a research-focused economics position drops pretty precipitously. So you can potentially get misled by starting with the base rate, by looking at it from the big picture.

Robert Wiblin: There have also been some calls among people I know for organizations in the effective altruism space to publish the number of people who are hired versus the number of applications they got for a job, to help people decide whether to go through the effort of actually applying. And I do worry there that sometimes giving such a broad base rate could be misleading, because the reality is that for half of the applicants, their probability of getting the job is close to zero, and for some other people, their probability of getting the job might be really quite high. Just saying “on average there was a 1% likelihood of getting hired” could lead people astray.

Philip Tetlock: Absolutely. Picking the right base rate is a very valuable forecasting skill, and life skill, and you know a lot more about yourself than the base rate captures. You know a lot more about yourself even than your GRE scores or your undergrad institution, so yes, you’re going to update in response to that information. And once you’ve been in graduate school, you know how much you’ve published, and most people in grad school, I think, have a fairly good sense of where they rank. I don’t think there are very many grad programs that actually rank-order their students, but the department faculty do get together and say, these are the five people we really want to push this year. So I think there probably is rough agreement often in departments about who the most likely-to-be-hired people are. And there are surprises. But, and maybe I’m an overconfident faculty member here, I’m going to guess they can do very substantially better than chance.
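One way to formalize “start from the base rate, then update on what you know about yourself” is a simple Bayes update. A sketch with invented numbers; neither the 5% base rate nor the likelihoods below are real placement statistics:

```python
def bayes_update(prior, p_signal_given_success, p_signal_given_failure):
    """Posterior P(success | signal) from a base-rate prior and two likelihoods."""
    numerator = prior * p_signal_given_success
    denominator = numerator + (1 - prior) * p_signal_given_failure
    return numerator / denominator

# Hypothetical: suppose 5% of PhD graduates in a field land research-focused
# positions, and admission to a top program is seen in 60% of those who
# eventually succeed but only 10% of those who don't (made-up likelihoods).
posterior = bayes_update(prior=0.05, p_signal_given_success=0.60, p_signal_given_failure=0.10)
print(f"posterior: {posterior:.2f}")  # 0.24, a big swing away from the broad base rate
```

The point isn’t the specific numbers; it’s that a genuinely diagnostic signal can move you a long way from a broad reference class, which is why picking the right base rate, and the right evidence to update on, matters so much.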

Robert Wiblin: Yeah. I think when people try to narrow down from such a broad base rate, though, they’re in a little bit of a bind, because you were saying that in a sense you know a lot more about yourself than the base rate does, or than other people do. But there’s another sense in which it can be very hard to judge your own abilities, because I just find-

Philip Tetlock: That’s also true.

Robert Wiblin: Yeah, it can be hardest to look yourself in the mirror, because for one thing, you see yourself from a different perspective than you see other people, and there are all these biases. I just know so many people who seem either extraordinarily overconfident or extraordinarily underconfident about their own abilities.

Philip Tetlock: Yeah, that’s a very interesting question, how accurate our self-perceptions are. A lot of the literature does indeed focus on biases like overoptimism or overconfidence, and also even defensive pessimism and underestimating. So it’s a very interesting question exactly how accurate people are. I’m going to wager, and I don’t know the facts on this, but I’m going to bet that people are moderately accurate.

Robert Wiblin: Oh, on average.

Philip Tetlock: Yeah, moderately. Which means I think that it’s useful information to add in.

Robert Wiblin: Yeah, yeah.

Philip Tetlock: Let me put it another way. If you really are delusional, your chances of success in this domain are extremely small.

Robert Wiblin: This has been a really difficult area to know exactly what to recommend that people do. It seems like probably the best shot you might get is if you can get close to a mentor, or someone who’s already in the field, who can get to know your abilities relatively well. And then if you can somehow get them to give you a really frank assessment of what your odds are, that might be among the best forecasts you’re likely to get. If you do a research project with an academic as an undergrad, then they can give you some sense of whether you potentially have what it takes. Even there it’s going to be very difficult, because you’re potentially young, maybe you’ll mature during the PhD, and of course people hate to break bad news.

Robert Wiblin: So I think in some fields there’s a bit of a habit of academics leading people on by saying, “Oh yeah, go do the PhD,” because it’s nice to have students under you who can do some of the grunt work as PhD students. So there’s a bit of a selfish motivation there, but also you just want to be positive to people and encourage them. So you always have to judge: are people telling it to you straight?

Philip Tetlock: What an interesting idea. I mean, faculty do owe that to students, to give them candid feedback. But on the other hand, they don’t want to demoralize people, and graduate students are somewhat easy to demoralize.

Robert Wiblin: Yeah. It’s hard on your mental health because the feedback is often so weak.

Philip Tetlock: It is. So it’s a very difficult problem. But establish a social contract with a faculty member whose judgment you really trust, or better yet, two faculty members whose judgment you really trust, and say to them, “Look, I know there’s a non-zero chance this doesn’t work out. I’m going to do my best on this project. And if you conclude at the end of it that my prognosis is somewhat grim, I really would appreciate it, I’d be grateful for the rest of my life, if you just gave me that honest feedback.” No one’s ever approached me like that.

Philip Tetlock: But it’s an interesting thought experiment. It reminds me of an old joke about Henry Kissinger and Alexander Haig, who at one point was kind of an underperforming underling under Kissinger. Haig would deliver a report to Kissinger, and the next day he’d come back and say, “What did you think, Mr Secretary?” And Kissinger would look at it on his desk and throw it back at him: “Do it again.” He’d come back, and they’d do the cycle three more times: “Do it again. Do it again.” Finally, Haig would come in and say, “I’ve done the absolute best I can. I can’t do it anymore.” And Kissinger would say, “Okay, I might look at it.”

Robert Wiblin: Yeah, I hadn’t heard that one.

Philip Tetlock: That’s an interesting approach to teaching.

Robert Wiblin: Yeah. It’s a bit disappointing if people aren’t even coming to you for an honest forecast; they might at least try there. I guess another approach people can take, rather than looking at the very broad base rate or someone’s qualitative judgment based on knowing them, is to try to get quantitative data, which in some fields is more available than in others. You can look at your GRE scores, your grade point average, your SATs, and then maybe look at the typical GRE for people entering the field, and the typical GRE for people who eventually got jobs in that field, if you can find data on that. I guess that has the disadvantage that sometimes those quantitative measures can be a little crude and can throw out important information.

Robert Wiblin: But on the other hand, it means it’s a little harder to delude yourself based on feeling that you’re special. It has a bit more firmness to it, and you might also be able to find some extra data on what scores successful academics actually got.

Philip Tetlock: Yes, that’s very interesting. I don’t know exactly what the data is on this. But most people who make the cut into elite graduate programs have pretty high test scores, and their intelligence test scores are comparably high, because the two really are very closely related. So let’s say the average IQ of students in an elite program at an elite school is 125 or 130: how much of an advantage would it be if yours were 150 or 160, or you had 800/800/800 GREs? I don’t know exactly how that translates onto the IQ scale, and I’m not even sure how common 800/800/800 is now; it used to be quite uncommon, but it may be more common now.

Philip Tetlock: How much of a performance boost do you get? How much of a difference is there in the career effectiveness of lawyers or doctors or professors whose IQ is 130 versus 160, and how much is really driven by character as opposed to intelligence past a certain point? I’ve heard expert psychometricians argue this out. One school says there actually is a difference between 130 and 160; another says it’s almost totally driven by character. I don’t know who’s right, but it is an interesting debate.

Robert Wiblin: Yeah. I’ve seen one paper on this, which looked at research output and correlated it with IQ, and suggested that IQ did predict research output and discoveries. I can dig that up and link to it, though I haven’t scrutinized it terribly closely.

Philip Tetlock: There’s a professor in the UK you may know, Stuart Ritchie. He probably knows the answer on this.

Robert Wiblin: Yeah. I’ve never gotten an IQ test, because I feel like either I’d end up being really smug or I’d end up feeling really badly disappointed in myself. And I’m not sure I would have learned anything useful about myself that I didn’t already know; I already kind of know what I’m accomplishing and what I’m not.

Philip Tetlock: I can assure you people my age should not be taking tests like that, because there’s a fairly well-known pattern in which fluid intelligence peaks around 25 or 30. Crystallized intelligence can continue to increase even up to my age, until memory loss starts to take its inevitable toll. But I think it’s a losing game for people over 50.

Robert Wiblin: Yeah, I guess.

Philip Tetlock: And I’m considerably over 50.

Robert Wiblin: I’m over the hump too, so it’s too late for me to do an IQ test now. Another case that comes up a lot is people trying to predict the likelihood of their businesses succeeding. We’re in San Francisco, where lots of people are doing startups and trying to figure out if their business ideas are any good. Do you have any experience, or have you seen any research, on whether people can predict that, and maybe how they can do a better job of it?

Philip Tetlock: It’s the same base rate problem, because most startups fail. And it depends on how exactly you define the base rate: what population of small businesses are able to attract VC funding in Silicon Valley? If they’re able to pass that initial screen, they’re much better bets than some person who just decides to set up a restaurant randomly in a neighborhood. But even after they pass VC screening, I think the base rates are pretty low. I suspect the VCs like to keep this data pretty confidential, but I would be surprised if there were any shops able to achieve a one-in-three hit rate.

Robert Wiblin: Another practical approach one might take with businesses and academic careers, inasmuch as you somewhat despair of figuring out whether your odds are 3% or 6% because it’s just too hard, is to find something that has a fat-tailed distribution of outcomes: something where, if it goes well, it will go really well and you’ll have an enormous impact. Then pick one out of that set that’s plausible or appealing, and pursue it until you get evidence that it’s not panning out, that you’re not going to end up out in the tail. Then find another thing with a fat-tailed distribution and give that one a crack.

Philip Tetlock: Right. And that’s why a number of VCs have quite low thresholds for funding: in the hope of getting the next Facebook or Google, they can afford many dozens, hundreds, even thousands of misses and still do magnificently. Now, obviously they still have to balance false positives and false negatives, and they’re still aspiring to accuracy. But when you think there’s a fat tail of extremely lucrative possibilities out there, you really don’t want to miss them, and you’re willing to tolerate a lot of false positives in the process.

Robert Wiblin: I guess it’s a little more challenging for an individual who’s just finished their undergraduate degree, because maybe they’re thinking, “Do I want to go into government, or become an academic, or go into business, or start a nonprofit?” Maybe they only get three of those shots before they start hitting their mid-thirties, and perhaps they’re not willing to be as adventurous as they used to be. If all three don’t pan out, then I guess they need a backup option that seems a little safer that they can move into. Unless they’re very adventurous.

Philip Tetlock: Well, there’s the human life cycle, and the desire to reproduce, and obligations to other people, all sorts of things that kick in. People find themselves locked into things they didn’t expect to be locked into.

Robert Wiblin: Yeah. But I guess if you choose three options with a lot of upside, then you’re giving yourself a decent chance. As long as you have something to move back into later, you’ll probably have a pretty good life.

Philip Tetlock: Yeah. You know, it’s an interesting sequencing strategy. I have to confess, I guess I wasn’t as creative in my life. I found academia a pretty comfortable place very early on, and I didn’t really start to make serious contact with the real world until middle age, when the world started to pay some degree of attention to what I was doing. Otherwise I would’ve stayed completely in academia.

Robert Wiblin: Yeah. Pushing on to a slightly different topic, related to people improving their ability to predict what’s going to happen in their lives and to make good decisions: to go along with releasing this episode, we’re actually going to put up on our site this calibration training tool, whose development was funded by the Open Philanthropy Project, which people can get at 80000hours.org/calibration_training.

Robert Wiblin: And just to remind everyone, calibration means that when something feels like it’s 90% likely, it does actually happen nine times out of 10, and when something feels like it’s 20% likely, it does happen two times out of 10. So it’s one of the two measures of good forecasting ability, the other being the ability to get away from 50-50 probabilities towards actually making strong claims about things that definitely will and definitely won’t happen.

Philip Tetlock: A very important component.

Robert Wiblin: Yeah. A very important component.

Philip Tetlock: That second component is not to be understated. I would say resolution is every bit as important as calibration. Some people might say more important.
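For the technically inclined: the standard way to score these two properties together is the Murphy decomposition of the Brier score, in which a reliability term (miscalibration) and a resolution term fall out of one formula. Here is a minimal Python sketch; the binning scheme and the names are illustrative choices of ours, not anything from the Good Judgment Project’s actual scoring code.

```python
import numpy as np

def murphy_decomposition(probs, outcomes, n_bins=10):
    """Split the Brier score into reliability (miscalibration),
    resolution, and uncertainty (Murphy, 1973).
    probs: forecast probabilities in [0, 1]; outcomes: 0/1 results."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    base_rate = outcomes.mean()
    # Group forecasts into probability bins: 0-10%, 10-20%, ...
    bins = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    reliability = 0.0
    resolution = 0.0
    for b in range(n_bins):
        mask = bins == b
        if not mask.any():
            continue
        weight = mask.mean()           # share of forecasts in this bin
        f_bar = probs[mask].mean()     # average stated probability
        o_bar = outcomes[mask].mean()  # how often it actually happened
        reliability += weight * (f_bar - o_bar) ** 2      # want near 0
        resolution += weight * (o_bar - base_rate) ** 2   # want large
    uncertainty = base_rate * (1 - base_rate)
    brier = np.mean((probs - outcomes) ** 2)
    return brier, reliability, resolution, uncertainty
```

The components satisfy Brier ≈ reliability - resolution + uncertainty (exactly so when forecasts within each bin are identical): a good forecaster drives the reliability term toward zero while pushing the resolution term as high as possible.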

Robert Wiblin: Yeah. So have you seen this tool, or any other tools like it?

Philip Tetlock: I’m familiar with it, and I think Michael Mauboussin has created something somewhat similar. And Good Judgment Inc, the private sector spin-off from the Good Judgment Project, I think has probabilistic reasoning training that includes that as well.

Robert Wiblin: Yeah, I think they spent quite a few years developing this one. I’m not sure how it compares to the others, but it’s got different kinds of training: confidence intervals, PolitiFact questions for people who are more political, guessing city populations, answers to math problems with confidence levels attached, and various kinds of correlations.

Philip Tetlock: Oh wonderful. So you can assess transferability across domains.

Robert Wiblin: Yeah.

Philip Tetlock: That’s a really big thing because transfer has always been one of the most difficult challenges for psychologists designing training to overcome. Transfer statistics have tended to be disappointing.

Robert Wiblin: Yeah. How valuable do you think doing this is? It seems like it’s probably worth doing for a couple of hours. But the transfer from things like PolitiFact questions, math problems, or correlations between social statistics to the other kinds of questions you try to assess in life, like “What are my chances of becoming an academic?”, might be weak, so maybe you hit declining returns after a couple of hours.

Philip Tetlock: It could be turned into a useful research instrument, potentially. I would be curious to know, for example, whether people who have been randomly assigned to do this, and have actually done it, can generate more accurate conditional forecasts in Good Judgment Open, or insights like that.

Robert Wiblin: Has this ever been experimented with? Giving people calibration training and then seeing whether they do better in these tournaments?

Philip Tetlock: A little bit. The probabilistic reasoning training that we developed in the original tournaments was able to deliver performance boosts in forecasting of between 6 and 12% over each of the four years. We called the training module CHAMPS KNOW. It included a brief calibration exercise, but nothing as extensive as what you’re describing: we explained what calibration was, emphasized its importance, gave some examples of how people can be miscalibrated, and set some practice questions. But we didn’t do it exhaustively, and we were addressing a lot of other things as well, like base rates and belief updating and information search, how to be a creative information seeker. So there was a lot in there.

Philip Tetlock: In a tournament, the priority is not on doing precise experimentation, it’s on winning. So you take everything you think might work and put it all together; it’s like throwing the kitchen sink at it, right? Tournaments really require careful follow-up experimentation where you try to triangulate in on exactly what worked, because the investigators are typically doing a lot of things at once to win the race. But the short answer is yes, I think people should look at it, and maybe there’s some interesting collaborative potential there.

Robert Wiblin: Yeah, I’ve used it for a bit, and I felt like I was getting more calibrated over time, though I was reasonably good to start with; that might’ve just been luck. I actually do this in my day-to-day life: I assign probabilities to things every hour or half hour, because it’s just how I think about acting in life. And it’s possible that has helped to calibrate me over the years.

Philip Tetlock: You’re becoming a bookie, essentially. Here’s another question though: are you measuring resolution? Are you giving feedback on resolution as well? If you’re only giving feedback on calibration, is there a danger that…

Robert Wiblin: We’ll all start just pinning to 50-50?

Philip Tetlock: Well, no, that would be too extreme. But there may be some implicit base rates lurking in there.

Robert Wiblin: Yeah, that’s interesting. I’ll maybe bring that up with the people who made it before we launch it, and see what they have to say about that.

Philip Tetlock: Yeah. I think giving feedback on both calibration and resolution is a good idea, because in real life, when you look at them, they’re correlated with each other. People who are well calibrated also tend to get good resolution scores, and that’s not too surprising; it almost has to be that way. It doesn’t strictly have to be, though, and there is a degree of tension. There are different styles, and there are some managers who really value decisiveness in their employees and look for extreme answers, while the more nuanced people in the middle, saying 30, 40, or 50%, get less recognition. It’s an interesting problem.

Philip Tetlock: I mean, one of the things we looked at in our early work on Expert Political Judgment was whether the well-calibrated forecasters were just being cowards, right? Say it rains 60% of the time in Seattle: you can always predict 60% and get a perfect calibration score. Whereas what you really want are people who say there’s a 95% chance of rain when it rains and a 5% chance when it doesn’t. That gets you a great resolution score as well as good calibration, because you’re right. But when you say 95% and it doesn’t happen, you take a big hit, so there is a trade-off in people’s minds. I think if you tell people you’re judging them on both properties, it’s going to force them to be more mentally agile, and they’ll be making more trade-offs in their head. They’ll say, “Well, I don’t want to be overconfident. On the other hand, I don’t want to be a chicken.”
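To put rough numbers on the Seattle example: the always-60% forecaster earns a Brier score of 0.6 × 0.4² + 0.4 × 0.6² = 0.24 with zero resolution, while an equally well-calibrated forecaster who can actually tell rainy days from dry ones scores far better. A toy simulation, with every parameter invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
signal = rng.random(n) < 0.60                  # days that "look rainy"
p_true = np.where(signal, 0.95, 0.05)          # real rain chance given the look
rain = (rng.random(n) < p_true).astype(float)  # what actually happens

base = rain.mean()
for name, probs in [("coward", np.full(n, base)), ("sharp", p_true)]:
    brier = np.mean((probs - rain) ** 2)
    # resolution: how far each forecast group's hit rate sits from the base rate
    res = sum(np.mean(probs == v) * (rain[probs == v].mean() - base) ** 2
              for v in np.unique(probs))
    print(f"{name}: Brier score {brier:.3f}, resolution {res:.3f}")
```

Both come out nearly perfectly calibrated, but the “coward” scores around 0.24 with zero resolution while the sharp forecaster scores around 0.05 with resolution around 0.19, which is the statistical version of Tetlock’s point.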

Philip Tetlock: So one of the critiques of my early work, when we found the more ‘foxy’ forecasters did better than the ‘hedgehoggy’ forecasters, was: oh well, the foxes are just chickens.

Robert Wiblin: Was that the case?

Philip Tetlock: No, but we had to address it statistically though.

Robert Wiblin: Okay, that’s interesting. So they weren’t just being cowards; as you were saying, calibration and discrimination tend to go together.

Philip Tetlock: Well, they were more moderate, but they also did better on resolution, so they didn’t buy calibration at the cost of degrading resolution below that of the hedgehogs.

Robert Wiblin: So some friends of mine have been trying to produce other, ideally useful, training content for a broad audience, to help improve people’s reasonableness and forecasting ability. I think you might’ve met one or two of them here at EA Global today. Do you have any ideas about the best opportunities for producing training content people haven’t already seen, which might actually allow them to become more accurate forecasters within a reasonable timeframe?

Philip Tetlock: Well, we’re hoping that the work we’re doing now, focused on helping people become more rigorous lesson extractors from history, will translate into improvements that haven’t been observed before. That’s a promissory note though; it’s not something we can say we’ve demonstrated. I guess you’re asking whether, beyond the training protocol we developed in the ACE tournament, known as CHAMPS KNOW, we have anything new to report on the training side that works and delivers systematic, replicable improvements.

Philip Tetlock: And I think we have some hints that something works, but I don’t think it’s replicable yet. So since we’re in the age of replication, I’ll just say maybe. Stay tuned. But it’s not easy to do this. There’s a ‘curse the darkness’ phase of my career and a ‘light the candles’ phase, and the ‘curse the darkness’ phase, documenting the biases other people had blazed the trail on, was much easier than lighting candles. Improving judgment, what I call meliorism, a commitment to making things better, is really hard work and frustrating. The failure rate of studies has been somewhat discouragingly high.

Robert Wiblin: In business school, people often do case studies. Can you imagine incorporating a prediction element, where people learn about business situations from the past and then try to figure out whether the decisions taken succeeded or failed? Could you imagine that helping people’s ability to make good business decisions?

Philip Tetlock: Yes. I think what we’re doing now with Civ5 could easily be adapted to many business simulations; the kind of training we’re doing, the kind of learning people are engaging in, is very similar. So you take one of these Harvard business cases and ask: if the CEO had done this rather than that, what would have happened? Now, depending on the kind of simulation, if it’s a simulation of Intel or of some actual company, the answer’s unknowable; we don’t actually know what would’ve happened, although we often have some reasonable hints, because the market is a corrective force. At any rate, it has promise.

Robert Wiblin: Yeah. A friend of mine, Danny Hernandez, made this interesting point: we’d like to be able to figure out who superforecasters are easily, quickly, and cheaply, because then we could give more weight to their judgment. But as it is, using normal tournaments, it takes quite a while, both for them and for time to pass in the world, before you can tell whether someone is a superforecaster. Do you imagine that, say, measuring someone’s performance on a calibration test could give some indication of whether they might be a superforecaster?

Philip Tetlock: It might, yes, although I’d want resolution as well. And the idea of screening people much more rapidly than waiting for two years of tournaments and seeing who regresses toward the mean: I understand the appeal of that. It would be a faster way of identifying talent, and it may be feasible. I think it’s certainly worth trying.

Robert Wiblin: Yeah. Maybe someone out in the audience can try to adapt one of these tools for that purpose.

Philip Tetlock: I think it’s very reasonable. Businesses are somewhat constrained by their human resources and legal departments in the kinds of studies they’re allowed to perform on their employees and the kinds of criteria they’re allowed to use in screening them; they have to validate tests that are used for employment, and things like that. So it’s a nontrivial matter for a business to adopt. I say somewhat glibly, when I talk about how the earlier forecasting tournaments were won, that step one is to get the right people on the bus. That sounds easy, but it’s not; it took a long time to figure out who the right people were. An organization or business thinking of doing this would probably be well advised to talk to its legal department first.

Robert Wiblin: I guess, yeah, maybe that’s something individuals out there can have a go at. The calibration tool from the Open Philanthropy Project that I mentioned is written in this pretty easy programming language called GuidedTrack, which was actually developed by Spencer Greenberg, who’s also been a guest on the show. So it would be relatively easy, potentially, to take some of it and modify it for related purposes, like measuring people’s performance.

Philip Tetlock: That’s very interesting. Okay. I would like to see that.

Robert Wiblin: So let’s dive into some more technical questions about the forecasting research you’ve done over the years. I asked online what difficult questions people had come up with when reading your books or papers that they were curious to get answers to. And actually, the philosopher Daniel Kokotajlo, I hope I’m pronouncing that correctly, wrote up this really beautiful summary of the practical findings from your work for an organization called AI Impacts, which is trying to forecast progress in AI and figure out how it might affect society. I think if someone was only going to spend 20 minutes reading about forecasting and your work, that’s probably the link I’d give them at this point.

Robert Wiblin: So I’ll definitely stick up a link to that piece in the blog post with the show notes.

Philip Tetlock: I’m curious already.

Robert Wiblin: Yeah, I’ll send it to you as well. I asked him for a couple of questions, and a few of the gems here are thanks to Daniel. He noted that there used to be a page on the Good Judgment Project’s website that broke down various ways to get better forecasts, and suggested you got a 40% boost from talent-spotting forecasters, as you just mentioned; a further 10% boost from giving them training tools; 10% from putting them on teams and getting them to talk to one another; and then maybe a 25% boost in accuracy from using algorithms to process and then aggregate their various predictions. Does that still ring true to you today?

Philip Tetlock: I guess I would have to characterize these as stylized facts, where the baseline is the unweighted average of the regular forecasters. It’s true that once you’ve identified the superforecasters and put them into teams, they have a big advantage over regular teams and individuals working alone; that’s in the vicinity of 40%, yes. The training number of 10% is approximately right, and the teaming number of 10% is approximately right. The algorithm number really conflates a couple of things, and could be larger or smaller depending on how you calculate it: since the aggregation algorithms are piggybacking on the most recent forecasts of the best forecasters, they’re already drawing on superforecasters.

Philip Tetlock: So the question is, how much better can the aggregation algorithms do if you take the superforecasters out of the equation? And I think that number is about right: 25%.
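For readers wondering what such an aggregation algorithm looks like: the Good Judgment Project’s published work describes weighting forecasters by track record and recency and then ‘extremizing’ the pooled probability, pushing it away from 0.5 to compensate for averaging washing out the forecasters’ shared information. The sketch below is a toy reconstruction of that idea, not the production algorithm; the uniform default weights and the exponent of 2.5 are illustrative stand-ins.

```python
import numpy as np

def aggregate(probs, weights=None, a=2.5):
    """Pool individual forecasts: take a (possibly skill- or
    recency-weighted) mean, then 'extremize' by raising the odds to
    the power a, which pushes the pooled probability away from 0.5."""
    probs = np.clip(np.asarray(probs, dtype=float), 1e-6, 1 - 1e-6)
    if weights is None:
        weights = np.ones_like(probs)
    weights = np.asarray(weights, dtype=float)
    p = float(np.sum(weights * probs) / np.sum(weights))
    odds = (p / (1 - p)) ** a      # extremize on the odds scale
    return odds / (1 + odds)

# Five forecasters all leaning the same way:
print(aggregate([0.70, 0.65, 0.75, 0.60, 0.70]))  # ~0.87, vs raw mean 0.68
```

The intuition for extremizing: if several independent forecasters each see different partial evidence and all lean the same way, the evidence combined justifies more confidence than any one of them, or their simple average, expresses.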

Robert Wiblin: Another one: in Superforecasting, there’s this quote: “The strongest predictor of rising into the ranks of superforecasters is perpetual beta, the degree to which one is committed to belief updating and self-improvement. Perpetual beta is roughly three times as powerful a predictor as its closest rival, raw intelligence.” How did you measure or define perpetual beta? I think Daniel couldn’t find that anywhere in the book.

Philip Tetlock: It’s a very good question. And the self-report measure of perpetual beta does not do that work for us. What does the work for us is a measure of the frequency with which people engage in belief updating: low-magnitude, frequent belief updating is a powerful driver.

Robert Wiblin: So one of the key measures of this is just how frequently people update their estimates?

Philip Tetlock: I said three times more powerful than what?

Robert Wiblin: Intelligence, which was, I think, the second most important.

Philip Tetlock: Yes, fluid intelligence. Well, crystallized and fluid both played a role, but fluid intelligence was the most consistent predictor. But when you’re dealing with this population, these forecasters are all pretty smart, so there’s some restriction of range. The comparison to intelligence is not entirely fair.

Robert Wiblin: Do you think that if you just drew people randomly from the population that intelligence might seem like a more important factor?

Philip Tetlock: Yes, I’m pretty sure that’s true. In the same way that if you randomly admitted people into the Harvard Department of Economics, GREs would become a much better predictor of who does well than they are now. As things stand, GREs probably predict almost nothing about performance in graduate school at Harvard.

Robert Wiblin: Something of particular interest to me: in a lot of experiments you’ve run, the extrapolation algorithms, various brute-force or mechanistic forecasting methods, seem to do surprisingly well compared to human predictors. But there seems to be surprisingly little detail about the nature of these algorithms in the books; maybe it’s in the papers. It seems like algorithms are a lot easier to manage than people, and potentially a lot cheaper to run than superforecasters, who you’d have to pay salaries to.

Robert Wiblin: So maybe we should put a bit less effort into trying to identify superforecasters, and more effort into teaching people to use these extrapolation algorithms? If extrapolation algorithms are better than most people, maybe we should focus on making it possible for ordinary people to use them in everyday life, given that they actually perform pretty well.

Philip Tetlock: Like predict no change or predict the most recent rate of change?

Robert Wiblin: Yeah. Well, I guess, I’m curious to know, what were the algorithms that-

Philip Tetlock: Well, those-

Robert Wiblin: Those are the ones that did work well?

Philip Tetlock: Yeah, they worked pretty well.

Robert Wiblin: Interesting. Just ‘predict no change’?

Philip Tetlock: Especially for the shorter-term forecasts, because change is less likely in the short term. I think one finding, when you look across both books, is that people somewhat exaggerate change in the short term but understate it in the longer term. Though that’s not entirely true either; I think they’re exaggerating change even in the five-year range. Okay, I’m thinking out loud, I should be careful what I say. This isn’t in a journal, we’re-

Robert Wiblin: It’s a conversation.

Philip Tetlock: But it’s a good question. It doesn’t take a lot of training to do that. I mean, you don’t need to train people to do it at all; you just have the algorithm do it. So I don’t see a need for training there.
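For concreteness, the two crude baselines named above take only a few lines each. A minimal sketch (the function names are ours):

```python
def no_change(series, horizon=1):
    """'Predict no change': the future looks like the most recent value."""
    return [series[-1]] * horizon

def recent_rate_of_change(series, horizon=1):
    """'Predict the most recent rate of change': extend the last step."""
    step = series[-1] - series[-2]
    return [series[-1] + step * (h + 1) for h in range(horizon)]

print(no_change([3, 5, 8], horizon=2))              # [8, 8]
print(recent_rate_of_change([3, 5, 8], horizon=2))  # [11, 14]
```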

Robert Wiblin: Yeah, okay. So these weren’t complicated forecasting algorithms that would require an expert statistician to put together; very often they were just brute-force, simple rules.

Philip Tetlock: Oh, my definition of an algorithm is anything that can go on automatic pilot, so it doesn’t need any human intervention. I think of heuristics more as something that requires human judgment.

Robert Wiblin: Could you imagine producing forecasting rules of that sort? Rules like: just predict no change over this timescale, and for longer timescales, look at the long-term trend and project it forward. Rules that would allow people to mechanistically become better forecasters without having to go through all the effort of becoming more ‘foxy’?

Philip Tetlock: There’s a guy, Spyros Makridakis, who runs statistical forecasting tournaments in which all the competitors are algorithms. There are tens of thousands of time series from politics and economics and business and so forth, all sorts of time series, and the question is which algorithms perform better across very, very disparate datasets. He sometimes has machine learning in the competition too. And he finds that fairly simple damped exponential smoothing works pretty well across an enormous range of time series.

Philip Tetlock: I honestly don’t know. I mean, time series data often have a lot of bumps, ups and downs, and smoothing simply means you’re smoothing out the big bumps. It’s like what they show you in these Wall Street summaries, the 100-day or 180-day moving average sorts of numbers. Would that be a good idea? I think in many situations it would help people do pretty well, maybe bring them up toward superforecasting levels of performance. And it’s not as though the superforecasters themselves don’t use algorithms; they do. A lot of them are very statistically savvy; they know more statistics than I do.
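The method being referred to is damped-trend exponential smoothing (Gardner and McKenzie’s variant of Holt’s method), which has repeatedly been hard to beat in Makridakis’s M competitions. A minimal sketch, with smoothing parameters picked for illustration rather than fitted to data:

```python
def damped_holt(y, alpha=0.3, beta=0.1, phi=0.9, horizon=5):
    """Damped-trend exponential smoothing (Gardner-McKenzie).
    alpha smooths the level, beta smooths the trend, and phi < 1
    damps the trend so long-horizon forecasts flatten out instead
    of extrapolating a recent trend forever."""
    level, trend = y[0], y[1] - y[0]
    for t in range(1, len(y)):
        prev_level = level
        level = alpha * y[t] + (1 - alpha) * (level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    damp, forecasts = 0.0, []
    for h in range(1, horizon + 1):
        damp += phi ** h              # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp * trend)
    return forecasts

print(damped_holt([10, 12, 13, 15, 16, 18], horizon=3))
```

The damping factor phi is the interesting design choice: each extra step ahead adds only phi^h of the estimated trend, which is one simple way of encoding the finding that people (and naive trend lines) exaggerate long-run change.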

Robert Wiblin: So to some extent they’re like a superset of some of these methods.

Philip Tetlock: Yeah, a subset of them run their own Monte Carlos.

Robert Wiblin: You’ve got an excellent friend there.

Philip Tetlock: I mean, the superforecasters are already hybrids in and of themselves; it would be a mistake to think of them as purely human, because they’re using statistical aids of various sorts. So it’s a very exciting question, and I don’t know the answer. My guess is it would help. Whether it would bring people all the way up to super performance, I suspect not, but it’s an empirical issue.

Robert Wiblin: Yeah, I did a course in time series forecasting in my undergraduate economics degree, which I think is relatively unusual. And it’s perhaps an underrated thing to study, in terms of changing how you think about the world and about making predictions about the future. Just realizing that these mechanistic autoregressive moving average models can actually be extremely good predictors of the future, and will often spit out answers that don’t seem entirely intuitive to you, but kick your ass. It’s quite interesting.

Philip Tetlock: Yeah. And you don’t even need the full ARIMA, the full-scale thing. A very simple, crude time series extrapolation à la Spyros Makridakis can perform pretty well. He’s very concerned about overfitting, and when you’re dealing with tens of thousands of datasets, the advantages of the smoothing approaches become apparent, because the bumps average out.

Robert Wiblin: Yeah, maybe we can try to get a link to that in the show notes. Was it Spyros, you said?

Philip Tetlock: Makridakis: M-A-K-R-I-D-A-K-I-S. Yeah, he ran a competition, I think it’s in the… oh gosh, the International Journal of Forecasting; there was a special issue devoted to his M4 competition.

Robert Wiblin: Wow, that sounds super interesting. I’ll try to dig up a link for that, and maybe get him on the show at some point. What finding in your research do you think is most likely to be wrong?

Philip Tetlock: What’s most likely to fall apart? Well, I think when you hit historical discontinuity, I’m not sure there are superforecasters. So you’re saying, just when we most need them, they disappear on us; that’s not very useful, professor. I mean, historical discontinuities are so hard. I’m going to sound like a Marxist, because I’m going to end up quoting Karl Marx twice in this interview; I’m not a Marxist, by the way. But apparently Karl Marx also said, “With the train of history, there’s a curve the intellectuals fall off.”

Robert Wiblin: Yeah. And he probably said it as well as anyone else.

Philip Tetlock: Well, it’s ironic given how often the Marxists fell off that curve in the 20th century, but it’s an all the more apropos remark. There’s a lot of truth to it: predicting change is hard, and predicting dramatic change is really, really, really hard. So I think I would be doing a disservice to the world if I implied, “Oh, all you need to do is have the superforecasters stand vigilant and they’ll be able to sound the alarm on everything.” They too will get things wrong. But you can fine-tune these things. You can say to them, “If there are some categories of risk that you’re really, really concerned about missing, you can do what the VCs do.” You can lower your threshold, and you can say, “Look, we’re going to highlight certain things, even though they’re very low probability.”

Philip Tetlock: The way the current forecasting tournaments work, forecasters are incentivized to pick up on events that have probabilities between 5% and 95%. There’s not much gain from getting really, really refined, from distinguishing between one in 100,000 and one in a billion. And yet there’s a huge order-of-magnitude difference there, and if the consequences are huge, it’s super, super huge.
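One way to see the incentive problem being described here: under a quadratic rule like the Brier score, forecasts of one in 100,000 and one in a billion are numerically indistinguishable, while a logarithmic score separates them sharply when the rare event does occur. A quick illustration (the comparison of these two scoring rules is our addition):

```python
import math

# Penalty for assigning probability p to an event that then happens:
for p in (1e-5, 1e-9):
    brier = (1 - p) ** 2      # quadratic penalty: ~1.0 for both forecasts
    logpen = -math.log(p)     # log penalty: 11.5 vs 20.7, a clear gap
    print(f"p={p:.0e}: Brier penalty {brier:.6f}, log penalty {logpen:.1f}")
```

So a tournament scored on the Brier rule gives forecasters essentially no reason to sharpen estimates in the extreme tails, even when those tails are exactly where the stakes are largest.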

Philip Tetlock: And then the question is, what techniques can we use? That is the great challenge right now. We work on it with these Bayesian question clusters, we work on it with consistency and coherence checks, but we don’t have a solution. I think we all have to be aware that we live in a world that is subject to potentially radical volatility. Just look at the 20th century from decade to decade: virtually nobody predicted World War One in 1910, and certainly nobody was remotely close to anticipating how intense it would be.

Robert Wiblin: And if someone did, I would probably guess they’d just gotten lucky. And that’s a tricky-

Philip Tetlock: Yeah. Well, if you have enough people making enough predictions, there will be a few. But virtually nobody was anticipating it. The idea of a great power war, to some degree, but a war of that degree of lethality, really not. They knew it was an unstable situation, but they didn’t expect that level; they expected it to be over pretty fast.

Philip Tetlock: And then out of that come the Soviet Union, Nazi Germany, and World War Two. So it puts you on a path. Now, that doesn’t mean there wouldn’t have been some communist regimes otherwise, or that there might not have been a rise of fascism in Germany at some point. And almost certainly nuclear weapons would have been discovered regardless of whether there were those wars. But the timing and the context would probably have been very different.

Robert Wiblin: I was having dinner with Nick Beckstead last night. I’m not sure whether you know Nick, but he had this question for you, which hopefully I can accurately represent. I think he believes you’re relatively pessimistic that even superforecasters can do better than other people, or better than random chance, once we’re talking about very long-term forecasts over decades, or possibly even centuries. And he thought that might be a mistake, because superforecasters, using whatever styles of thinking they have, aren’t just good in one domain where they have particular expertise; they seem to be better at making forecasts almost across the board, everywhere we’ve checked. So even though we don’t yet have the data to show they can forecast more accurately over very long time spans, maybe that would be a pretty reasonable expectation to bring to the table. What do you think of that?

Philip Tetlock: Interesting. So it is true that when you look at the tournaments within which the best forecasters excel, the questions are incredibly heterogeneous, so they have to be quick studies to do this well. You’re moving from Arctic sea ice mass, to Colombian narcoterrorism, to Spanish-German bond yield spreads, to the Syrian civil war, to South China Sea island building, to North Korean missile testing. It’s literally all over the map, geographically and functionally extremely diverse. So you ask, what fraction of people can have expertise in more than about one or two of these topics? And the answer is nobody.

Philip Tetlock: They have to be pretty fast generalists. And I guess that’s the basis for the notion that if they could display that degree of cognitive dexterity in the original IARPA tournaments, where the questions were extremely heterogeneous, there weren’t well-defined base rates for many of them, and you had to improvise a lot, why couldn’t they do it going further out in time? And I think it’s a matter of how quickly you think chance compounds over time. I don’t have the exact answer to that.

Philip Tetlock: The magician-statistician Persi Diaconis at Stanford once asked the question: how many times do you have to shuffle a deck of cards before all information is lost? So you open up a new deck of cards, all perfectly ordered from deuces up through aces, and ask how many proper shuffles you have to do before all order is lost. There’s a definition of what a proper shuffle is. I think the answer is five or six (ed: It’s 7). Okay, now, the necessary changes being made, ask the same question about history.
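As an aside on that editor’s note: the ‘seven shuffles’ result of Bayer and Diaconis (1992) can be computed exactly. Their theorem gives the probability of each deck arrangement after k riffle shuffles in terms of its number of rising sequences, which Eulerian numbers count. A sketch of that calculation, using Python’s exact fractions, if you want to verify the 7 yourself:

```python
from fractions import Fraction
from math import comb, factorial

def eulerian(n):
    """A[j] = number of permutations of n cards with j descents
    (equivalently, j + 1 rising sequences)."""
    A = [1]
    for m in range(2, n + 1):
        prev = A
        A = [((j + 1) * prev[j] if j < m - 1 else 0)
             + ((m - j) * prev[j - 1] if j >= 1 else 0)
             for j in range(m)]
    return A

def tv_from_uniform(n, k):
    """Total variation distance from a fully random deck after k
    riffle shuffles of n cards (Bayer & Diaconis, 1992)."""
    A = eulerian(n)             # A[r-1] decks have r rising sequences
    uniform = Fraction(1, factorial(n))
    total = Fraction(0)
    for r in range(1, n + 1):
        p = Fraction(comb(2**k + n - r, n), 2**(n * k))
        total += A[r - 1] * abs(p - uniform)
    return float(total / 2)

for k in range(4, 11):
    print(k, round(tv_from_uniform(52, k), 3))
```

The distance from a fully random deck stays near 1.0 through four shuffles, then collapses: roughly 0.92, 0.61, 0.33, and 0.17 after five, six, seven, and eight shuffles, the famous cutoff.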

Philip Tetlock: I mean, there are things happening that are random, and the question is how much randomness. You’re not getting full card shuffles every day or every month or every year; there are substantial pockets of stability in history. But how fast is the randomness compounding? The optimal forecasting frontier is going to be very, very close to chance once you reach a certain point. And looking back on 20th century history, my guesstimate is that there are certain categories of things farsighted people were anticipating, but it took a long time, even with physicists. Nuclear weapons really weren’t on the radar screen until about 1930 or so. I think Einstein thought the idea was a nonstarter initially, and then he changed his mind when Fermi got the reactor going.

Robert Wiblin: There was someone who I think anticipated it several years ahead, and moved to America and tried to sound the alarm about the risk, I think in the very early days of World War Two, or maybe even before that-

Philip Tetlock: Oh yeah, Fermi and others were doing that.

Robert Wiblin: It was someone else, I think, who was… I can’t remember his name; maybe I’ll chime in later and say what it is. But there’s actually a book about this person’s efforts to try to prevent Germany from getting nuclear weapons first.

Philip Tetlock: So there were pockets of farsightedness. Although there you’re talking about a timeframe of five years, the letter to Roosevelt that Einstein was coaxed into writing.

Robert Wiblin: I think it was someone who persuaded him to write that letter. Yeah, I can’t remember the name.

Robert Wiblin: The name that was escaping my memory here was Leó Szilárd, who persuaded Einstein to write a letter to Franklin Roosevelt about the possibility of nuclear weapons in August 1939, one month before Germany invaded Poland. It’s quite a remarkable story. We’ll put up a link to the Wikipedia article about it, and you can find out more in the biography ‘Genius in the Shadows: The Man Behind the Bomb’.

Philip Tetlock: But that’s not really far out; we’re talking about centuries here. In that case, some of the basic technology was in place, the theory was there, and there was a potential threat, so the effort got galvanized. The bomb probably would have been developed anyway in a competitive nation-state system; you could expect something like that to happen. But it might not have happened for another 20 or 30 years, and similarly we might not have had a man on the moon until later. Or conceivably earlier, if Germany had taken a more peaceful course, with Wernher von Braun sending up rockets.

Robert Wiblin: The Manhattan Project was a colossal effort. I think I remember reading that a million people were indirectly involved in it, even though almost 99.9% of them didn’t realize what they were working on. But-

Philip Tetlock: It was the kind of thing only a power like the United States was capable of doing in World War Two; the other powers were exhausted and stretched, and the United States had this incredible surplus capacity for waging war on two fronts. It was a remarkable asymmetry of power. Yeah, I guess he’s right to call me a pessimist, because I don’t think they’re going to do a very good job a century out, or a generation out. When you get to five to 10 years, maybe there’s going to be some advantage, but it’s going to be increasingly small.

Robert Wiblin: Yeah, I guess it just depends on the nature of the question. If you’re asking who’s going to be Prime Minister of the UK in 50 years’ time, no superforecaster is going to get that; everyone is back to chance, just guessing names at that point. But for something like which party will be in power, maybe you can get a little bit of resolution.

Robert Wiblin: So for example, if you’re trying to forecast progress in artificial intelligence, predicting at what point the algorithms will reach the level at which you get transformative change is very, very hard. But forecasting just the amount of computation we’ll have, or how fast computer chips will be, it seems like we can potentially have something to say about that even looking 50 or 100 years out, just because we have enough of a historical record to project the trends forward. It gets much harder, no doubt. But I think superforecasters might be able to do better than chimps throwing darts at dartboards.

Philip Tetlock: For some things, yes. I don’t know, is Moore’s Law still alive and kicking?

Robert Wiblin: It changed. But the thing is, I think it would not have been unreasonable to… I think people actually did predict it, because there were engineering and practical reasons why people expected it to slow down, and indeed it did. So there we did have some knowledge, and expertise can help you forecast. Sometimes trends hold, and that might be worth having a crack at.

Philip Tetlock: And maybe I’m naive, but when astronomers and astrophysicists tell me the sun is going to go supernova in three or four billion years, I think they’re probably right. It’s going to expand close to the Earth’s orbit and destroy all life on the planet.

Robert Wiblin: Some things are kind of mechanistic.

Philip Tetlock: Yeah, there are some categories of things, right? There are timescales, and levels of determinism, and certain operating laws, where you have enough confidence that you think you can extrapolate out. I mean, where’s climate on that continuum?

Robert Wiblin: Somewhere in between. It’s safe to say they’re not very useful. Yeah. You have this chapter, maybe it’s multiple chapters, in Expert Political Judg