Transcript

Robert Wiblin: Hi listeners, this is the 80,000 Hours Podcast, where each week we have an unusually in-depth conversation about one of the world’s most pressing problems and how you can use your career to solve it. I’m Rob Wiblin, Director of Research at 80,000 Hours.

If this show has at all changed what you plan to do with your career it would be a huge personal favour if you could let me know by filling out 80,000 Hours’ impact survey, which I’ll link to prominently in the show notes or you can fill out at 80000hours.org/impact-survey/.

It should only take about five minutes, and I’ll read every entry.

As the show progresses we’re going to do more deep dives into pretty specific issues. If you don’t have much interest in the topic of a particular episode, you shouldn’t feel any duty to listen to it. Much better to skip episodes you aren’t going to enjoy than get sick of the show and unsubscribe.

If an episode isn’t going to be useful to you, maybe you’re better off spending that time listening to an audiobook instead. I’ll put a link to some audiobooks that I feel have allowed me to have a bigger social impact in the show notes and the blog post associated with today’s show.

That said, this week’s episode covers a diverse range of topical issues, and so I suspect it will be of interest to most subscribers.

Here’s David Roodman.

Robert Wiblin: Today, I’m speaking with David Roodman. David studied theoretical maths at Harvard, graduating in 1990. He then spent nine years researching environmental policy at the Worldwatch Institute in DC, followed by eleven years researching development policy at the Center for Global Development.

He was briefly a Senior Economic Advisor at the Bill & Melinda Gates Foundation before becoming a Senior Advisor to GiveWell and then the Open Philanthropy Project in 2015. He’s done world-class reviews on topics as diverse as the risk of geomagnetic storms, the effect of incarceration on crime, whether deworming improves health and test scores, and the development impacts of microfinance.

So, thanks for coming on the podcast, David.

David Roodman: It’s great to be here.

Robert Wiblin: So, we planned to talk a lot about the methods you’ve used in your kind of careful analysis of these questions and what advice you’d give to people who would want to become researchers themselves. But first, how did you transition from studying theoretical maths back in the ’80s to the kind of tricky social science you do now?

David Roodman: I did it without a grand plan. I did not know exactly how it was going to work out. There was a lot of self-doubt and worry about direction along the way. I studied math in college because that’s what I was good at. That’s what I got the best grades in, and that’s what made me feel secure. I’m not sure that was really the best basis on which to choose one’s path, but I guess I needed that sense of security.

And I almost completely avoided thinking about what I would do once I left the ivory tower. I don’t recommend that to the listeners, necessarily. Because I was interested in English folk dancing, I wanted to go spend some time in England. As I left Harvard, I got a fellowship to study at Cambridge, where I also signed up for a one-year maths program, the Tripos, Part III. I figured I’d been studying math all these years, I can do it for one more, and it’s a way of punting on what I want to … figuring out what I want to do when I grow up.

What I didn’t expect was that once I arrived at Cambridge, I came to terms with the fact that I really didn’t love mathematics enough to do it long-term. I think I wanted to do something more connected to the real world, as it were. Once I’d realized that, it was really hard to motivate myself to stick to my studies, because now it just seemed so irrelevant, in fact, a barrier to my figuring out what suddenly was a very urgent question: “What do I want to do next?”

And so I became very interested in questions I hadn’t thought much about before. “How does the world work? What are the grand problems of our time, and what’s causing them?” And then, in a funny way, that was the macro question, but then there was a micro question of, “Where do I fit in?” And they felt linked somehow, you know, even though I couldn’t fully explain it. I lost interest in the classes. I remember one day I got a long letter from my girlfriend, who’s now my wife, just before a lecture. I sat down, I read it and missed the lecture. I felt so good I never went to another class, and started reading books that friends pointed me to. Then I found my way to E. F. Schumacher’s Small Is Beautiful, books by John Kenneth Galbraith, Herman Daly and so on, which are all about broad questions of economics and ecology and ethics, and really exciting. But I didn’t know where that would lead me. I didn’t read those books and say, “Aha! I should go get a PhD in economics.”

Partly because I perceived economics, this was in 1990, as being very theoretical and mathematical in a way that didn’t impress me as a mathematician. And I guess I saw it as arrogant: more sure that its models were correct, even if they conflicted with reality. And so I felt passionate about these things but unsure of where it would lead me.

Robert Wiblin: So did you want to switch into more practical or applied questions for moral reasons? Or …

David Roodman: That’s a really good question. I mean, what I know is, I felt a need to be working on things that seemed more practical. I wanted to work on important things, which was of course distinct from how important my work would be. Whether that was out of a need for a certain kind of identity and self-esteem, or purpose, or morality, I’m not sure. I don’t think it was moral in the sense that I did an abstract analysis and determined that this is what a moral being should do in this position. But it was … I had a sense that I needed to move in a direction of more practical relevance.

And what wasn’t clear to me was how to link that with my aptitude for programming and mathematics. In fact, I didn’t assume that there was a link, and I only found it 10 or 15 years into my career.

So I ended up actually failing my exams at Cambridge. I was required to take them, so I sat for the minimum 20 minutes for each exam. There was one professor I kind of liked, so I wrote him an apology. I couldn’t believe what I was doing. I’d been an overachiever all of my life. It’s a great story, but I didn’t feel good about it at all. I was actually kind of scared this was going to be a stain on my record. But I felt driven to do it.

So then after that, I followed my girlfriend back to Philadelphia, where she was in medical school, and started looking around for some job for an organization working on local environmental issues, without any clear sense of what I, a math major, would have to offer them. Eventually I found somebody who was unwise enough to hire me and had a great year working in a very small non-profit in a pretty rough part of the city, learning a lot about poverty there.

After a year, I … maybe this is a really interesting theme here. After a year, my girlfriend told me that I needed to move down to Washington, DC, to find something. And I reflect on it now, that’s interesting, because my general approach in life has been to figuring out … trying to figure out whether I’m thriving now or where I can go to thrive without a long-term plan. Whereas my girlfriend was saying, “You know what? If you want to be here in ten years, you’ve got to do this now.” Without that nudge from her, I might not have gone down to DC, which turned out to be a very good move.

I eventually did find the first job that you mentioned at the Worldwatch Institute, where I started out as a research assistant and moved up to a senior researcher, writing about various environmental issues. So, deforestation, energy policy, and so on.

Robert Wiblin: How did you manage to avoid kind of discovering the big picture questions before, I guess going to Cambridge at the age of 22?

David Roodman: That’s an interesting … I mean, I was exposed to them in elementary school, actually. I had a wonderful teacher early on. So they were always there in the back of my head. It could be that being at Harvard, I didn’t feel safe venturing out of a certain safe zone. I mean, you know, it was a liberal arts education. I studied a lot of different things. But it was too competitive for me to want to explore as much as I should have, probably.

It could also be some skepticism, as I mentioned, of economics in particular, as a field. Like I thought of it as very mathematical and model-heavy, and I didn’t yet appreciate how interesting the big questions in economics are.

Robert Wiblin: So, at Cambridge, were you having kind of an existential crisis and couldn’t motivate yourself to study? Or, it sounds almost as if you wanted to fail on some level, because that would force you to do something different. Force you in a different direction.

David Roodman: Well, on the plane out to San Francisco, just a few days ago, I was by chance looking at emails I’d written when I was at that period of my life. And I’d forgotten, there were a few days of, yeah, what you might call an existential crisis, which came not when I first went off the rails, but when I thought I needed to inform the fellowship committee that was financing me to be there, and out of a sense of integrity I wanted to tell them what I was doing, but also then question my integrity in not doing what I had promised to do. And that was very rough. I think it wasn’t that I wanted to fail as much as this was a life raft. This was somehow … this, I couldn’t explain it but this was what would lead me to find meaning in life.

Robert Wiblin: It sounds like your wife has been kind of important in your career. Is she working in the same area, or just a generally good advisor overall?

David Roodman: No, she became a doctor but didn’t practice much. She’s spent most of her time working first at a think tank on health policy, then working for Medicare and implementing part of Obamacare. And now she’s an executive at a big health insurance company.

Robert Wiblin: So, you’re someone who a lot of people trust to do kind of the most difficult empirical research, in effect [inaudible 00:07:57], where there’s either a lot of contentious evidence that has to be pulled together and reach a conclusion, or perhaps there’s very little evidence and so we need to get kind of as much juice out of it as we can. I think Holden has called you, “The gold standard for in-depth quantitative research.”

So how do you think you got to be that good?

David Roodman: Well, I won’t agree or disagree about whether I am that good. But I think that goes back even earlier in my life. Maybe I was just born with a certain sense of responsibility to the truth, as it were. But I am the child of a bitter divorce. My parents split when I was ten, and I grew up from that point on with the experience of there being these two gods in my life. They collided, and I couldn’t make sense of who was right and who was wrong when. But I also felt a lot of fear about what would happen if I chose sides: alienating and losing a parent.

So I felt this very strong compulsion to go down the center, and if I ever strayed from the center, to be really well prepared to explain why I was doing it. And I think that actually drives my approach to researching. I’m really afraid of being wrong and so I always want to dig down the next level. And that’s part of how I’ve developed my style.

And this is a style that I feel like I’ve discovered. There was no grand plan, especially working for GiveWell and Open Philanthropy in the last three or four years. What I do that’s unusual is to review empirical research, mostly in the social sciences, and as much as possible, re-run the studies that I read for myself. I’ve hardly ever done new research, but I will go back to original data sources, try to understand the methods that were applied, re-do them, and then think critically about whether I agree with those methods or I want to apply alternatives.

And that arises both from my personality, as I already said, and also I think the fact that I don’t have formal training. I never did get a PhD, and I think probably people who come through PhD programs pick up a different kind of culture and maybe face different incentives, which discourages the kind of work I do. So I’ve kind of stumbled into this.

Robert Wiblin: How do you think it discourages it?

David Roodman: Well. Economics, or I’m sure other disciplines are … the field is a community of human beings, and that means it’s political. And especially if you’re a young person trying to make your way in the field, it can be dangerous to go around, you know, pissing people off by challenging their work, especially if they’re senior to you. Getting the good jobs is a very competitive process and I would imagine that people are risk-averse. And so the incentives are to do new research, you know, which might mean getting new data or thinking of new questions or developing new methods, rather than going around and challenging existing work.

Robert Wiblin: Do you think the political incentives cause people to believe the wrong thing, or just kind of act in a strategic way?

David Roodman: What do you mean by believe the wrong thing?

Robert Wiblin: I mean, are they successfully kidding themselves, or do they realize that there are these incentives, and … you know, maybe they believe that their advisor is wrong or using bad methods, but they just keep … they just bite their tongue.

David Roodman: Well, I think, especially the best people in these fields are very smart, and they see clearly, and no one knows better how the sausages are made than the sausage-makers. So they understand the problems.

I don’t think what I’ve just given you is the whole story. We need lots of people doing fresh, original research, and it’s … and maybe that’s where the best minds should be. But I think incentives are part of what’s going on.

Robert Wiblin: Do you think if you try to reduce the politics in an organization or a field, do you just kind of get different problems?

David Roodman: Oh, gosh.

Robert Wiblin: Is that something we should aim to do, or do we perhaps not realize the benefits that you get from politics?

David Roodman: Oh, gosh. I haven’t thought about it. My assumption would be that it is hard to change. That it’s an aspect of being human and it’s kind of wired in us. I don’t know. I haven’t thought about that.

Robert Wiblin: I think there’s a researcher, I think … Weingast? I’ll put up a link to some of this discussion. I think his view is that politics is a way of avoiding kind of outright violence, so you get these games that people play, but the alternative would be outright conflict, so … you shouldn’t only see the downsides of political processes and game-playing.

David Roodman: That sounds right to me. I mean, I said political behavior is human, and presumably it’s human because that has adaptive value, at least in our evolutionary history.

Robert Wiblin: So, do you think you’ve learned most of what you’ve learned just by trying to replicate these studies, and I guess in the process, you learn all of the methods that they’ve applied, and maybe even a little bit more?

David Roodman: That’s absolutely right. I have never taken a class in economics or statistics, but replicating existing work is actually a wonderful way to learn this stuff. It’s kind of like a scaffolding. I did it for the first time when I was at the Center for Global Development, which is a think tank in Washington that focuses on what we used to call Third World development. I was working under Bill Easterly, who’s now a pretty famous critic of foreign aid, among other things.

And he had me replicate what was then a very influential study on the impact of foreign aid on economic growth in the countries that receive it. It used pretty elementary methods, I now understand, but they were totally new to me. To have the paper on my left and the textbook on my right and the computer in the middle was a wonderful way to learn the methods step by step, and it’s been that way for me throughout.

The one thing that might be more distinctive about me is that I’m a good coder, and there have been several times when the program I needed to run the methods I was trying to copy wasn’t available to me, at least in Stata, which is the package I’ve always used. So I wrote my own program, and through that learned more about some family of methods, and some of those programs have become popular.

Robert Wiblin: Do you think that most people can learn statistical methods this way, or are you just particularly smart or particularly well-suited to it?

David Roodman: What I think is that majoring in math actually worked out pretty well for me. Even though I didn’t do it for particularly good reasons, I think it might be a little bit like what they say about Latin: it teaches you some really useful skills in what you might call slow-mode thinking, deliberative thinking, which transfer to lots of areas.

I think that is part of what happened for me. No, I don’t think that everybody would be able to do it as well as I have. Nor need they. By that I mean, not everybody needs to skip formal training in order to do what I do. If you know in advance that you love to do lots of replications, the skills you need can very easily be gotten, say, through an economics PhD.

Robert Wiblin: So, when you were looking at the effect of foreign aid on economic growth, what kind of methods did you learn then? Just like, linear regression, I guess, but the other stuff as well?

David Roodman: That’s right, yeah. Most of the studies use straightforward linear regression with controls. That’s called ordinary least squares. Metaphorically, you’re just fitting a straight line to data. And then a lot of them also used what’s called instrumental variables. And there’s a simple form of that called two stage least squares. The idea is, you’re trying to find something … so we’re interested in the effect of foreign aid receipts on economic growth in receiving countries, but we’re worried that there could be reverse causation, for example, which would then mean that correlation doesn’t imply the kind of causation we’re interested in.

And so what you try to do is find a variable that could only affect economic growth via foreign aid receipts, like maybe the country happens to be a geopolitical … geopolitically important to the United States. So that might be Egypt, or Pakistan, or something like that. And as a result, gets unusual amounts of aid, and that constitutes a kind of natural experiment.

So the method there, instrumental variables, is a way of setting up that kind of experiment and only looking at the impact of the foreign aid that is explained by this deeper determinant.
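To make the mechanics concrete, here is a minimal sketch of two-stage least squares on simulated data. Everything in it is illustrative: the variable names (a “strategic importance” instrument for aid) and all coefficients are invented to mirror the example in the conversation, not taken from any of the studies discussed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical setup: the instrument (strategic importance) shifts aid but
# affects growth only through aid. An unobserved confounder drives both aid
# and growth, so naive OLS on aid is biased.
strategic = rng.normal(size=n)                    # instrument
confound = rng.normal(size=n)                     # unobserved
aid = 1.0 * strategic + confound + rng.normal(size=n)
growth = 0.5 * aid - 1.0 * confound + rng.normal(size=n)  # true effect = 0.5

def ols(y, x):
    """Slope and intercept from a simple least-squares fit."""
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Naive OLS: pulled away from 0.5 by the confounder.
b_ols = ols(growth, aid)[1]

# Stage 1: regress aid on the instrument; keep fitted values.
aid_hat = np.column_stack([np.ones(n), strategic]) @ ols(aid, strategic)
# Stage 2: regress growth on the fitted (instrument-driven) part of aid.
b_2sls = ols(growth, aid_hat)[1]

print(round(b_ols, 2), round(b_2sls, 2))
```

The naive estimate is biased because the confounder moves aid and growth in opposite directions, while the two-stage estimate recovers something close to the true effect, since only the variation in aid explained by the instrument is used.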

Robert Wiblin: I guess you would have been doing these replications fairly slowly to begin with. Why would people put up with that, have you as a research assistant very gradually learning statistics and just replicating papers that already exist?

David Roodman: Oh. Well. It depends on what you mean by slow. I mean, this project, that first one, the paper was by Burnside and Dollar, it was published in 2000. I mean, we did it over a couple months. Part of my job was to build the data set, which I could do quickly. That was straightforward. Part of it worked out just because I was cheap. So it didn’t cost Bill much … Bill Easterly much to send me off and work on something and then not think about it for a while.

Robert Wiblin: Did the paper replicate?

David Roodman: No. And Bill kind of suspected that it wouldn’t. The sample was, I don’t know, 70 or 100 developing countries studied from about 1970 to 1993. The original result was that foreign aid works, in the sense of increasing economic growth in countries that have good economic policies, which was a very influential result because it seemingly gave foreign aid agencies a recipe for effectiveness. When I went back and rebuilt the data set, I was able to add some more countries and also add more years of data, since time had passed. And when we re-ran the numbers with the expanded data set, the result just flipped.

When we added data, we got a negative sign on the key term, and now it seemed as if giving more aid to countries with good economic policies actually slowed their economic growth. And we didn’t believe either one, really.

Robert Wiblin: So, we’re talking here about data replications, rather than experimental replications. You didn’t find another 70 or 100 countries to re-run the experiment. But what’s involved in doing a data replication?

David Roodman: What’s involved in data replication depends a lot on where the data come from. If they come from public sources, then it’s about downloading the data that you need, integrating it into a single file. I tend to use a relational database for that but you don’t have to. And inevitably when you actually get down to the fine details of doing that, questions arise. Ambiguities that you don’t appreciate until you try to copy something.

So then you usually have to send a set of questions to the original authors, who may or may not answer, depending on how helpful they feel like being [crosstalk 00:18:05], how much they remember.

Robert Wiblin: May or may not feel enthusiastic about someone doing …

David Roodman: Or maybe they’ve … yeah. I’ve been told several times that, “That was a long time ago, the data are lost.” That kind of thing.

Now, increasingly I’m seeing studies … I guess I saw this especially in my work on criminal justice reform, looking at the impacts of incarceration on crime. Researchers are using much bigger data sets, what we call administrative data. So, you know, a prison system or a school system. Lots of big government agencies are constantly collecting data at the student level or the prisoner level.

And these may not be in the public domain, but the agencies will license them to researchers under restrictive conditions. That can be much harder to obtain: we either have to go through the same licensing and permission process, or, in some cases, the original authors can pass the data on to us, if they have permission to do that. But that can be a tougher thing.

Robert Wiblin: I was thinking with that question, most of the time if you’re trying to replicate a paper using more or less the same data, if you just run exactly the same analysis, then most of the time you get the same result. I guess they could have made a mathematical error or a coding error, in which case, that’s not true. But it sounds like you’re doing more than that. You’re also fiddling, or you’re changing the methodology a little bit. Seeing, does it hold up when you do it in your preferred way rather than their preferred way.

David Roodman: That’s right. There is a meme out there that research is not replicable, meaning that when you do the study again, you just get a different answer. There’s a worrying study that was done in the field of psychology where they re-ran something like 50 or 100 experiments, all relatively small-scale, probably 50 to 100 subjects each, and they just got different results a large percentage of the time. I don’t know the specifics.

And so there’s a concern that psychology research is simply not replicable. There have been a couple of papers in economics arguing the same thing, although the one that comes to mind, by some researchers at the Federal Reserve, counted a study as not replicable if, when they contacted the authors, they never received the data.

I find, in general, that if I go back to the original data sources and try to reconstruct a study, I never get an exact match unless I have direct access to the researchers’ data and code. But it’s actually the exception for me to get a fundamentally contradictory result.

Most of the time I get something that’s close, and I say, “Yeah, looks like their analysis stands up on its own terms.” And so the interesting stuff comes with questions like, “Well, is this robust? If we make small changes, does the result go away?” Or there may be questions that are more specific to a given study: “This researcher says that there’s a fingerprint in the data of a particular intervention. Am I convinced that that fingerprint is really there when I test it in a way that is convincing to me?” That kind of thing.
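The robustness probing described here can be sketched in toy form: re-estimate a regression while varying the analytical choices, dropping a control or leaving out slices of the sample, and check whether the key coefficient moves much. The data and coefficients below are simulated purely for illustration, not drawn from any study mentioned in the conversation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Illustrative data: outcome depends on a treatment variable with true
# coefficient 2.0, plus a control and noise.
x = rng.normal(size=n)
control = rng.normal(size=n)
y = 2.0 * x + 0.5 * control + rng.normal(size=n)

def fit(y, cols):
    """Return the coefficient on the first regressor in an OLS fit."""
    X = np.column_stack([np.ones(len(y))] + cols)
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

baseline = fit(y, [x, control])

# Probe 1: does the result survive dropping the control?
no_control = fit(y, [x])

# Probe 2: leave out each fifth of the sample in turn and re-estimate.
chunks = np.array_split(np.arange(n), 5)
leave_out = [fit(np.delete(y, ch), [np.delete(x, ch), np.delete(control, ch)])
             for ch in chunks]
spread = max(leave_out) - min(leave_out)

print(round(baseline, 2), round(no_control, 2), round(spread, 2))
```

In this simulated case the coefficient barely moves under either perturbation, which is what a robust result looks like; a finding that swings wildly across such checks would warrant the kind of skepticism described above.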

Robert Wiblin: So, is it possible to generalize about what fraction of the time the basic findings hold up and what fraction of the time they’re not convincing to you? And maybe at what point do they fall by the wayside? Is it when you’re fiddling with specific analytical choices, or something else?

David Roodman: I have found, at least in my work in the last few years for GiveWell and Open Philanthropy, that it’s been about 50% of the time that I reconstruct a study and end up still believing it. And it’s a pretty small sample. I have a tentative hypothesis that research is less reliable when it comes from a young researcher who is under very intense incentives to get that statistically significant result, which gets you into a good journal, which helps you on the tenure track. That’s a tentative theory.

Usually, the problems do come up after I have successfully replicated the basic result and then start to probe. I’m thinking of cases in my head, and each one is kind of different. So I can give an example if you’d like, or we can also ponder the ad hoc nature of most of my work, which I also worry a little bit about.

Robert Wiblin: Let’s do both of those.

David Roodman: Okay. One example. I did a review a couple of years ago here at Open Philanthropy on the impacts of incarceration on crime. Does putting more people in prison reduce crime? Or maybe does it even increase it? We have an active grant-making program in criminal justice, so this is a kind of due diligence in parallel with the actual grant-making program.

And I found 20 or 30 studies that looked at different aspects of this question, most taking place in the United States. Of those, I focused on ones that were relatively recent, situated in the United States, and tied to the prison boom, the mass incarceration boom that we’ve seen in the last few decades. [inaudible 00:22:36] from studies of sentences of two days. That was less relevant.

And then among those, I was able to obtain or reconstruct data and code for about eight of the studies. In about seven cases I came to some significant critique, and about four actually kind of flipped my interpretation.

One that I think is relatively easy to explain looked at the impact of a law that was passed in the early ’90s in California called Three Strikes. Three strikes and you’re out. It was a repeat-offender law with escalating penalties. Your first felony would get a normal sentence. If you then committed another felony, the sentence would be doubled. And a third one would get you 25 years to life. Very draconian. The felony had to be of a certain seriousness.

But you had people committing what seemed like relatively minor crimes, you know, drug dealing, what have you, that met a certain threshold who were then in prison for at least 20 years. The study was by Alex Tabarrok, who’s at the Marginal Revolution Blog, and Eric Helland, and they looked at whether people who had two strikes and therefore were right at the edge, if they committed another felony, of getting 25 to life, committed less crime than people who had just one strike.

And they did it in a smart way in order to improve the quality of that experiment, the comparability of the two groups. They said, “Let’s look at people who have … let’s only look at people who have been charged with two offenses in sequence that could be considered strike-able offenses, that add to your strike count. And then let’s look at people who actually got two strikes but also look at people who, on one of those trials, the judge ultimately convicted them of a lesser crime that didn’t add to their strike count.” So you had people who had two strikes and people who almost had two strikes, but maybe got lucky.

Robert Wiblin: But very similar people, hopefully.

David Roodman: Right. That’s the idea. And they do a number of checks to see whether these two groups are fairly comparable, and then they look at the difference in the recidivism rate. You know, how quickly do these people get re-arrested once they get out of prison? And they find some deterrence. People who had two strikes and were facing 25 to life got re-arrested about 10% less per unit time.

And so they shared the data with me on request, and rather by accident, I discovered what I thought was a problem that changed my reading. I wanted to do a cost-benefit analysis in my paper, as I did, which required splitting out this impact by crime type. How much did murder, or arrests for murder, go down? How much did violent crime go down, and so on? Because those have very different ramifications when you’re doing a cost-benefit analysis.

And I discovered that the impact, or the seeming impact, was entirely confined to drug crimes. And so one thing I thought was, “It’s debatable what exactly the social cost of a drug crime is.” Maybe it’s costly for you as a consumer, but one can argue about whether and how to factor that into a social-level cost-benefit analysis.

The other thing I found was that when I reconstructed one of the tables that just looked at whether the two comparison groups were statistically similar, whether they had the same age, the same number of prior offenses, my table didn’t quite match the published one. I actually found that the groups were systematically different in the number of prior offenses. And it turned out that people who had more priors got re-arrested more after they were convicted of that second crime and were released.

So, it seemed like there may have been a failure of experimental design. In fact, the groups were not comparable, and so there was just a continuity over time. People who got arrested more before got arrested more after. And so that really reduced my faith in the power of draconian sentences to deter crime.
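The kind of balance check that surfaced this problem can be illustrated with a toy version: compare the two groups on a pre-treatment covariate, such as the number of prior offenses, using a two-sample t statistic. The data below are simulated to exhibit an imbalance; none of the numbers come from the actual study being discussed.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated pre-treatment covariate: counts of prior offenses. The
# "two strikes" group is deliberately constructed with more priors,
# mimicking a failed balance check.
priors_two_strikes = rng.poisson(4.0, size=400)
priors_one_strike = rng.poisson(3.0, size=400)

def welch_t(a, b):
    """Welch's two-sample t statistic (allows unequal variances)."""
    va = a.var(ddof=1) / len(a)
    vb = b.var(ddof=1) / len(b)
    return (a.mean() - b.mean()) / np.sqrt(va + vb)

t = welch_t(priors_two_strikes, priors_one_strike)
print(round(t, 1))
```

A |t| well above 2 on a variable measured before treatment suggests the groups were not comparable to begin with, which is exactly the kind of failure that undermines a quasi-experimental comparison like this one.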

Robert Wiblin: Yeah, I know Alex. How did the authors of that paper react to what you were saying?

David Roodman: Alex has been, I would say magnanimous. And I think he may be unconvinced. I think his reaction was … his first reaction was, “Well we never claimed that this is a perfect experiment.” Which is absolutely true. You know, we just did a number of checks. But I don’t think I shook his faith. He did blog my report publicly and he just said, “I could argue with this, but let me put that aside and just welcome the larger project that this represents.”

Robert Wiblin: Something that affects my interpretation is that I know Alex is not a law and order kind of guy. I think, if anything, he’s in favor of much shorter sentencing. So the fact that he found evidence in … that would point in favor of having longer sentences … I don’t think he would have been biased in favor of that for political reasons.

David Roodman: Yeah, that sounds right. I think of him as a libertarian, I don’t know if that’s right.

Robert Wiblin: Yeah, I think he would identify as reasonably libertarian. Interesting.

Okay. So you looked at … you tried to estimate the impact of prisons through deterrence, incapacitation while people are in prison, and the effects that prison had on people after they left prison, and their likelihood of committing further crimes. What was the cleverest thing you think you did when you were trying to estimate these three different things?

David Roodman: Well, I think where my energy went was into reconstructing the eight studies that I was able to get data and code for. And each one had a different story. There was one that was just flat-out obviously true, which was such a clean, simple one. It was the study of a mass release in Italy, which showed that crime jumped the next month in certain categories. Just incontrovertible. But in other ones I found a lot of problems.

In some cases it didn’t actually change my answer once I tried to deal as well as I could with those problems, but one interesting case that did actually lead me to read the study differently had to do with two studies that use the same data from the Georgia prison system. The American state.

The studies looked at the decisions that parole boards make. So the way things work, and I think still work, in Georgia, is that at trial, the judge gives you a certain sentence. Let’s say three years. You may not actually serve a full three years. A parole board, depending on various factors, may let you out early, in which case you serve out the rest of your sentence on parole.

When you’re on parole, it’s limited freedom. If you look at the history of the idea of parole, even the word, I think it goes back to the idea that you’re leaving on your word of honor that you’re going to be well-behaved. And if you cross any lines, if you fail to show up for an appointment with your parole officer, or you fail a drug test that might be required, or you get arrested, you can be yanked right back to prison without trial because it’s all conditional.

And this study, kind of like the three strikes study I told you about before, made a clever comparison. Not an actual experiment, but a clever comparison that was as close as possible to an experiment, in order to look at whether people who are let out of prison sooner subsequently committed more crime or less crime. And it found that actually there was a big effect: each month that your actual time served was shortened increased the likelihood that you would return to prison within the next three years by something like three or four percentage points. That’s a pretty big effect for one month. You multiply it, say, by 12 months-

Robert Wiblin: Hold on. You’re saying if you spend less time in prison, you have a higher chance of going back to prison?

David Roodman: Exactly. Right. So this was seeming to say that keeping people in prison longer is reducing crime. Yes. And I should say that that was an uncomfortable conclusion for Open Phil, so possibly that motivated me to dig into it more. Although, as I say, I think my bigger bias is that I’m just a contrarian in general. And the study just seemed very clever.

But then I realized that there was actually another story to explain the result, which I call parole bias. And I’ll try to see if I can make this clear. If you imagine two people who are identical, they committed the same crime, they have the same sentence. Let’s say it’s three years. One is required by the parole board to serve his full sentence. The other is let out a year early, okay? And then you ask, you look at whether they got re-arrested within the three years after their release.

The person who got let out early is going to spend the first of those three years in the follow-up period on parole, which, as I just explained, is a period of very heightened probability for re-arrest. Okay?

Robert Wiblin: Oh, because you could easily get re-arrested, right?

David Roodman: The smallest infraction, yeah. Actually, re-arrest is not the correct term in this case. What she measured was return to prison.

Robert Wiblin: Re-incarceration.

David Roodman: Yeah. That’s important, actually. And so if you just think about that, what that’s saying is that under the system, if the parole board lets you out earlier, more of your three-year follow-up is going to be in this period of exposure to very easy return to prison. And that itself could make it look like being in prison less leads to higher recidivism. Which is a point that I don’t think had been made before, and it was the source of an alternative theory to explain the results: to explain why it seemed like less prison time was actually increasing crime.
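The parole-bias mechanism can be sketched with a small simulation, using made-up hazard rates (the probabilities below are purely hypothetical, not taken from the study): two groups are identical in their monthly chance of committing a new crime, but the early-release group spends its first year of follow-up on parole, with an extra monthly chance of a technical violation sending them back to prison.

```python
import random

random.seed(0)

def three_year_return_rate(months_on_parole, n=50_000,
                           p_crime=0.01, p_violation=0.02):
    """Share of released people returned to prison within 36 months.

    p_crime: monthly chance of a new crime (identical for everyone);
    p_violation: extra monthly chance of a technical parole violation
    while still on parole. All rates are hypothetical.
    """
    returned = 0
    for _ in range(n):
        for month in range(36):
            p = p_crime + (p_violation if month < months_on_parole else 0)
            if random.random() < p:
                returned += 1
                break
    return returned / n

full_term = three_year_return_rate(months_on_parole=0)   # served full sentence
early_out = three_year_return_rate(months_on_parole=12)  # paroled a year early
print(full_term, early_out)  # early_out comes out higher despite identical criminality
```

Even with identical underlying criminality, the early-release group shows a higher measured return-to-prison rate, which is the alternative story for the headline result.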

Robert Wiblin: Surprising that the researchers didn’t think of this possibility. To me, anyway.

David Roodman: Yeah. I can’t speak to that. I did … the author here is Ilyana Kuziemko, who was pretty helpful in helping me get the data and code, and we did have some back and forth on this.

Robert Wiblin: And did you manage to confirm whether that was the actual explanation? Or is it that you now have two competing explanations?

David Roodman: I did, as well as I could, test whether that could explain it. It wasn’t an easy thing to parse out. So I couldn’t conclusively decide either way, and it gets pretty in the weeds pretty fast if I say more, but I did some initial variants of the main regression where I tried to deal with its effect. I may have overcorrected for it. But the results were consistent with this being the cause of the headline findings.

Robert Wiblin: Any other clever tricks that you want to discuss from that?

David Roodman: Oh, gosh. There’s another study of deterrence. We talked about one, three strikes. This study was by David Abrams. It looked at the effects of state legislatures passing gun add-on laws. So the idea is, if you commit a burglary without a gun, the sentence is X. If it involved a gun, it’s X plus Y. So he looked at whether in the months or years following the adoption of such a law, crimes involving guns suddenly dropped or not. And he looked across many states at once.

I wouldn’t … I don’t know if it was particularly clever, but I was able to rethink the study and, I think, add value by going back to the underlying data source. The crime data all come from the FBI. And you can go to the FBI website and download, you know, the number of gun robberies or whatever in each state, in each year, pretty easily. But the raw data are actually supplied by what are called law enforcement agencies, LEAs, which could be the New York City Police Department, or a much smaller [inaudible 01:22:23].

And they report monthly. And it’s a system of reporting that goes back at least to the ’60s, or probably far longer. And the data are messy, but it means that you can get not only much higher resolution geographically, but much sharper information on timing. Month, to month, to month, to month.

And that matters, because the law may have … a new law may have passed in March, and you can see whether there was a change in crime rates in April, or six months later, rather than just looking year to year, which is fuzzier.

So with a lot of effort, I was able to download all the FBI monthly raw data and then pull it together. It’s messy data, so that took a big effort. And then there were a lot of missing or bad data points, so I had to develop some algorithms for identifying bad or missing data, and use a pretty fancy technique called multiple imputation to make guesses about what data would go there in a way that would not introduce false certainty by making unknown numbers look known.
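A toy illustration of the multiple imputation idea (this is not Roodman’s actual algorithm; the counts and the simple normal-draw imputation model are invented): each gap is filled several times with draws that reflect its uncertainty, an estimate is computed on each completed dataset, and the spread across imputations keeps the missing months from looking like known numbers.

```python
import random
import statistics

random.seed(1)

# Hypothetical monthly crime counts for one agency; None = missing report.
counts = [40, 42, None, 38, 45, None, 41, 39, 44, None, 43, 40]
observed = [c for c in counts if c is not None]

M = 20  # number of imputed datasets
estimates = []
for _ in range(M):
    # Fill each gap with a draw around the observed mean, with noise at
    # the observed spread (a deliberately crude imputation model).
    completed = [c if c is not None
                 else random.gauss(statistics.mean(observed),
                                   statistics.stdev(observed))
                 for c in counts]
    estimates.append(statistics.mean(completed))  # per-dataset estimate

# Pool across imputations: the point estimate is the mean of the estimates,
# and the variance across imputations feeds into the total uncertainty,
# so imputed values don't masquerade as certain.
pooled = statistics.mean(estimates)
between_var = statistics.variance(estimates)
print(pooled, between_var)
```

The nonzero between-imputation variance is the whole point: a single-fill approach would report the same point estimate but pretend the missing months were known exactly.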

It was a big number-crunching monster, but the end result was graphs that had monthly resolution rather than annual resolution, and the seeming drop that was coincident with the adoption of laws kind of disappeared, and it just looked like a very smooth decline.

Robert Wiblin: How much time do you spend just trying to get data and code that you need to try to do a replication?

David Roodman: Usually, the coding will take longer than the data. Especially if it’s computation-intensive. That slows things down. But I’m trying different things, and, you know. It’s a coding project, you have to develop algorithms and you run things a bunch of ways. But that was a big part of … I mean, the incarceration and crime report probably took a full-time equivalent year, and the majority of that was the reconstruction of these eight studies.

Robert Wiblin: How often, when you just email the authors, are they like, “Well, here’s a spreadsheet”?

David Roodman: That is the exception, unfortunately. More often I get the data because, either the primary data is in the public domain, or the publishing journal required that it be posted somewhere.

Robert Wiblin: Is it getting better?

David Roodman: More journals are requiring data sharing. And I think maybe younger researchers, some of them, are more apt to share the data. It’s tricky, though, because they can understand the principle, but they’re also concerned about getting tenure. And if I’m asking to reconstruct their data set, I’m a risk to them. And that’s a pretty important thing to them.

Robert Wiblin: Do they see any upside from you approving what they wrote?

David Roodman: Yes, but I don’t think it justifies the downside.

Robert Wiblin: Yeah.

David Roodman: So, for example, I’ve had someone say, “I will happily share the data and code with you after I get published.” And I just have to accept that. And that’s not ideal from the point of view of getting to the best knowledge as quickly as possible, but it’s also understandable.

Robert Wiblin: It’s the system we have. So, what did you end up concluding overall on the question of whether longer sentences, or letting people out of prison earlier, would raise crime?

David Roodman: Right, well, as you said, there are the before, during, and after effects on crime. The before effect is deterrence. Does having tougher sentences cause people not to commit crime? And I’ve just mentioned two studies of deterrence to you, one on the gun laws, one on three strikes in California. Both of those I ended up not believing, and I came to the conclusion that there really isn’t much deterrence at the kinds of margins that we’re talking about here in policy discussions. Obviously if there were no criminal justice system, there would be more crime. I think that’s pretty clear. But at the margin we’re at, I think it’s safe to say that deterrence is essentially zero.

Then there’s the during effect. Does crime fall significantly when you imprison more people? And the answer is yes. There are several cases where … like here in California, even. We’ve had two criminal justice reforms, which went into effect in 2011 and 2014. And certain acquisitive crimes like motor vehicle theft have gone up in a way that seems pretty clearly connected with those reforms. Much less so with violent crime, though.

Then there are the aftereffects, and this is where it gets most complicated. Being in prison, you could imagine, could reduce the amount of crime you commit afterwards. Maybe there’s job training, or you learn to read, or you’re helped off of your drug problem, or you’re scared straight by the experience. But it’s also easy to imagine that being in prison just makes things worse in the long run. That you’re more alienated from society, that you become closer friends with other criminals and learn techniques from them. You have less ability to get a real job because you’re marked as a felon.

So that can swing either way, and it could dominate the overall answer. The majority of the studies that I looked at that are set in the modern American context, putting extra weight on the ones that I could actually replicate and came out believing, say that the aftereffects are harmful. So yes, you get a short-term benefit when you put more people in prison, reducing crime. But in the long run that seems to backfire, actually increasing crime when people get out.

So as a very rough estimate, I would say, we’re at a margin in the United States today where … by the way, we have huge numbers of people in prison. You know. Per population, we’re the highest in the world except possibly North Korea. We’re at a margin where incarceration is not affecting crime. The marginal effect is zero.

And then we did a really interesting thing. This was prodded in part by Holden and his concerns that I was biased in the direction that was comfortable for us. He had me come up with a devil’s advocate reading, and then we also did a cost-benefit analysis using both my favored reading of the evidence and the devil’s advocate. The devil’s advocate says, “Actually, there is deterrence and actually the aftereffects usually are beneficial.” So, putting people in prison longer actually reduces crime.

And the cost-benefit analysis is a whole world of its own, you know? What’s the dollar value of a rape? There are different methods that people have come up with to answer that, and there are some answers out there that are usually used, and of course they’re highly debatable. But I found that if we take the devil’s advocate reading, which is to say that decarceration will increase crime, and we use the highest valuations on crime, which, again, is in favor of the devil’s advocate, so we really put a lot of dollar value on the cost of that extra crime. Even then, it came out about break-even.

Well, if we’re going to talk about incarceration, then I would say each person-year of lost liberty had a cost of about $92,000. That’s dominated by valuing a year of liberty at $50,000, plus the cost of prison itself. And it was averting about $92,000 in crime. The numbers came out exactly the same, which should not be taken seriously; that’s false precision. So even with the least-favorable cost-benefit valuation of the least-favorable reading of the evidence, it would break even.
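The break-even arithmetic, using only the figures quoted here (note the roughly $42,000 direct prison cost is just the residual implied by the two stated numbers, not a figure taken from the report):

```python
liberty_value = 50_000  # stated value of a year of lost liberty
total_cost = 92_000     # stated total cost per person-year of incarceration
prison_cost = total_cost - liberty_value  # implied direct prison cost

crime_averted = 92_000  # crime averted per person-year, devil's-advocate reading
benefit_cost_ratio = crime_averted / total_cost

print(prison_cost)         # 42000
print(benefit_cost_ratio)  # 1.0, i.e. exactly break-even
```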

Robert Wiblin: I’m surprised you didn’t talk more about crime in prison, which I feel would really push things in the direction that it sounded like you wanted to go. Because, I mean, prisons are just hotbeds of crime.

David Roodman: That’s an excellent point. It absolutely does bear mentioning, and I should have mentioned it. Yes, if you count the crime that goes on in prison, that would presumably shift the calculation a lot. Putting somebody behind bars may therefore, almost on the surface, increase crime.

I’ve de-emphasized it for two reasons. One is, there’s not much data on it. So I just didn’t know what to do with it. The other, I think, is my intuition that the audience I most want to persuade may not care. You know?

Robert Wiblin: They view it as part of the punishment, perhaps?

David Roodman: Right. Yeah, that’s part of it: you know, if you didn’t want to deal with that, you shouldn’t have gotten yourself in trouble. And somehow I felt that the more effective argument for reaching across to skeptics is to de-emphasize that. But you’re right. It absolutely bears emphasis.

Robert Wiblin: How did people take this conclusion, given that it’s a potentially politically-charged issue? Were people persuaded?

David Roodman: I think internally it was accepted. There was no controversy here because it happened to be compatible with what we’re doing. I think we’ve talked to some activist groups on criminal justice reform who have been excited about the findings and want to figure out how best to communicate it and use it when they’re talking to legislators. Haven’t gotten a lot of pushback from skeptics, so either they didn’t bother reading it, or they thought it was okay. Probably they didn’t bother reading it.

Robert Wiblin: What character traits do you think are most important in a researcher?

David Roodman: You know, there are different kinds of research, and research, like any other field, benefits from diversity. We shouldn’t all be the same way and optimize on the same traits. So, I can reflect on what has made me useful in my way, in my distinctive way. But I wouldn’t suggest that that’s what everybody should aspire to. It’s more about figuring out who you are and how you can contribute.

I feel a strong desire to get to the bottom of things in order to reduce the chance that I’m wrong. I have aptitude with quantitative things and coding, and those are all very useful. I’m interested in hearing and synthesizing different views, whether about methodology or much broader questions. So, I have sort of a pluralist instinct in that way.

But there are lots of great researchers who contribute by being less interested in what other people do and just pursuing their own genius with aggression.

Robert Wiblin: So, a couple of years ago you worked at the Gates Foundation, and then you moved to the kind of GiveWell/Open Phil cluster that you’re helping now. How do you find that the two compare, given that the Gates Foundation, I guess, has almost 60 times as many staff?

David Roodman: Well, maybe I should first explain how I ended up at Gates, because people may be interested in career moves. I was at the Center for Global Development for 11 years. When I joined there, it was a similar experience to earlier in my life. I knew I had some interests and some aptitude but really wasn’t sure how I would be useful. And Nancy Birdsall, the president, was kind enough to hire me and find ways to use me over time.

And it was there that I first discovered this interest in replication. But I think one thing that I lacked was that I’d never worked in a decision-making organization. Not an aid agency, or any other part of the government. Not a philanthropy, not a business.

And I think if you’re interested in policy-relevant work, like say, working at a think tank, it may be very productive to move back and forth between the think tank and a more practical setting. Because when you’re in the practical setting, you don’t have as much time to think but you encounter lots of questions that you wish somebody was figuring out. And then when you have some space to actually think and research, then you have the inspiration that comes from that really practical experience.

And I don’t think I had that. And so after 10, 11 years, I’d finished up my work on microfinance, and was having a lot of difficulty motivating myself around a new topic. I felt like I should be able to figure out what’s a valuable way to deploy my time, and I was struggling. And at some point I realized it was time for me to go. I was no longer growing.

And I decided I should work in a more decision-making institution. I went through a job search process and ended up at the Gates Foundation office in DC, where I lasted six months. I’m not ashamed to say that I was fired. And I probably shouldn’t go too much into what happened; it wasn’t like there was some dramatic story. But it clearly wasn’t a good fit. It’s a very big place. It’s like 1,200 employees, I think? Last I heard. Plus a lot of contractors. And it’s giving away, I don’t know, four billion dollars a year? Something like that. Whereas I think Open Philanthropy might be up to $100 or maybe even $200 million.

Robert Wiblin: I think it’s about $200 million.

David Roodman: So it was giving away about 20 times as much, but with far more than … I don’t know, what would it be … far more than 20 times the staff.

Robert Wiblin: 60 times, or something.

David Roodman: This is a very lean place here. So that is a large organization with hierarchy and various teams, and we talked about politics earlier. To do well there, you have to know how to work well in a very complex social structure. I don’t think I ever really learned that. I had always lived in small organizations. And part of that is about understanding that speech can be about getting to the truth, when you’re talking about some substantive topic.

But it can also have political implications. Maybe a disagreement won’t just be taken as a …

Robert Wiblin: Factual issue?

David Roodman: Factual issue. But it can be felt in another way. And I think I just wasn’t thriving there in the way that I needed to.

Robert Wiblin: Have you since kind of tried to learn those skills? Or are you just trying to find organizations where it doesn’t matter so much?

David Roodman: That’s a great question. I think mostly I have failed to improve in that way.

Robert Wiblin: [crosstalk 00:31:29] Do you think maybe that’s a virtue in a lot of cases? That if you learn how to do politics, then it would infect your research approach?

David Roodman: I think it is a virtue in my case, but it may be a luxury that I don’t have to think about it. You know, a broad thought that I have, having met many impressive people in DC over the years, is that people’s strengths are also their great weaknesses. They’re often the same thing. It’s just that whatever’s most distinctive about you is really useful in some contexts and really a problem in others. And we’re fortunate, people like you and me, and people listening, to have a lot of autonomy in life; we’re not all just bound to be rice farmers.

And so we’re fortunate enough if [I 00:32:12] try to find our place in life, where what is distinctive about us is more often a strength than a weakness. So, I feel like I’ve had the luxury to not learn to be a very good politician, and that’s working out for me because I’ve managed to find places where it doesn’t matter as much, or it actually would be a detriment, even.

Robert Wiblin: That’s a question that’s come up at 80,000 Hours: how much to try improving on your weaknesses versus just moving somewhere that your weaknesses don’t matter, or even look like strengths. And I think the research we’ve read suggests that people can change their character somewhat, although it’s quite a gradual process. So they can, over a decade, get rid of their weaknesses with quite a lot of effort. But it’s not easy, and you can’t do that on too many things at a time. Whereas you can potentially move quite fast to a place where, you know, having low conscientiousness, maybe, or being a bit too outspoken are not such big problems, and your strengths can shine through. Do you have a view on that? It sounds like you’re in favor of moving rather than changing.

David Roodman: Oh, I don’t know. I mean, what you just described sounds very plausible to me. And I hear it mostly as a potential source for self-criticism. In other words, I haven’t tried to improve myself that way. Maybe I could have, and maybe my reluctance to try to improve is a fault. It maybe reflects my stubbornness, right? My resistance to feedback from others. So what is implied by what you’re saying, I think, is that one should do both. Lifelong growth is a great thing, and especially as one gets older, there’s a strong temptation not to push yourself to change. It seems good to resist that. But you also have to approach yourself with humility and recognize you can only push yourself to change so fast, and in the meantime, if there are cheaper ways of finding ways to [inaudible 00:34:01] being happy, you know, go for it.

Robert Wiblin: Alright, let’s talk a bit more about your research approach. So, before you’ve even chosen what question to research, I mean, how do you figure that out? Do you often kind of turn down projects because you don’t think you’ll be able to make a good go of it?

David Roodman: It’s happened occasionally that I’ve turned something down because I just didn’t feel like I could contribute much. But what I really like about being at GiveWell and Open Phil is that people come to me with questions that have practical relevance for decisions that are being made, or, you know, conceivably could be reversed.

Robert Wiblin: Right.

David Roodman: And that in itself is inspiration. I talked to you about how I’ve lacked the kind of inspiration that comes from practical experience, whereas here I get it. A topic like the impact of incarceration on crime looked really boring to me. But once I knew it actually mattered for things we were doing with real money, that was the motivation to get into it. And almost anything is interesting once you get into it.

And I should say, I don’t see myself as somebody who can only do research reviews. That’s what I’ve done here, out of a kind of comparative advantage-type argument. But at the moment actually I’ve pivoted to something new that is very raw in my mind, so I probably can’t speak about it very clearly, which is to participate in the internal discussions that we’re having about what we call cause prioritization. The philosophical issues that come up when you think about how much to put into animal welfare, versus taking care of people. And how much to worry about problems today versus the far future. And that’s not nearly as quantitative a set of questions, but I still feel that I can contribute in the same spirit of seeking out lots of different views and trying to synthesize and think critically.

Robert Wiblin: When you start a new project, can you kind of walk us through the process that you’ll go through? Do you start by trying to collect the data, or do you read a lot, broadly, about the topic to kind of situate yourself in it?

David Roodman: It’s a very organic and ad hoc search process, usually. So, typically when somebody comes to me with a new topic, they’ll say, “We read this.” Or, “We’ve got this paper we think you should look at.” And that’s enough to start exploring the network. You read that, it cites other sources. In some cases you Google authors’ names, or even talk to them. So, it’s not very structured, really. And I sometimes worry about that.

I think my most recent replication was of a paper on the impacts of the deworming campaign in the American South, about a hundred years ago. That one was done by an economist named Hoyt Bleakley. And he did a companion paper using similar methods and some of the same data, looking at the impact of malaria eradication.

And especially for the companion paper (I replicated both), I made an effort, for the first time, to pre-register: to say, “Here’s what I intend to do,” and then put that on a third-party website that could prove when I had submitted the document. And that was out of a sense that I need to start becoming more conscious of what I’m actually doing, because there is potential for biases to creep in if I don’t do that. But to date, it’s been pretty informal exploration most of the time.

Robert Wiblin: What kind of biases would you worry about?

David Roodman: Well, this came up … this was some good feedback I got from Holden, Holden Karnofsky, who’s the director at Open Phil and is my boss, when I was working on the impacts of incarceration on crime. He wanted to know, how did I choose which studies to really dig into and try to replicate? Because he worried about bias. And he was mostly worried about bias of the kind that would make my results comfortable. He wanted me to make sure that I was doing everything I could to make us uncomfortable, because that’s where the value of this work comes from.

And I realized I didn’t have a great answer for him, and that he was probably right. There were certain studies that I was more skeptical of because they came to a conclusion that would challenge Open Phil and its priorities, and so I was more apt to dig into those. Although, to be honest, I think a bigger bias I have is against anybody who claims to have a really statistically significant and large result, especially from non-experimental data. I’m just sort of a contrarian. I think I’m more of a contrarian than biased one way or another on a lot of these issues.

But I realized I didn’t have a completely good answer for him, so when we were revising the document, I made a second round of attempts to get data and code in a more systematic way.

Robert Wiblin: You mean to choose which papers to scrutinize more at random?

David Roodman: Yeah, well, you know, I had already replicated, I don’t know, six or eight of them, and I then wrote to the authors of other papers that were in a certain sampling frame that I described earlier. You know, set in the US, not too long ago, focusing on margins of punishment that are relevant for the mass incarceration debate. Not, you know, many years, not many days. That ended up not yielding any more studies, as it happens, but it was an education for me that I need to be better able to explain the path that I’m choosing.

Robert Wiblin: Do you start writing early, or do you kind of spend a lot of time playing with the data before you put pen to paper?

David Roodman: I would say I do not start writing early, though I think it can be a good discipline. I worked with a guy years ago named Alan Durning, who founded the Sightline Institute in Seattle, which actually I think we fund. And he always said, “The first thing you should do when you’re embarking on a long project that could take a year of writing is write the press release.” Probably once you get to the end of the project and you’re actually ready to launch it, you’ll completely trash that press release, but it can be really good for helping you focus on the bottom line. What are the key questions that you’re trying to get at?

Robert Wiblin: How do you know when to stop?

David Roodman: I think it’s synonymous, in a way, with the question of, “How do I make judgments?” Because once I reach a judgment, then I feel like it’s okay to stop. That doesn’t mean I can’t learn more, but it’s an important turning point. Actually, I’m thinking aloud here. I’m not even sure I believe what I just said, because in a lot of cases, you just have to make the best call you can with the data you have. You always have to do that. It may not be as much as you’d like.

I’m not sure. I think we all go through processes of trying to figure things out, and at some point we get to a sense … a point where we have a settled understanding. It may subsequently evolve, but it’s a mature thing that’s ready to be shared with the world.

Robert Wiblin: Has anyone written a guide to doing what you do, or is it somewhat distinctive?

David Roodman: I’m not aware of anything like that. We just had a very informal meeting here at Open Phil where I was asked to speak for a few minutes on what I look for when I’m reading research, and I struggle with it because a lot of what I end up doing with the study is very specific to that study. There are some principles, but I have not seen them written up in any way.

Robert Wiblin: Okay. So if it’s hard to generalize, we should dig into some specific analysis that you’ve done, I guess. Figure out what methods you used in each one. Let’s tackle the geomagnetic storms topic first, which I found particularly interesting. So, in 2015, and then again in 2017, you looked at the risk of geomagnetic storms messing with the electrical grid and, I guess, other electrical equipment. So, yeah, what approach did you take there? And feel free to go into as much detail as you want.

David Roodman: So, a lot of our interest at Open Phil is in existential risks, and there are many of them, as I’m sure many of the listeners already know. A few years ago, actually, when I was working as a consultant, before I became an employee, I was asked to dig into this one question of geomagnetic storms. What happens is, you know, and obviously I’m not a physicist. I’m a statistician more than anything else. So I don’t understand a lot of what I’m about to describe.

There are these big cataclysms on the Sun. And, they cause the ejection of coronal matter, which then gets hurled away from the Sun and might collide with the Earth. It is typically magnetically charged, that is, the particles are systematically magnetically-oriented, and so it’s like this little magnet coming and clobbering the Earth.

And smaller versions of this are what cause the Aurora Borealis. And, as with the Aurora Borealis, and I guess the southern one is called the Aurora Australis? You told me-

Robert Wiblin: I think that’s right.

David Roodman: The Earth’s magnetic field actually channels the material towards the poles, so it’s at high latitudes that you get the impact. And it’s a little bit like, you know, if you do a cannonball, or you drop something huge into the water: it creates a lot of turbulent disturbance. What happens is that the local magnetic field, especially at high latitudes, will start to oscillate in kind of random but high-amplitude waves.

Changing magnetic fields, in turn, induce electrical currents in any wires that happen to be nearby. One scenario that people have been worried about is that a really big storm could induce really large currents in long-distance power lines, which would fry the transformers that are at either end of these. These are what change the voltage.

A dam might produce power at, I don’t know, a hundred volts or whatever. And then that gets stepped up to a much higher voltage, like 765,000 volts or even a million volts, for long-distance transmission, because that reduces the energy loss. And then there’s a transformer at the receiving end which steps the voltage down again. The little boxes that you use to charge your phones and computers contain transformers; they’re converting the oscillating current in the wall into the direct current that your computers and phones need.
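To see why stepping up the voltage reduces the loss, here is a back-of-envelope sketch in Python; all the numbers are assumed for illustration and are not real grid figures:

```python
# Why long-distance lines run at very high voltage: for a fixed power
# P = V * I, raising the voltage lowers the current, and resistive
# loss in the line scales with the square of the current (I^2 * R).
P_WATTS = 100e6        # assumed plant output: 100 MW
R_LINE_OHMS = 10.0     # assumed total line resistance

for volts in (100_000.0, 765_000.0):
    current = P_WATTS / volts                 # amps flowing in the line
    loss_watts = current ** 2 * R_LINE_OHMS   # resistive loss in the line
    pct_lost = 100.0 * loss_watts / P_WATTS
    print(f"{int(volts):>7} V: {pct_lost:.2f}% of power lost in the line")
```

With these assumed numbers, the same 100 MW loses about 10% of its power at 100 kV but a small fraction of a percent at 765 kV, which is the whole point of the step-up transformer.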

But there are also transformers that are as big as houses, and these could get fried. These could just get destroyed in seconds, maybe. Or minutes. Which would then cause blackouts. And the worry is that this could happen over a very large, continental-scale area, and then we would have to replace hundreds of these giant transformers. In the meantime, there would be large-scale blackouts lasting for months. There are not a lot of spares around. These are custom-built, huge things, and they can take months, each, to manufacture. A long-term, large-area blackout could be an economic and humanitarian crisis. Because if you don’t have power, then maybe your pipelines shut down, maybe the hospitals don’t work, et cetera.

Robert Wiblin: Can’t move food around.

David Roodman: Can’t move food around. Yeah.

Robert Wiblin: Can’t store food.

David Roodman: Right. All these systems that depend on each other, and on power, could collapse. So it’s pretty scary. And there are a couple of authors in particular whose work on this possibility has been cited widely in the press and was getting attention.

I dug into it and did my best to understand the physics and the astronomy. And then gravitated to a statistical aspect of the question, because that’s where I could make the biggest contribution. What I looked at was what the history of these events allows us to say about the probability of more in the future.

So, there are different ways of measuring the magnitude of a geomagnetic storm, and depending on how you measure it, the data are available for 20 years or 50 years. One measure is, for complicated reasons, that when a storm hits, it actually reduces the strength of the Earth’s magnetic field at the equator. So you can collect, minute by minute, magnetic data at magnetic observatories. And that is done; there are actually four observatories that are used to construct this particular index, which is called the Disturbance Storm-Time index, the Dst index. And that thing gives you a number to represent geomagnetic disturbances.

We have that going back to 1957, and we can model it and ask, “Well, what is the probability of there being a storm in the next decade, of a certain magnitude or higher?” The example that everybody worries about was a big storm that hit in 1859. Now, of course, in 1859, there wasn’t much power infrastructure to worry about. Apparently, a few telegraph operators got electric shocks, and there were spectacular auroras quite far towards the equator in both hemispheres. People wonder what would happen if we had a storm that big again today.

I came away with a kind of paradoxical message. I think that the people whose work got the most attention were exaggerating the risk. If you just do the analysis, and I can explain the specifics, they were overestimating. Actually, I should qualify that. It’s not that they were exaggerating the risk, because the risk is unknown. But their extrapolations from history were not well done, and were overshooting.

On the other hand, there’s so much we don’t know. This is a pretty under-researched area. And this is an area where we actually can learn more. Economists often struggle to figure out whether inflation causes growth or the other way around, and it’s just sort of a thing that goes on forever. They can never figure it out.

But with money, we could figure out more about how these kinds of storms affect actual transformers. That’s an actual research program that just can be done. So there’s a real opportunity to learn more and reduce our uncertainty.

I came to be persuaded that this big storm in 1859, called the Carrington Event, was probably only at most twice as big as storms that have occurred say, since 1950. Which doesn’t sound that scary. The biggest event we’ve had since 1950 was 1989. March of 1989, there was a storm that caused a blackout in much of Quebec. It destroyed a couple transformers. But, within about 12 hours, the power was restored. It was hardly a catastrophic event.

So, 12 hours of power loss in one part of Canada. If we then double storm strength, should we expect a totally different level of impact? I would cautiously say that that seems unlikely.

Another confusion was that, because there’s a lot of turbulence … if you imagine looking at the ocean during a storm, you can see ordinary waves. But then if you look at any individual wave, you’ll see smaller ripples, and so on. It’s kind of fractal. I think that’s a good visual metaphor for what happens when one of these storms hits. There’s a huge amount of local spiking, and it’s very tempting to say, “Well, the largest spike in the magnetic field that we measured anywhere on the Earth is X. So now let’s assume that a storm could cause X everywhere at once.” And that’s not an appropriate extrapolation. It’s like imagining that every place is as high as Mount Everest. But that kind of fallacy was embedded in some of the scariest analysis.

My overall take was if we extrapolate from the historical record, which is short and shouldn’t be over-relied upon, that the chance of an event as big as the Carrington Event of 1859 recurring was about 0% to 4% per decade.

One caveat I would add … and as I say, that event in itself doesn’t sound so scary, because it was at most twice as big as events that civilization shrugged off very easily. I think the biggest caveat is that I learned about some research, just as I was finishing up, that looked at tree rings from trees in Japan, I think, and found that there was a very sharp jump … I think it was probably in an isotope of carbon, but I’m not 100% sure, in the tree rings between a couple of years in the 700s, going by the Western calendar, and again in the 900s. So more than a thousand years ago.

And the best explanation for those giant jumps is apparently extraterrestrial radiation, conceivably from another galaxy or another star, but probably more likely our own Sun. And this would imply a solar flare, I don’t know, ten or twenty times bigger than anything we’ve witnessed in modern history. But solar flares are not the same thing as geomagnetic storms. Solar flares are huge outputs of pure radiation. They may be associated with ejections of actual coronal matter, which is what we’re concerned about. But the association is not well understood, and in particular it’s not clear whether a solar flare ten times as big as anything we’ve witnessed in modernity would lead to geomagnetic storms ten times as big.

Robert Wiblin: I just want to understand the engineering aspect. Is it that the transmission lines are very long, so that they kind of pick up a lot of the magnetic change? And so the longer the cable, the worse it is?

David Roodman: That’s right. If you’re in a particular spot and the magnetic field around you is varying, that induces a voltage where you are. This is one of the principles that makes motors and generators work. We measure electric fields in volts per meter, or volts per kilometer. So if you have a wire that runs many, many kilometers and it’s immersed in this very strong electric field, then yes, that will multiply the effect and induce a larger current.
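The volts-per-kilometer point can be turned into a rough calculation; the field strength, line length, and resistance below are assumed for illustration, not measured storm values:

```python
# Rough sketch: a storm-time geoelectric field, integrated along a long
# transmission line, gives a total driving voltage; Ohm's law then gives
# the quasi-DC current flowing through the transformers at each end.
# All three numbers are assumptions for illustration only.
E_FIELD_V_PER_KM = 5.0        # assumed geoelectric field during a big storm
LINE_LENGTH_KM = 500.0        # assumed long-distance line
LOOP_RESISTANCE_OHMS = 15.0   # assumed total resistance of the circuit

induced_voltage = E_FIELD_V_PER_KM * LINE_LENGTH_KM    # volts end to end
gic_amps = induced_voltage / LOOP_RESISTANCE_OHMS      # induced current
print(induced_voltage, round(gic_amps, 1))
```

The longer the line, the larger the integrated voltage, which is why the very long, high-voltage lines are the ones at risk.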

Your question about the engineering, though, reminds me of another interesting theme in this. One voice in this discussion is some engineers at ABB, ASEA Brown Boveri, I think it is, which makes a lot of these big transformers and, I think because of mergers, is now sort of retroactively the maker of the majority of transformers in the United States. So engineers there have put out papers saying there’s really nothing to worry about. And we have to take what they say with a grain of salt, because they’re basically saying, “Our products are great.”

Robert Wiblin: Wouldn’t they want to say, “Our products will break. You’ll need spares.”

David Roodman: Yeah, you’d think that. But I guess it cuts both ways. I don’t know. I remember testing that idea. I don’t know, I think maybe if they say that-

Robert Wiblin: Everyone’s transformers are bad.

David Roodman: Yeah. If you admit your products are bad, then obviously that may push people to go to the competitor. [inaudible 00:51:20] competitor keeps its lips tight.

So anyway, we need to take what they say with a grain of salt, especially because they’re using models that they won’t share. So they make claims that, “We’ve done simulations and everything’s fine.” But they won’t really let anybody else check that.

Nevertheless, I found there was an interesting argument in what they said, that I couldn’t dismiss on principle. Electrical power grids have all sorts of components that are designed to regulate the waveform of the power, the exact frequency, keep the waveforms from different generators in sync. All sorts of machinery to keep this very complicated thing working just right. It’s really extraordinary precision over large areas. Or, if they can’t do that, to shut the system down.

So what they’re saying is that if there’s a big geomagnetic storm, it has two main effects. One is that it almost instantaneously starts disrupting the flow of current, the waveform of the current. The other is that, over the scale of, say, 20 minutes, it will start to pour energy into transformers and heat them up, causing damage. But those are two different time scales: milliseconds and minutes. And they’re saying that there’s a lot of safety equipment in place that will automatically shut the grid down if things get too disrupted.

And so the result could be a very large and quick blackout. But it actually protects the system. So, with short-term fragility comes long-term resilience. So what we may actually have to worry about more is not the really massive events, but smaller events that damage transformers without disrupting the power flows enough to trigger the safety mechanisms. And there’s some actual evidence that that’s happened, for example, in South Africa, which is closer to the equator and therefore doesn’t experience the storms as strongly.

So if there were a big storm, maybe the longest-term damage would actually be in places that are closer to the equator and less affected by it. And what would happen is that over, say, the next year or so, a lot of their transformers would fail. Which, to me, doesn’t sound like an existential risk.

Robert Wiblin: So is it the entire transformer that gets broken? Or is it just some piece that we can stockpile and then replace later on?

David Roodman: Well, now you’re pushing past what I really know that well. My intuition is the damage can be pretty severe.

Robert Wiblin: Is it an explosion? Is it getting hot and breaking?

David Roodman: Apparently, there have been explosions. Yeah. You know, the key components of a transformer are a magnetic core and then lots and lots of wire wrapped around it. And then it’s immersed, usually in oil, for cooling purposes. So when these things overheat … I suppose the cores, the actual magnetic cores, are not that harmed. But the wires’ insulation could get burned, the wires could melt together, the oil can catch fire or absorb impurities. Sounds like a pretty big repair job.

Robert Wiblin: So what was the biggest challenge there with this research project?

David Roodman: The interdisciplinary nature of it. I studied a bit about electronics when I was a kid, so I had some background and some understanding of how electricity and magnetism are connected. But I was certainly out of my depth going beyond those basics, or trying to understand solar physics, or what have you.

And so it was trying to understand enough of the literature that I could make the kinds of summary statements I’m making to you. Or even engage in conversations with the guys who actually knew this stuff. Which I did. Even for that, you need a certain level of understanding.

Robert Wiblin: Did you have to learn or maybe even invent some new methods to reach a good conclusion?

David Roodman: No, I didn’t invent anything new, although it was another case of my wanting to implement a particular method for the statistical extrapolation that wasn’t easily done in Stata, which is the statistics software that I know. So I ended up writing a program, going beyond just what I needed to write a general-purpose program that is now shared, and ultimately getting very immersed in one particular technical question about how to construct confidence intervals, which led to an obscure and separate academic paper.

Robert Wiblin: So, I thought you’d say that the biggest challenge was that there just aren’t many historical events so it’s hard to know what’s the likelihood of an extreme event going forward when we just haven’t … we’re trying to predict the frequency of something that’s never happened, or at least not the last few hundred years. How did you get around that issue?

David Roodman: I would say that I did not get around it. I did the best with the modern observatory data that’s, you know, observed every hour and provides the basis for good statistical analysis. And then I zoomed out and acknowledged that there are longer-term dynamics which remind us that there’s a lot that we don’t understand and things could change more than the brief historical record that we have … would suggest.

You know, the question ultimately is, “What should we do?” That’s the important question. We don’t have to have complete understanding of the underlying … the physical reality in order to come up with a good answer for that question. My answer was that we shouldn’t panic, that there is some exaggeration here, but despite my more reassuring conclusions based on limited data, we can’t rule out some serious tail risk. And there’s a real opportunity here to improve our knowledge.

Robert Wiblin: Would it be expensive?

David Roodman: Compared to the stakes, no. Whether Open Phil would consider it to be expensive, maybe. It’s not cheap. What you really want to do, to understand better how transformers are affected by these storms, is take an actual, full-size transformer supplying, shall we say, a small city, and then inundate it with these additional large and volatile currents. But there aren’t a lot of spare small cities around, you know. It could easily, I would imagine, run into tens of millions of dollars to run realistic field experiments.

Now, if you’re concerned about the fate of the global economy, that’s nothing. The question [crosstalk 00:57:03] might be daunting even for Open Philanthropy.

Robert Wiblin: But who’s going to pay for it?

So you mentioned, kind of, the fat tail-ness of the distribution. I guess we have a reasonable sense of the frequency of common, probably small geomagnetic storms. Can we then kind of extrapolate? Just say, “Well, it’s not going to be a normal distribution but it’ll be a power law or something like that,” and from that we can figure out the frequency of something that’s never happened before?

David Roodman: Yeah, that’s an idea that I develop in my report. Probably a lot of your listeners are familiar with at least the rough idea of the central limit theorem in statistics. This is a really key result. It says that, for example, if you were to conduct the same presidential poll at the same moment in time, maybe a thousand times, you would get a slightly different answer on each run of the poll, but your answers would cluster around the true value, and they would do so in a pattern that follows a bell curve. That’s also called the normal curve.

And that’s true regardless of the actual underlying distribution of views in the world. In almost every case we can imagine, you get a bell curve when you repeatedly sample. And that’s a really powerful result, because it means you can start to construct confidence intervals while remaining ignorant of the underlying distributions of the things you’re studying.
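The repeated-polling picture can be simulated directly; the true share and poll sizes below are invented:

```python
import random
import statistics

random.seed(0)

TRUE_SHARE = 0.52   # hypothetical true fraction favoring a candidate
POLL_SIZE = 1000    # respondents per poll
N_POLLS = 2000      # how many times we "re-run" the same poll

# Each poll estimates the true share from its own random sample.
estimates = [
    sum(random.random() < TRUE_SHARE for _ in range(POLL_SIZE)) / POLL_SIZE
    for _ in range(N_POLLS)
]

# Per the central limit theorem, the estimates cluster around the true
# value in a bell curve whose spread matches sqrt(p * (1 - p) / n),
# regardless of the underlying distribution being sampled.
mean_est = statistics.mean(estimates)
spread = statistics.stdev(estimates)
theoretical_se = (TRUE_SHARE * (1 - TRUE_SHARE) / POLL_SIZE) ** 0.5
print(round(mean_est, 3), round(spread, 4), round(theoretical_se, 4))
```

A histogram of `estimates` would look like the familiar bell, which is what lets pollsters quote margins of error without knowing anything about how opinions are actually distributed.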

We can do something similar when we’re looking at extreme events. We don’t know what the true statistical distribution of geomagnetic storms is. Some people have argued that it’s kind of a power law, or something else. It turns out that, when you look at the tail of a distribution, the way it’s sort of gradually coming down to zero and flattening out, most tails are the same. That is to say, they fall within a single family of distributions, called the generalized Pareto family. They vary in whether they actually hit zero or not, and how fast they decay towards zero. But they kind of look the same, regardless of what the rest of the distribution looks like.

So what you can do is take a data set like all geomagnetic disturbances since 1957, and then look at the [inaudible 00:59:09] say, 300 biggest ones. What’s the right tail of the distribution? Then ask which member of the generalized Pareto family fits that data best. And once you’ve got a curve that you know, for theoretical reasons, is a good choice, you can extrapolate it farther to the right and say, “What does a million-year storm look like?”

One also has to be careful about out-of-sample extrapolations. But I think using the generalized Pareto family is more grounded in theory, because it is analogous to using the normal family when constructing the usual standard errors, than, for example, assuming that geomagnetic storms follow a power law, which was done in one of the papers that reached the popular press. So there was a Washington Post story some years ago that said the chance of a Carrington-size storm was something like 12% per decade. But that was assuming a power law, which has a very fat tail. When I looked at the data and allowed the data to choose within a larger and theoretically motivated family, the model fit did not gravitate towards the power law.
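The peaks-over-threshold procedure described above can be sketched on synthetic data; the random draws here stand in for real Dst readings, and the simple method-of-moments fit is just one of several ways to estimate the generalized Pareto parameters:

```python
import math
import random
import statistics

random.seed(1)

# Synthetic stand-in for decades of storm-size readings (NOT real Dst data).
data = [random.expovariate(1.0) ** 1.2 for _ in range(20000)]

# Peaks over threshold: keep only the excesses above the ~300th-largest value.
threshold = sorted(data)[-300]
excesses = [x - threshold for x in data if x > threshold]

# Method-of-moments fit of the generalized Pareto distribution, using
# mean m = sigma / (1 - xi) and variance v = m^2 / (1 - 2 * xi).
m = statistics.mean(excesses)
v = statistics.variance(excesses)
xi = 0.5 * (1.0 - m * m / v)   # shape: > 0 means a fat tail, 0 exponential
sigma = m * (1.0 - xi)         # scale

def excess_tail_prob(y):
    """P(excess > y) under the fitted generalized Pareto curve."""
    if abs(xi) < 1e-9:
        return math.exp(-y / sigma)  # exponential limit as xi -> 0
    return max(0.0, 1.0 + xi * y / sigma) ** (-1.0 / xi)

# Extrapolate beyond anything observed: the fitted curve assigns a
# probability even to excesses twice as large as the biggest one seen.
print(round(xi, 3), round(sigma, 3), excess_tail_prob(2 * max(excesses)))
```

The key move, as in the interview, is that the data choose the member of the family: a power-law-like fat tail corresponds to a clearly positive shape parameter, and the fit is free to land elsewhere.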

Robert Wiblin: This kind of log-normal or normal curve, or power law, are they all special cases of this generalized family?

David Roodman: Their tails are.

Robert Wiblin: Okay. Tails.

David Roodman: Like if you were, you know, if you take the right-most 1% of the right-most tenth of a percent, they will more and more closely approximate a member of this one particular family.

Robert Wiblin: How much do you rely on interviews with experts in your research in general?

David Roodman: I don’t rely on them as much as, I would say my colleagues do when doing other work that is published at Open Phil. There’s a lot of interviews that are done here, and the notes are printed up and so on. But I do very much value when I get to a certain point and I think I’ve got a new understanding of some question, but I’m not confident in it yet because I’m new to the field and the ideas are new to me. I love being able, at that point, to call up an author and test my understanding. And very often, you know, my understanding gets reversed or I get pointed in new directions.

Robert Wiblin: Why don’t you think Open Phil has given many grants to deal with geomagnetic storms?

David Roodman: I should know the answer to that. I think the decision was made when I was still on a consulting basis here, and so I was on the outside. And so I’m not sure. We have made one grant to a researcher whose work I mentioned before, in South Africa. But that was not part of a systematic effort to take on this area.

I think probably people became convinced that other existential risks looked bigger. We’re doing a lot of work on, you know, pandemic preparedness and bioterrorism preparedness, and also we’re looking at AI safety, a couple other areas.

Robert Wiblin: I guess one thing is that a geomagnetic storm wouldn’t affect the whole globe all at once, right? It would just affect some part of [inaudible 01:01:51].

David Roodman: That’s a good point, yeah. It doesn’t literally seem to represent an existential risk. Certainly a catastrophic risk, but.

Robert Wiblin: Are there any careers you’d like to encourage anyone to go and work on in this area?

David Roodman: Well, if you have an aptitude for engineering, yeah, we definitely need more research because I just think there isn’t much attention being paid to this, and the stakes are potentially quite large.

Robert Wiblin: All right. Let’s move on from geomagnetic storms to talking about research you did on the impact of deworming, trying to figure out whether it really does improve child health and test scores and things like that. That’s been quite a source of controversy, informally known as the Worm Wars. What did your analysis add to all that?

David Roodman: Over the course of, I think, a year and a half or so, I replicated and reconstructed most of the studies that look at the long-term impacts of deworming. So we’re talking about distributing pills, primarily in schools, to all kids, without actually testing whether they’ve got worms, because it’s just cheaper to give the pills to everyone and we believe the side effects are essentially zero, and doing that, say, twice a year in areas where worms are endemic.

There are lots of studies of the short-term impacts on body weight, height, these kinds of things, within, say, six to twelve months. But many fewer of the long-term impacts. But for our cost-benefit analysis and thinking about whether to recommend deworming charities, of course, the long-term matters a lot. Effects over ten years are ten times as important as effects over one. And there, we’ve only got four or five studies.

So I’ve looked at most of those. Not all, yet. The big story is that I have undercut a couple of studies, but not the key one that has brought the most attention to this intervention and that we have been using in our cost-effectiveness analysis.

So 20 years ago, Ed Miguel and Michael Kremer co-authored a paper called Worms. They were economists, so it was in the economics literature, and it was based on a deworming experiment run in Western Kenya. And they looked, initially, at short-term impacts, as you might expect, and they found that the dewormed kids went to school more. I don’t think their test scores improved, but school attendance jumped. And then they got more funding to follow up longer-term on the same kids, and they’re continuing to do that. I think we’re providing some funding for that.

And that’s really fantastic, just to be able to see the effects 10, 15 years out. And so one of the longer-term studies, which I think goes out maybe not quite 10 years, has been the key one in the cost-effectiveness analysis that you can find on our website. So it’s a follow-up on the original experiment, and the key question is, can we trust the original experiment? Was it a proper, clean experiment? And I tried really hard to take it apart, to attack it. But in the end I had to concede that the study won. One concern was that it wasn’t actually a randomized study. There were 75 schools in the study, and they sorted them by, I think, province and then district, and then by the number of kids in the school.

So they had this sorted spreadsheet in Excel. And then, having sorted them, they numbered them. It wasn’t actually a two-way experiment, it was a three-way experiment: some kids immediately got deworming, for some it was delayed a year, and for some it was delayed until the experiment was over. So then they numbered the list: one, two, three, one, two, three, right down. And that was how they assigned the groups. So they were assigning, in part, on how many kids were in the school. Which is a little bit worrisome, and it wasn’t randomization.
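The assignment scheme as described might look like this in code; the school names and enrollment counts are invented, and this is a simplification, not the authors' actual procedure:

```python
# Hypothetical rows: (district, enrollment) for a handful of schools.
schools = [
    ("Ukwala", 275), ("Funyula", 505), ("Budalangi", 250),
    ("Funyula", 180), ("Budalangi", 410), ("Funyula", 320),
]

# Sort by geography, then by number of pupils, as in the description.
schools.sort()

# Then number down the sorted list 1, 2, 3, 1, 2, 3, ...
groups = {1: [], 2: [], 3: []}
for i, school in enumerate(schools):
    groups[i % 3 + 1].append(school)

# Group 1 got deworming immediately, group 2 after a year, group 3 at
# the end.  Note this is systematic, not randomized: a school's group
# is a deterministic function of its enrollment rank within its district.
for g in (1, 2, 3):
    print(g, groups[g])
```

That determinism is exactly why the balance checks described next matter: with true randomization, group membership could not depend on any school characteristic.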

So I tried to look at whether there is any kind of what’s called statistical imbalance. Do these groups look statistically different? And I even went so far as to figure out exactly where these schools were, using some data that they accidentally made viewable on the internet and didn’t want me to see, and which I’ve kept confidential. I figured out their exact locations, and then used that along with Google Maps to figure out their elevations. That’s actually important, because elevation has a lot to do with how bad the worm problem is where you are. Higher elevations are going to have less of it because they don’t flood as much, basically.

So I had a new variable that was external to the study, and I could look at whether these three groups were statistically the same on this variable, which was not something the authors could have manipulated their results with awareness of. And I had to concede that even here, there really just wasn’t much sign of imbalance. So I came away more or less saying I had to believe in the Worms study and in the follow-ups that we use.

However, there have been a few other studies … well, I’ll talk about one other study that was also reinforcing our faith in deworming, which I have now come to strongly question. It was not randomized. I think I mentioned it already, in fact. It’s by Hoyt Bleakley, on deworming in the American South. The reason it was compelling was that he seemed to show some very sharp jumps over time in schooling rates of kids after the campaign, and then also, when they reached adulthood, he seemed to show some nicely and sharply timed increases in their earnings.

That lined up really well with the research from Kenya, where we saw the same thing: higher school attendance right away, and higher earnings long-term. And the particular way it was done, with these seemingly sharp jumps, also made it pretty convincing, even though it wasn’t a randomized experiment.

But with the help of research assistants, I rebuilt the original census data, and actually there’s data from other sources as well. It was a big project, because he took a lot of data from hundred-year-old books, which had to be typed in manually, and there was one data point for each county in the South. Like a thousand counties. A lot of work to pull the data together.

In my closest replication, I just didn’t see a clear sign of a sharp jump that appeared consistently through the different runs. And then we expanded the data set, because more and more census data is being digitized. So maybe he had only a, I don’t know, 1% sample for 1910; now we have a 100% sample.

When I expanded the data, any suggestion of a sharp jump where we would expect it if the campaign was the cause, was further smoothed out. And so what it looks like is that there was long-term convergence both within the South and between the South and the rest of the country on outcomes such as amount of time spent in school and adult earnings. But nothing, no sudden jumps that would be easily attributable to the deworming campaign.

Robert Wiblin: So how did you get the data you needed to do these replications?

David Roodman: In that case, it was hard work, some of which I did and some of which research assistants did, of hunting down old books. Some of them were in Google Books. Some of them were in my neighborhood library, which is the Library of Congress. Which is very fortunate. And we just had to scan, photograph, type in, and do error checking where we could.

And then the census data comes from a fantastic project called IPUMS. I-P-U-M-S, which, I won’t try to figure out what it stands for, where they are digitizing more and more census data, not only from the United States but from other countries, and providing a really great interface that allows you to choose the years you want and the variables you want and download it.

And that’s all brought together in a Microsoft SQL Server database. And then from there, once the actual data tables that we need for analysis are synthesized, they are exported to Stata for analysis.

Robert Wiblin: So it sounds like GiveWell’s support of deworming then falls mostly on just one paper from the ’90s? That sounds concerning, right?

David Roodman: Yeah, I worry about it. What’s happened as a result of my scrutiny is that our research base, which seemed kind of reassuring … I haven’t talked about all the studies, there are two or three others … has thinned. And so we’re basically relying on this one experience in Western Kenya. And the question is, “What do you do with that?” Right?

My impression is that in the public health world, the world of medicine, you erect certain threshold tests, and so you say one study is not enough, or the p-value on this study is not below 0.05, therefore we reject it completely. So we’ve gotten into some debates with people who come out of public health, especially in the UK, who just think we’re crazy. “How can you recommend this intervention if you’ve got one study that’s saying it’s got positive effects and others that say it’s indistinguishable from zero?”

I think a correct formal answer, I’m not sure if it’s ultimately practical, is that we need to be Bayesian about it. We’re in a situation where indisputably, the evidence is weak. I think the definition of weak evidence is that your priors matter, all right? If the evidence were compelling, it almost wouldn’t matter what we thought before we came into the experiment, and we’re not in that situation.

But suppose we draw some bell curves representing our general understanding of the impact of deworming. For the worm study in Western Kenya, that bell curve would be to the right of zero. A little bit of the tail would be on the left of zero, which would mean it’s probably got a positive impact. And then we could combine that with the other studies that are producing bell curves that are centered around zero. Then we might bring in our own prior, based on what we know about the benefits of childhood interventions, generally, from other research that’s not about deworming, and then we could fuse that together. We might get an overall estimate, represented by a bell curve, with some spread to represent our uncertainty, which, who knows? Might have 20% or 30% of its weight to the left of zero, depending how you do it. There’s [inaudible 01:11:58] ways to do it.
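The kind of fusion described here, treating each bell curve as a normal distribution and combining them by standard normal-normal Bayesian updating, can be sketched as follows; every number is invented for illustration:

```python
import math

# Each study's result summarized as a bell curve: (mean effect, standard
# error).  All numbers here are invented for illustration only.
studies = [
    (0.15, 0.08),    # a Kenya-style positive result, right of zero
    (0.00, 0.10),    # a result centered on zero
    (-0.02, 0.12),   # another near-zero result
]
prior = (0.05, 0.15)  # assumed prior from childhood interventions generally

# Normal-normal Bayesian updating: precisions (1 / variance) add, and
# the posterior mean is the precision-weighted average of the means.
means, ses = zip(*(studies + [prior]))
precisions = [1.0 / se ** 2 for se in ses]
post_var = 1.0 / sum(precisions)
post_mean = post_var * sum(p * mu for p, mu in zip(precisions, means))
post_se = math.sqrt(post_var)

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Posterior weight to the left of zero: the "maybe it does harm" mass.
p_negative = norm_cdf(-post_mean / post_se)
print(round(post_mean, 3), round(post_se, 3), round(p_negative, 2))
```

With these made-up inputs the posterior mean stays positive while a non-trivial share of the probability mass sits left of zero, which is exactly the "best guess positive, but not hyper-confident" position described next.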

And so we would say, our best central estimate is that this is doing good, but we are not hyper-confident of it. And that’s an uncomfortable position to be in, but if we’re true expectation maximizers, if we’re being rational about this, then we should still favor the intervention.

Robert Wiblin: Have you scrutinized the experiments that find that there’s not much impact and come up with any possible explanations for why that’s the outcome?

David Roodman: I’m not aware of long-term studies that, as originally published, didn’t find much impact. Now, maybe there’s some publication bias there. A couple of them, because of my scrutiny, I now read as saying that. There are lots of short-term studies that find little impact, and no, I have not looked at those.

Robert Wiblin: Did you get a lot of people to check your work, given how contentious this question has been?

David Roodman: Not a lot. However, I always send it to the original authors and all the data and code are posted online. So I hope that people are getting into it. At least, have the opportunity to do so.

Robert Wiblin: If your conclusion about deworming is wrong, what do you think will be the most likely reason? And I guess in this case, wrong would be that it’s clearly good or clearly bad.

David Roodman: I think the thing that I worry about most is that there’s some kind of selection bias. Maybe there are 10 plausibly good interventions out there, that are like deworming, and all have been the focus of similar research. And just by chance, this is the one that got the nice p-value in the original study. Or, to be more precise about it, maybe just by chance there was some true imbalance in the original experiment. Which can happen. You know, that’s generating all these results, and so then we’re gravitating to the one thing that just by chance, is looking good. That’s what I worry about.

Robert Wiblin: Why do you think this question has been so contentious and … why don’t people mostly agree with you that kind of … with low confidence, we can say it’s maybe positive?

David Roodman: Yeah. That’s a good question. I think part of it’s because it’s been studied by people in two different tribes: health and economics. And they bring to it different priors about standards of evidence, which maybe can be viewed as different Bayesian priors, and those priors seem so right to them that they can’t understand the other side.

Reminds me of … I’ve been reading recently on moral psychology, you know, a book by Joshua Greene where he talks about how we’re evolved for cooperation within tribes in order to compete with other tribes. One of the problems is that different tribes have different concepts of what is right and wrong, and cannot see eye to eye no matter how hard they try.

Robert Wiblin: So is the issue here that the medical tribe is just more skeptical that any treatments work?

David Roodman: Maybe so. I try to give them the benefit of the doubt and imagine that their norms evolved for a world in which powerful medicines typically do have side effects, and in which research may be funded by drug companies and therefore needs added skepticism. It may be part of that.

Robert Wiblin: So do you think … were people convinced by what you wrote? Are we getting close to the last word, at least, on these old papers?

David Roodman: I know for a fact that the leading public health skeptics of deworming were not convinced by what I wrote. I really would like to try to do some … I feel the impulse to do some kind of formal Bayesian analysis like I describe. Let’s state a prior, let’s synthesize the evidence that we have from different sources, including the stuff that says it’s indistinguishable from zero. Let’s come up with our best estimate for the distribution. Let’s not impose a senseless 0.05 test, since that’s arbitrary. And let’s ask ourselves, “What looks more likely? That it’s helpful or not?”

Robert Wiblin: How hard is that to do?

David Roodman: That’s a good question. I had a conversation just a couple days ago with Ozzie Gooen, if I’m saying his name right, who created Guesstimate, which is a wonderful tool for trying to do computations and explic