0:36 Intro. [Recording date: May 3, 2010.] State of econometrics--the application of statistical techniques to economic questions. A few weeks ago, Tim Harford wrote a piece in the Financial Times referencing a piece you wrote back in 1983 called "Let's Take the 'Con' Out of Econometrics." Harford argued we've finally succeeded in solving at least one crucial problem--it did take 27 years--but we've finally removed the con; we've got more honesty. In particular, he focused on the identification problem. He was referring to work by Angrist and Pischke, who argued that by use of so-called natural experiments and modern techniques, we've been able to get a much better assessment of relationships in economic data. First, talk about your 1983 piece; what was the con that we ought to be aware of? The con is that depending on what model you select, you can get dramatically different estimates and conclusions. Economists have not spent enough effort alerting their customers to that sensitivity. That's the con--pretending that the data sets are providing more information than they possibly can, because the econometric method requires you to make a complete commitment to assumptions that you have at best a half-hearted commitment to. I was arguing that we need to develop tools researchers can use to separate sturdy from fragile inferences. Sturdy ones are the ones that don't depend much on ambiguous assumptions. Fragile ones change with a very slight change in the model you happen to use. We need, first of all, tools that will help us sort the sturdy from the fragile conclusions. Secondly, we need a method of communicating that in the articles we write, and a culture that is receptive to it. The culture as it is now is a "maximize-the-t" kind of culture, which is a way of saying: find something in the data set--and there are two reasons why there might not be something in the data set.
One is the data set might be too small--what econometricians call "collinear"--and the other is that the assumptions that you need are not really credible. Economists by and large don't want to hear that kind of negative. They want to hear that they are making major conclusions from the data sets. Clarifications: When you talk about a "t", the t-statistic in a statistical study is a measure of how likely or unlikely it is that the relationship you found in the data is due to chance. A high t-statistic would mean it's very likely that this relationship is there and not just some fluke. The word statisticians use is "statistically significant," which we summarize by saying "significant." But "significant" really means "important," and it's not the same. Highly recommend that we use the word "measurable" instead. We want to know whether this data set allows you to measure the effect; whether the effect is big in an economic sense is a totally different issue. McCloskey and Ziliak book where they attack the whole concept of modern econometrics on the grounds that we've become obsessed with whether the relationship between two variables is significant--it could be very unimportant; it could be small in its magnitude and impact but significantly different from 0, meaning it's not just chance. Key distinction we care about as economists. We don't often have a conversation about what size of coefficient we need for this to be an important effect. Difficult conversation to have. Instead we turn it over to statisticians who decide what's significant or not based on these t values, which really don't have anything to do with the setting and are context-free. Economists need to impose more order on the conversation and not relinquish the most important decision, which is to decide whether or not this is really an important variable or important effect.
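The distinction between "measurable" and "important" can be made concrete with a small simulation (all numbers here are invented for illustration, not from any study mentioned in the episode): with a large enough sample, an economically negligible coefficient still earns a huge t-statistic.

```python
import numpy as np

# Hypothetical illustration: a tiny effect becomes highly "significant"
# (measurable) purely because the sample is large.
rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(size=n)
true_beta = 0.01                      # economically negligible slope
y = true_beta * x + rng.normal(size=n)

# OLS slope and its t-statistic, computed by hand
xc, yc = x - x.mean(), y - y.mean()
beta_hat = np.sum(xc * yc) / np.sum(xc * xc)
resid = yc - beta_hat * xc
se = np.sqrt(np.sum(resid**2) / (n - 2) / np.sum(xc * xc))
t_stat = beta_hat / se

print(f"beta_hat = {beta_hat:.4f}, t = {t_stat:.1f}")
# t lands far above any conventional cutoff, yet the effect size is
# trivial -- "significant" is not the same as "important."
```

The converse also holds: in a small sample, an economically large effect can fail to be measurable at all, which is the other reason Leamer lists for finding nothing in a data set.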

5:49 Going back to 1983 article, mentioned being explicit about our assumptions or how sensitive our results are to our assumptions. When economists talk about their assumptions, they are usually talking about things like "I'm going to assume that businesses are profit-maximizing," or "I'm going to assume that individuals maximize their utility." But in statistical work, econometrics, when you talk about assumptions, you talk about very specific assumptions about where the data come from, the way the errors might be distributed, whether the relationship is linear, quadratic, cubic. Critical task in the art of drawing inferences from a data set: how to translate a conceptual framework, theory, model, which by its very nature is a simple version of reality, into a compelling and persuasive data analysis. Your theory might say demand curves slope downward. That's not nearly as complete a statement as is needed for a statistician or econometrician to do the data analysis. The data analysis requires that you select a particular functional form; allow for the fact that this year's consumption may depend on last year's prices as well as today's. Tomorrow's as well--expectations. Have to think about the other variables that are going to drive the demand and not just the price. A theorist can get away with making a vague statement that quantity demanded depends on price, but a data analyst has to fill that in to make a very explicit model that has no doubt associated with it. If there's any doubt, it's the random error that we tack onto that model; the doubt about that error is captured by distributional assumptions about which the theorists have no opinion. Huge step between conceptualization of the problem and building a model that can capture that framework.

8:23 For those listeners who are not practicing or would-be economists or graduate students in economics, etc., want to set the stage. In an economics journal, or a medical journal in epidemiology, where we are going to look at the relationship, say, between drinking and cancer, or in economics between some piece of legislation like the minimum wage and whether it affects employment or not, what you'll find somewhere in that article, if it's an empirical article, is a table or a chart that purports to show that the relationship between the two variables that we care about is of such-and-such a magnitude and is not due to chance. What is hidden from us as the readers, and is the unspoken secret Leamer is referring to in his 1983 article, is that we don't get to go in the kitchen with the researcher. We don't see all the different regressions that were done before the chart was finished. The chart is presented as objective science. But those of us who have been in the kitchen know--you don't just sit down and say you think these are the variables that count and this is the statistical relationship between them, do the analysis, and then publish it. You convince yourself rather easily that you must have had the wrong specification--you left out a variable or included one you shouldn't have included. Or you should have added a squared term to allow for a nonlinear relationship. Until eventually you craft, sculpt a piece of work that supports a conclusion; and you publish that. You show that there is a relationship between A and B, x and y. Leamer's point is that if you haven't shown me all the steps in the kitchen, I don't really know whether what you found is robust. Kitchen reference, old joke: two things that you don't want to see in the making--one is econometric estimates and the other is sausages. Dirty process. Why? Example: the theory might suggest that a feather in a vacuum will accelerate at a constant rate when it falls. But economists don't observe feathers in a vacuum.
They observe feathers when the wind is blowing, when the humidity varies, eagle feathers, duck feathers. Tons of things that are going to affect the speed at which things fall. Theorists are allowed to hypothesize that vacuum, but the real world doesn't have that vacuum. Got to translate that into a complete model with all the controls, the kind of things we were just identifying. You and I can sit down and think of these controls--you and I will come up with different lists; tomorrow I'll come up with a different list from today's. That's a sensitivity issue--we want to make sure that an adequate range of alternative models has been studied and to confirm that all the reasonable models lead to about the same conclusion--that's a sturdy inference. Or, if what seem like small changes in the models, the kinds of things that economists would be willing to entertain, lead to dramatically different conclusions--that's a fragile estimate, not to be believed. Leamer suggested that alongside this work of art you should also include some of the souffles that fell, some of the dishes that didn't work out, so the reader could judge if there is a real relationship there. How has the profession reacted to that suggestion? Economists will have a table of alternative estimates. But there's been no awareness that this is a critical issue. A lot of work with complex econometrics but not a lot of progress with building tools for identifying the sensitivity of our conclusions to our assumptions, or for reporting that sensitivity adequately. Still in the same operating procedure as 30 years ago--to cook the books.
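The sturdy-versus-fragile test described above can be sketched in a few lines of code (the data and variable names are hypothetical, invented for illustration): re-estimate the coefficient of interest under every subset of candidate controls and look at the range of answers.

```python
import itertools
import numpy as np

# Hypothetical data: z1 is a confounder that drives both the "treatment" x
# and the outcome y; z2 is an irrelevant candidate control.
rng = np.random.default_rng(1)
n = 500
z1 = rng.normal(size=n)
z2 = rng.normal(size=n)
x = 0.8 * z1 + rng.normal(size=n)
y = 0.5 * x + 1.0 * z1 + rng.normal(size=n)   # true effect of x is 0.5

def ols_beta_x(controls):
    # OLS of y on [1, x, controls]; return the coefficient on x
    X = np.column_stack([np.ones(n), x] + list(controls))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

candidates = [z1, z2]
estimates = [ols_beta_x(subset)
             for r in range(len(candidates) + 1)
             for subset in itertools.combinations(candidates, r)]
print(f"beta_x ranges from {min(estimates):.2f} to {max(estimates):.2f}")
# Specifications omitting the confounder z1 inflate beta_x well above 0.5,
# so the range of estimates is wide: a fragile inference in Leamer's sense.
```

If the range had been narrow across all reasonable specifications, the inference would count as sturdy; reporting that range is the kind of disclosure the 1983 article called for.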

14:13 Why? It must be because there is no incentive for us to do otherwise. Want to come back to that, but staying on track: What are some of the more recent techniques in econometrics, particularly the use of instrumental variables to create so-called natural experiments, and what are the proponents claiming about those techniques? Angrist and Pischke paper well-written, will be out in the Journal of Economic Perspectives this month, making what seems like a compelling case that randomization is the solution. Meaning that in an experimental situation, you have purposeful randomization: try to decide whether fertilizer affects yield, so you randomly select plots that get fertilizer. Look at treated and non-treated plots--measure the effect of fertilizer on yields. Your only job in that setting is to determine whether the data set is large enough that you have a statistically significant finding, or whether it is too small relative to the size of the effect, leaving open the possibility that what you are observing is pure randomness and not a real effect. That's the traditional view about experiments--if you do the experimental design adequately with controls and then do the randomization you will get a proper causal conclusion--to which I totally agree. We call that science. The problem with that is you created that in a laboratory. There is no assurance that it will translate into the same effect in the real world, particularly in economics, because we are talking about a social system; and there's an expectational aspect also--makes the transference from laboratory to real world hard. Those are purposefully randomized experiments--purposefully designed. Instrumental variables is a reference to accidental experiments--scurry around trying to find something that is as if you had an experiment. Example: what does immigration do to a community? Look at the thousands of Cubans who fled Cuba in the Mariel boatlift, and study the impact that has had on the community, which is what David Card has done.
The argument being that since that was an exogenous event--not correlated with anything else going on in Miami at the time, not like Castro said things are great in Miami so let's let the people out, which would confound the statistical relationship, or things were horrible in Miami so he let them out. A random political event that is outside the causal relationship we are trying to study. Economists think of that as being tantamount to a randomized experiment. Problems: First, there's no such thing as a really exogenous variable. We don't know how much Castro was looking over to see what was happening in Miami, so there's a possibility that that boatlift was responding to something that was happening in Miami. Every one of these is going to open up a conversation about whether it is really a randomized treatment or whether it's correlated with the impact you are trying to determine. But does a boatlift tell us anything about a 2000-mile fence? Translating that to the impact of immigrants in other settings is difficult. It takes the same kind of work it takes to draw conclusions from non-experimental or observational data--you have to think long and hard about the circumstances that have affected that outcome and put in control variables.
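The logic of treating an event like the boatlift as an instrument can be sketched with simulated data (a minimal sketch with invented numbers, not Card's actual analysis): if the instrument z really is exogenous, the instrumental-variables estimator undoes the confounding that biases ordinary least squares; if it is not, the whole exercise inherits that doubt.

```python
import numpy as np

# Hypothetical setup: an "as-if random" binary instrument z shifts the
# treatment x but (by assumption) affects the outcome y only through x.
rng = np.random.default_rng(2)
n = 10_000
u = rng.normal(size=n)                 # unobserved confounder
z = rng.integers(0, 2, size=n)         # instrument, e.g. an exogenous shock
x = 1.0 * z + u + rng.normal(size=n)   # treatment, confounded by u
y = 0.3 * x + u + rng.normal(size=n)   # true causal effect of x is 0.3

beta_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # biased upward by u
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]   # Wald/IV estimator

print(f"OLS: {beta_ols:.2f}  IV: {beta_iv:.2f}")
# IV recovers roughly 0.3 only because z was generated independently of u.
# In practice that exogeneity is exactly what must be argued, not assumed
# (was the boatlift really uncorrelated with conditions in Miami?).
```

The code makes exogeneity true by construction; Leamer's point is that in observational work that premise is always open to the conversation described above.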

20:04 Previous podcast on macroeconomics--the boatlift immigration issue is a micro problem, but in macro we make that leap all the time when we talk about aggregate demand. When someone says that in the past, $1 billion had this impact on the economy--so much unemployment, this level of growth--people are presuming that the same structural relationships still hold. Even though the cause of the recession might be totally different, what the money is spent on might be totally different, implicit in those multiplier arguments is the presumption that it doesn't matter. Find that very strange. Let's be more explicit. If you just look at correlation over time, it doesn't tell you anything about causal impacts. So you need something like a randomized experiment. If you want to know: Does government spending have a multiplier? then you have to have a treated group and a control group. In the case of macro, it is very difficult to think of what natural experiment, whether purposeful or accidental, we can use to draw conclusions about the impact of federal stimulus programs. The one that comes to mind is defense spending--end of war, start of war. Robert Barro has used that--interesting, useful. Clever. Angrist and Pischke. But does the defense buildup in WWII tell us anything about the stimulus package that the Obama administration put together? There doesn't seem to be any automatic connection. Sympathetic to Barro's conclusion but have to admit that the scientific nature of it is somewhat problematic. Didn't stop the people who are not sympathetic from saying it was just totally wrong. Bizarre that scientific work by macro- or microeconomists on anything that we care about, e.g., quality of schooling podcast with Ravitch, crucial social policy issues that we all have strong feelings about--the empirical work, no matter how careful or clever, doesn't seem to change anybody's mind who is not already a believer. That means it's not science.
In science there is skepticism, too; it takes a while for people to come around. But it doesn't happen at all in economics. Incentives: the consumers of this work realize that there is little incentive to get it right in a scientific sense; there is an incentive to reconfirm what you already believe. There is also a belief that there is another side, and the other side could produce some kind of model; and I'll wait until I see the whole thing worked out before I draw any firm conclusions. Like a court of law in which you see the plaintiff's argument but you are not allowed to see the defendant's; not going to make a judgment till you see it all worked out. When you hear only one side, if you are sympathetic to that side you are cheering the whole time the argument is being made: there's nothing the other side can say; but they manage to. Commentary Magazine letters to the Editor would savage the article; you would think the author didn't have a leg to stand on. Strangely enough, the author would then show why his antagonist didn't have a leg to stand on. Insider or not, sometimes there is no way to choose in any objective sense--you don't have any information. Aggressive language--economic theory is fiction: sometimes good, insightful, sometimes boring, but a fictional representation of the world; and economic analysis is really journalism. A journalist's job is to marshal the facts and put them together persuasively; but it's not science. Fiction and journalism. The people who swear by these techniques--Angrist and Pischke as an example--what would they say to your criticism? Angrist and Pischke would be sympathetic; understand their point, too--randomization is great if you have it. Experiments can be highly useful. Just don't think any single path will work. Theories; studying data sets in different ways; but to think that designing experiments is going to suddenly change economics into an empirical scientific discipline seems unlikely.
That might be where we have some significant disagreement. Often the creators of techniques are less enthusiastic than their followers. Their followers tend to be drinking the kool-aid and have forgotten all the admonishments of the creators about what they should be careful about and watch out for. Incentives: read in one of your articles a mention of the fact that the only person who believes the results that come forth is the author. How could that be? Gotten into the habit of asking people if they can name an econometric study that caused the profession to come to a consensus about something controversial. Most economists struggle to come up with an answer. Some economists name their own work--extraordinary. Isn't it strange that in our field so many people are spending so many hours churning out results that nobody takes seriously? Would like to make a distinction between the process and the outcome here. The process helps us think better as economists. Analyzing data sets, complex ambiguous settings, helps us think clearly. Same with economic theory--carried out mindlessly it's a total waste of time--but there are people who can do theoretical manipulations, make discoveries, and learn things through that process. So, even though the final model may be silly and the table of t-statistics may be irrelevant, the process helps us form judgments. The social conversations we have also help us come to conclusions--often not the right ones, but there is some scope for progress.

30:10 Pessimistic note: agree in a world where we sit around in our togas and try to come to agreement on these relationships. But it doesn't quite work that way. What happens is the more exotic and dramatic your result, the more likely you'll be featured in the NYTimes. The university likes that, so there's a real bias toward shocking claims, contrarian, bizarre claims. Recent example: the Wall Street Journal had a piece on the front page of its weekend section about two or three weeks ago that when Tiger Woods enters a tournament, instead of encouraging people to try harder, they just give up. The implication is that our whole understanding of competition has to be reconsidered, because we usually think of competition as bringing out the best in people, people striving to meet the high bar the competitor provides, but with Tiger Woods, he's so dominant that people just give up. As a result, competition has this destructive effect. Lesson: that's not enough--we've got to apply it to business. The implication is that businesses shouldn't try to hire the best people, because if you bring in a superstar, people could just sit around and say "I'll never get a big bonus." One of the examples given in the story was at General Electric, only the top 20% get the big bonuses, so a superstar could discourage people from being in that top 20%. Student's joke--well, if they have 5 employees, that would be true. But they have more than five. Article was based on an unpublished article by researcher at Northwestern who discovered by carefully teasing out and controlling for all the relevant factors, that when Tiger Woods enters a tournament, his opponents score higher by 8/10s of a stroke--meaning they perform worse. All this econometric firepower brought to bear. How many regressions were run where the result was the other way that you didn't tell me? Unless I know that, why would I have any confidence in that result? 
Leamer: Have heard that paper presented; not as skeptical about the basic finding, but skeptical about the interpretation. My quality of golfing is much influenced by the people I play with. Russ: Told colleague Don Boudreaux about this finding; he said sure it discourages people from golfing--"I don't go into golf because Tiger Woods is there." There are millions of Americans who have decided to take up other pastimes, tennis, because they don't think they can beat Tiger Woods. No doubt true. Also true that if you are paired with him, or with Larry Bird, famous trash-talker in basketball, it could affect your performance in a negative way. Might regress toward a lower level. What is not true is that golfers who were already in the sport gave up when Tiger Woods came along. They worked incredibly hard--started lifting weights, stopped loafing, put in more hours. Statistical finding--remarkably small--author notes that many tournaments are settled by a single stroke; response: but no tournaments are settled by 8/10s of a stroke. It's only an average; some would be affected by a larger amount. That's not the crucial point. The crucial issue here is that Tiger Woods doesn't enter all tournaments. That's the crucial experiment--the randomization experiment that's been created. He tends to enter the harder tournaments. Jennifer Brown, the economist who studied this, controlled for that. But that isn't the real comparison. Harder question: let's look at golfers who were golfing before Tiger Woods came along and after he came along, and see on similar courses whether they took their game up a notch or said they'll never win. Her analysis is addressing a different question: within a given year, how do these players play in tournaments that Tiger is in versus ones he is not?
If I thought I was a competitor with Tiger Woods and I saw him making some of those impossible shots, I could easily be lulled into thinking I could make those same shots and give it a try, harming my score as a consequence--that would be one mechanism. Not making less effort, but trying things I can't do. Might take more chances; might decide to play for third, which does pay a lot, so might get more cautious. Open, by Andre Agassi, describes similar conversations with himself when he had to play Pete Sampras, the dominant player of his era who usually beat Agassi. That's not the real question: the author of the article and the economist who did the study don't just want to show something about athletes in times of stress--they want to generalize it to broader notions about competition. The Tiger impact might be true; but does it generalize to other settings like corporations? Even less than the Mariel boatlift study tells us about Mexico and the United States. Don't know how to compare the two.

38:55 Macroeconomics: interesting? any soul-searching going on in the profession? Things we didn't understand about home prices and macroeconomic activity. Wake-up calls? Probably not. Continue to live in our own cocoons, think of financial policy as somebody else's problem, doesn't affect us. Huge swing in the profession away from monetarist and rational expectations models in favor of simple Keynesian models, without any basis. Not everybody has swung that way, but surprising how many in the profession have been endorsing these stimulus packages. I think I know the answer to how the economy works, too! In a healthy economy when someone loses their job it doesn't precipitate further job loss, but when the economy becomes unhealthy, it creates feedback loops, which means that some job loss creates other job loss; and the government needs to help prevent that negative feedback loop, demand management, but only during those few episodes. For example, now we are in the self-healing phase and the job of the government should be entirely to eliminate uncertainty. Problem: The stimulus package extended unemployment insurance, which gets tangled up with that negative feedback loop in times of unhealth. We decided to pay people not to look for work; made it cheaper to be unemployed. But we gave them money--according to the Keynesian model that kind of makes up for their being unemployed--it keeps demand going. Messy system to separate out. Leamer: Opinion--it's all opinion and no data behind it. Thought Russ was going to say that unemployment insurance was increasing the unemployment rate. Russ: I think it does. Believe demand curves slope downward. Other things going on. Challenges of predicting accurately. A lot of people justified those unemployment extensions on the grounds of aggregate demand; kind of forgot that it would encourage people not to work as hard as they otherwise would. Not the only reason unemployment is responding slowly in the recovery.
Agree that demand curves slope down; but long run can be different from the short run. If you put in place incentives that pay enormous benefits if you are unemployed, you definitely get more unemployment; but in the context of a cycle people tend to think of themselves as either working or not working, and that self-categorization is not much affected by the benefits they are receiving. They are out there hoping they are going to get a job. We need to do some data analysis to find out who is right here. Eagle feathers, duck feathers, windy day--with housing collapse, we might expect that unemployed construction workers might be in an unusual situation relative to past downturns that were more general. About a quarter of the people who are unemployed since the last peak, December 2007, are in construction. They are going to have a hard time trying to figure out whether they should stay in construction or not. A lot of uncertainty and imperfect information.

44:55 Pedagogy, educational question. Teach a class on how to think about numbers, how to be skeptical about relationships you see; the way journalists misreport with confidence that isn't justified. Teach journalists the same principles, trying to get them to be more skeptical. People take a lot of things on faith. One response is to say all empirical work is garbage, to dismiss everything. Confirmation bias, journals generally only publish positive results, etc. Podcast on book Macroeconomic Patterns and Stories; argue that we need both. All we have in the area of macro is opinions. Teach a course called Turning Numbers into Knowledge; the final exam is for the students to read the testimony of the Federal Reserve Chairman to Congress and pick a sentence out of it; then look at data sets to see whether they can confirm or cast doubt on that opinion. Process. Profession is way too heavy toward theory; macro has completely ignored an enormous data base that could have an impact on how we understand the economy. People have imposed a particular structure, a straitjacket, on the data which prevents them from learning how this complex economy actually evolves. What straitjacket? You give me your model--the overlapping generations model, the rational expectations model, the Keynesian model--all the forecasting models are simple Keynesian models. Commit yourself. Example: wrote a paper saying housing is the business cycle. Housing is absolutely critical; a great leading indicator but also contributes a large fraction of the jobs. Construction--a large fraction of every one of the downturns we've had. Disadvantage in writing a macro paper--not a macroeconomist. Advantage in writing a macro paper--not a macroeconomist. Came to that question as a student of data.
If you look at the data without the straitjacket, without having a horse in this race, not a Keynesian, Austrian, monetarist, rational expectations guy--the data shouts, screams that housing has something to do with almost every cyclical downturn of the post-war era. Skilled and respected data analyst, but not a skilled and respected macroeconomist; have expertise and reputational credibility in one arena, but in the other the category might be "annoying" rather than skilled and credible. Did no macroeconomist see your paper and say this is something kind of important? Marty Feldstein told Leamer he saw the paper and didn't know that about housing. Honest man. Another prominent economist expressed annoyance and said he already knew all that stuff. Why didn't you write it in that book you wrote? "It's in there somewhere." Another, not to be named, implied he didn't know what you were talking about. This is what's there. You can't say it's not relevant. Do they have a different answer? You've done these interviews! Benign neglect. Odd given that virtually every macroeconomist in the world would concede that housing had something to do with this downturn. Interesting question how much of it was due to feedback loops between housing and the financial sector, but nobody denies that housing was a precipitating factor here. The paper was written in 2007--this guy's a real prophet! Thought police treat you with disdain. Marched to my own drummer; lonely. In my own personal odyssey, find the work compelling. Don't remember what I might have thought of it in 1985, but thought you've got to do something. But maybe you don't. Humility, as a profession. Not much incentive for economists to think that. If you want to be in the newspapers, you've got to be overconfident.