Peter Dorman is one of those rare economists whom it is always a pleasure to read. Here his critical eye is focused on economists’ infatuation with homogeneity and averages:

You may feel a gnawing discomfort with the way economists use statistical techniques. Ostensibly they focus on the differences between people, countries or whatever the units of observation happen to be, but they nevertheless seem to treat the population of cases as interchangeable, as homogeneous on some fundamental level. As if people were replicants.

You are right, and this brief talk is about why and how you’re right, and what this implies for the questions people bring to statistical analysis and the methods they use.

Our point of departure will be a simple multiple regression model of the form

y = β0 + β1 x1 + β2 x2 + … + ε

where y is an outcome variable, x1 is an explanatory variable of interest, the other x’s are control variables, the β’s are coefficients on these variables (or a constant term, in the case of β0), and ε is a vector of residuals. We could apply the same analysis to more complex functional forms, and we would see the same things, so let’s stay simple.

What question does this model answer? It tells us the average effect that variations in x1 have on the outcome y, controlling for the effects of other explanatory variables. Repeat: it’s the average effect of x1 on y.
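
To make the "average effect" point concrete, here is a minimal simulation sketch. It is not from the talk: the two types of unit, the effect sizes -1 and 3, and every variable name are assumptions chosen for illustration. Each unit has its own response to x1, drawn independently of x1, and an ordinary least-squares fit of the model above returns a single coefficient close to the mean of those responses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical population: two equally likely types of unit that
# respond to x1 in opposite directions (effects of -1 and +3).
beta1_i = rng.choice([-1.0, 3.0], size=n)  # unit-specific effects, population mean 1.0
x1 = rng.normal(size=n)                    # explanatory variable of interest
x2 = rng.normal(size=n)                    # a control variable
eps = rng.normal(size=n)                   # noise term

# Outcome generated with a different beta1 for each unit.
y = 0.5 + beta1_i * x1 + 2.0 * x2 + eps

# Fit y = b0 + b1*x1 + b2*x2 by ordinary least squares.
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta1_hat = coef[1]

print("estimated beta1:         ", round(float(beta1_hat), 2))
print("mean of unit-level betas:", round(float(beta1_i.mean()), 2))
```

Under these assumptions the fitted coefficient lands near 1, a value that describes no unit in the simulated population: every unit's own effect is either -1 or 3.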

This model is applied to a sample of observations. What is assumed to be the same for these observations? (1) The outcome variable y is meaningful for all of them. (2) The list of potential explanatory factors, the x’s, is the same for all. (3) The effects these factors have on the outcome, the β’s, are the same for all. (4) The proper functional form that best explains the outcome is the same for all. In these four respects all units of observation are regarded as essentially the same.

Now what is permitted to differ across these observations? Simply the values of the x’s and therefore the values of y and ε. That’s it.
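
One way to see what assumption (3) costs is to compare a pooled fit with a fit that lets the coefficient on x1 differ across groups. The sketch below is an illustration under stated assumptions, not a method taken from the talk: the observed group indicator `g`, the slopes of -2 and 2, and the helper `ols` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical setup: an observed group indicator and a slope on x1
# that is -2 in group 0 and +2 in group 1.
g = rng.integers(0, 2, size=n)
x1 = rng.normal(size=n)
true_slope = np.where(g == 1, 2.0, -2.0)
y = 1.0 + true_slope * x1 + rng.normal(size=n)

def ols(X, y):
    """Return least-squares coefficients for the design matrix X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Pooled model: one beta1 imposed on every observation.
pooled = ols(np.column_stack([np.ones(n), x1]), y)

# Relaxed model: the g * x1 interaction lets the slope differ by group.
relaxed = ols(np.column_stack([np.ones(n), x1, g, g * x1]), y)
slope_g0 = relaxed[1]               # slope on x1 in group 0
slope_g1 = relaxed[1] + relaxed[3]  # slope on x1 in group 1

print("pooled slope: ", round(float(pooled[1]), 2))
print("group 0 slope:", round(float(slope_g0), 2))
print("group 1 slope:", round(float(slope_g1), 2))
```

In this constructed case the pooled coefficient is close to zero even though x1 matters a great deal in both groups, while the interaction model recovers slopes near -2 and 2. The relaxation only works here because the source of heterogeneity is observed and known in advance, which is itself an assumption of sameness within each group.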

Thus measures of the difference between individual people or other objects of study are purchased at the cost of immense assumptions of sameness. It is these assumptions that both reflect and justify the search for average effects …

In the end, statistical analysis is about imposing a common structure on observations in order to understand differentiation. Any structure requires assuming some kinds of sameness, but some approaches make much more sweeping assumptions than others. An unfortunate symbiosis has arisen in economics between statistical methods that excessively rule out diversity and statistical questions that center on average (non-diverse) effects. This is damaging in many contexts, including hypothesis testing, program evaluation, forecasting—you name it …

The first step toward recovery is admitting you have a problem. Every statistical analyst should come clean about what assumptions of homogeneity are being made, in light of their plausibility and the opportunities that exist for relaxing them.