Suppose we are interested in the effects of some social psychological construct that we are theoretically devoted to (let’s say “symbolic racism”) on support (or lack thereof) for (generous) social welfare policies. In quantitative social science we would spend a lot of money surveying people, collect some data, and ultimately specify a regression model of the form:

Y=a+bW+cX+e (1)

Where Y is some sort of scale that lines up individuals in terms of their support for social welfare policies, W is some sort of scale that lines up individuals in terms of their “symbolic racism,” X is a matrix of other “socio-demographic” stuff, and e is a random disturbance. Suppose further that the model provides support for our theory: b is substantively and statistically significant and its sign goes in the right direction (the more symbolic racism, the less support for social welfare policies). We would then write a paper arguing that individuals who are high in symbolic racism are less likely to support social welfare policies, and that this is a likely source of support for the Republican party in the South. We might even insinuate in the conclusion that trends in income inequality would be much less steep if it wasn’t for these darn racists, etc.

I would bet you 10,000 dollars*, however, that in actually presenting their results and their implications the authors would say things that are in fact not supported by their statistical model. In fact we all say or imply these things, especially when W is an attitude (or some other “intra-individual” attribute) and Y is a behavior, and we desire to conclude from a model such as (1) that the attitude is a cause of the behavior (the same thing would apply if the units of analysis are organizations, W is some organizational attribute, like the implementation of a “strategy,” and Y is an organizational outcome).

Now suppose even further that W passes all of the (usual) hurdles for something to constitute a cause: it precedes Y, the model is correctly specified on the observables, etc. My point here is that even if that were true, it is not true that from the fact of observing a large and statistically significant estimate of b we can conclude that at the individual level there is some sort of psychological (or intra-organizational) process with the same structure as our W called “symbolic racism” that causes the individual’s support for this or that policy.

An obscure segment of the statistical and psychometrics literature tells us why this is the case (see in particular Borsboom et al. 2003): in order to jump from information that is obtained from a comparison between persons to statements about the data-generating process within persons, we must make what is called the local homogeneity assumption. This assumption is just that: an assumption. And for the most part it is a shaky one to make. For b in (1) only gives us information about the conditional distribution of Y responses among the population of subjects as we move across levels of W; it says nothing about causal processes at the individual level. In fact the model that produces responses at the individual level could be wildly different from (1) above and yet it could generate the between-persons result that we observe. In this respect, the statements:

1a. Our results provide support for the conclusion that in the contemporary United States a person with a high degree of symbolic racism is less likely to support social welfare policies than another person with a lower degree of symbolic racism.

1b. Our results provide support for the conclusion that a person’s support for punitive welfare policies would decrease if their propensity towards symbolic racism were to decrease.

are empirically and logically independent. Model (1) only supports 1a, but it says nothing about 1b (or would only say something about 1b under the weight of a host of unsupportable assumptions). However, whenever we write up results obtained from models such as (1), we sometimes present them as if (or insinuate that) they provide support for 1b.
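To make the gap between 1a and 1b concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration (the function names, the number of people and waves, the slopes, the noise levels); none of it comes from Borsboom et al. or from any real data. It builds a world where, within every person, a rise in W nudges Y up, while across persons those who are typically higher on W sit lower on Y. A cross-sectional regression in the spirit of (1) then recovers a negative b even though no individual's own process looks like that.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (single predictor)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def simulate(n_people=500, n_waves=10, within_slope=0.5,
             between_slope=-1.0, seed=0):
    """Hypothetical panel: all parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    people = []
    for _ in range(n_people):
        w_bar = rng.gauss(0.0, 1.0)  # person's typical level of W
        # Between persons: a higher typical W goes with a lower baseline Y.
        y_bar = between_slope * w_bar + rng.gauss(0.0, 0.5)
        ws, ys = [], []
        for _ in range(n_waves):
            dw = rng.gauss(0.0, 0.3)  # within-person fluctuation in W
            ws.append(w_bar + dw)
            # Within a person: Y moves WITH the fluctuation in W.
            ys.append(y_bar + within_slope * dw + rng.gauss(0.0, 0.1))
        people.append((ws, ys))
    return people


people = simulate()

# Cross-sectional "survey": one observation per person (the first wave).
cross_w = [ws[0] for ws, _ in people]
cross_y = [ys[0] for _, ys in people]
between_b = ols_slope(cross_w, cross_y)

# Person-by-person regressions over each individual's own waves.
within_bs = [ols_slope(ws, ys) for ws, ys in people]
mean_within_b = sum(within_bs) / len(within_bs)

print(f"between-persons b:       {between_b:.2f}")
print(f"mean within-person slope: {mean_within_b:.2f}")
```

In this toy world the between-persons slope comes out negative and the average within-person slope comes out positive, so a statement like 1a would be true while a statement like 1b would get the direction wrong. A sign reversal is only the most dramatic case; the same logic applies when the individual-level process merely differs in form or size from what (1) suggests.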

Startlingly, this lack of (necessary or logical) correspondence between a between-subjects result and the DGP (data-generating process) at the individual level implies that most statistical models are useless for the sort of thing that people think that they are good for (drawing conclusions about mechanisms at the level of the person/organization). Not only that, it implies that a model that provides a good explanatory fit for within-individual variation (let’s say a growth curve model of the factors that account for individual support for social welfare across the life course) might be radically different from the one that provides the best fit in the between-persons context. Finally, it implies a “rule” of sociological method: “whenever a within-subject explanation is extracted from a between-subjects analysis we can be sure that this explanation is (probably) false (at least for most non-trivial outcomes in social science).”

*I don’t actually have 10,000 dollars.
