There is an apocryphal story about the visit of the great atheist philosopher Diderot to the Russian court.

Diderot was quite the clever debater, and soon this scandalous new atheism thing was the talk of St. Petersburg. This offended reigning monarch Catherine the Great, who was a good Christian woman –

(except for the affair with the horse)

(if you’re not familiar with this story, “affair with the horse” should be taken in the most literal possible way)

(the affair with the horse is totally legendary, but so is the rest of this story, so adding it in doesn’t make things any worse)

– anyway, it offended Catherine the Great, so she asked legendary mathematician Leonhard Euler to publicly debunk and humiliate Diderot. Euler said, in a tone of absolute conviction: “Monsieur, (a+b^n)/n = x, therefore, God exists! What is your response to that?” and Diderot, “for whom algebra was like Chinese”, had no response. Thus was he publicly humiliated, all the Russian Christians got an excuse to believe what they had wanted to believe anyway, and Diderot left in a huff.

This story is very likely false, but it’s something I think about a lot.

I feel like I am a lot better at the sorts of things Diderot was good at – philosophy, history, social science, et cetera – than at math. Sometimes, I will have a belief that seems pretty well-founded based on the sort of arguments Diderot would have been able to come up with – and someone will spout a bunch of very complicated math at me and tell me it disproves my belief. Sometimes if I concentrate hard enough I can understand the math well enough to see if they’re right, but this is very difficult. Otherwise, short of going back to school for ten years and getting a math Ph.D., I’m pretty much stumped.

So I think about Diderot a lot because I want to know what to do in this sort of situation.

The easy out is always to dismiss the math as sophistry – say “Mathematicians like Euler can sound very technical and impressive. But I’m pretty sure of this argument, and math has little to say about this non-mathematical field. So I’m not going to let myself get Eulered.”

But math is systematized rigor, and sometimes you genuinely need a lot of rigor to find flaws in arguments. I’m not totally hopeless at math, and I think of all the things that even my limited amount of mathematical ability allows me to understand that mathless people wouldn’t get. One of my go-to examples here is health insurance and the idea that it is wrong to ever deny someone coverage for anything, no matter how low the probability it will help and how expensive the intervention. I try to make an argument against that here, but it requires a little math and it would have been very hard to express clearly without it. I worry there are people who think I’m just trying to Euler them, and who are going to continue believing what they want about “death panels” because no fancy numbers can change their minds. And I also worry that if I dismiss mathematical arguments above my level of comprehension, I’m doing the same thing.

This is an obvious trade-off. Permanent, easily-fixable lack of rigor, versus letting anyone with a BA in Mathematics push you around.

There’s a bit of variation depending on the mathematical field. The latest people to make a serious academic attempt to prove the existence of God with math, Tim and Lisa McGrew, used Bayesian probability, a field of math which thanks to the excellent explanations on yudkowsky.net I at least know a little bit about. As a result I was able to avoid getting Eulered and write what I think was a pretty devastating rebuttal.

But there are other mathematical arguments I find much less tractable. And by far the most dangerous is statistics.

With apologies to Rutherford, all science is statistics or stamp-collecting. It is very well known that entire fields of science are permanently messed up because their statistics aren’t good enough. I have tried very hard to outperform most of my famously statistically illiterate profession, but there will always be people far better than I. There will always be people making extremely detailed and Byzantine methodological critiques. And those people will always be able to present me with arguments that are either ultra-important debunkings of things I believed religiously, or else shameless attempts to Euler me. And I will always have trouble figuring out which ones are which.

I am especially reminded here of Fisher’s work on smoking and lung cancer. Fisher was (according to Wikipedia) “a genius who almost single-handedly created the foundations for modern statistical science”, and launched a bunch of very sophisticated critiques against the idea that smoking caused cancer. His basic argument was that proving causation was very very hard (which it is) and that none of the appropriate statistical work had been displayed in cancer research. For example:

He uses a device to magnify the difference and its importance in the case-control studies by transforming the percentages into observed versus expected figures (using a chi-square analysis). He then suggested that, if the cases had inhaled, 45 lives could have been saved.

Intriguingly, his work on the subject became the foundation of the modern truism that “correlation does not imply causation”. Also intriguingly, he was taking money from the tobacco industry to serve as their “consultant” while he was doing it.

It is easy to imagine being a biologist back then, thinking you had lots of good studies showing a tobacco/lung cancer link, then getting pummeled by this statistical genius and backing off from your original claim.

And it’s easy to imagine another statistical genius arguing against him, and they’re both throwing out a lot of formulae and equations and the whole thing is super confusing.

And if, like me, you can only remember what a “chi square analysis” is on a good day, and you have enough trouble remembering the difference between case control studies and cohort studies, you’re probably not going to be able to follow the entire debate and pick apart exactly where one of them goes wrong.

But you know, on a good day I remember my chi-square analyses, I get the difference between case control and cohort studies right, and then I can read an R. A. Fisher paper and think “No, smoking is still bad.”

What about Glymour on IQ?

There’s a consensus among researchers in the field that IQ is useful and means what people think it means. And there’s a lot of research backing that up, same way as there’s a lot of research backing up the link between smoking and cancer.

On the other hand, Glymour seems to be well respected and intelligent, and he says things like:

Factor models assume that observed variables that do not influence one another are independent conditional on all of their common causes, an assumption that is a special case of what Terry Speed has called the Markov condition for directed graphical models. The rank constraints – of which vanishing tetrads are a special case – used in factor analysis are implied by conditional independencies in factor models, conditional independencies guaranteed by the topological structure of the graph of the model, no matter what values the linear coefficients or factor loadings may have. To exclude more latent variables when fewer will do, Spearman needed only to assume that the vanishing tetrads do not depend on the constraints on the numerical values of the linear coefficients of factor loadings, but are implied by the underlying causal structure. It is known that the set of values of linear parameters (coefficients and variances) that generate probability distributions unfaithful to a directed graph is measure zero in the natural measure on parameter space.

Even on my absolute best day, if I swallowed like an entire jar of modafinil, and then another jar of piracetam, and I looked up every one of those words in a dictionary, and took two or three hours to puzzle it out, I’m pretty sure I couldn’t bring myself to generate an understanding of that paragraph and sustain it for more than thirty seconds.

But there are thirty pages of that kind of thing, and then at the end it says “therefore, you should disbelieve in IQ and probably also all other research in the social sciences.”

Also, it mentions how research on IQ must be rejected because it might encourage the Republicans, whose plans will lead to a nation where “Ku Klux Klan schools, Aryan Nation schools, the Nation of Northern Idaho schools, Farrakhan schools, Pure Creation schools, Scientology schools, and a thousand more schools of ignorance, separation, and hatred bloom like some evil garden, subsidized by taxes.” So clearly there’s some political motivation at work as well.

(Also: Anissimov! Did you realize you could get your Northern Idaho secessionist schools to be tax-subsidized? You should totally look into that!)

So I have to ask – am I being informed of deep methodological truths that are being neglected? Or am I being Eulered?

I don’t have a good way of answering this. The way I try to deal with it in practice is seeing if I can route around the objection.

Like it’s clear that Diderot’s best option wasn’t to try to argue that (a+b^n)/n didn’t equal x. Far better for him would have been to ask why, if (a+b^n)/n = x, this necessarily proved God. Even if Diderot wasn’t smart enough to understand the precise algebra involved, he might have been able to at least get the impression that what it was doing was defining x in terms of other quantities. So he might have been able to ask “Why does a certain definition of the meaningless quantity x prove God?” even if, for example, he didn’t know what exponentiation was and couldn’t parse “b^n”.

My reaction to the Glymour paper was to try to figure out what it was trying to prove with all its statistics. My conclusion was that it was trying to prove that doing correlations adjusted for confounders didn’t always remove all the confounders.

I don’t have the mathematical ability to know whether Glymour’s argument is correct, but luckily I already don’t believe adjusting for confounders does a good job of removing confounders:

I will come out and say it: I do not trust the practice of “adjusting for confounders”, at least not the way this study does it. You are adjusting for an imperfect measurement of the confounders you can think of. If you find that there is lingering correlation, then either your hypothesis is true, or you didn’t adjust for confounders well enough.
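The worry in that passage can be sketched as a toy simulation. Everything here is an assumption made up for illustration: a single confounder z causes both the "exposure" x and the "outcome" y, x has no effect on y at all, and the researcher only gets to adjust for w, a noisy measurement of z. The function names and the noise levels are hypothetical, not taken from any study.

```python
import math
import random
import statistics


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def residuals(target, control):
    """What is left of `target` after a least-squares fit on `control`."""
    mt, mc = statistics.fmean(target), statistics.fmean(control)
    slope = (sum((c - mc) * (t - mt) for c, t in zip(control, target))
             / sum((c - mc) ** 2 for c in control))
    return [(t - mt) - slope * (c - mc) for t, c in zip(target, control)]


def adjusted_correlation(x, y, control):
    """Correlation of x and y after 'adjusting' both for `control`."""
    return pearson(residuals(x, control), residuals(y, control))


def simulate(n=20000, measurement_noise=1.0, seed=0):
    """Hypothetical world: z drives x and y; x never touches y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]                   # true confounder
    x = [zi + rng.gauss(0, 1) for zi in z]                    # exposure
    y = [zi + rng.gauss(0, 1) for zi in z]                    # outcome
    w = [zi + rng.gauss(0, measurement_noise) for zi in z]    # imperfect measure of z
    return z, x, y, w


if __name__ == "__main__":
    z, x, y, w = simulate()
    print("raw correlation:            ", round(pearson(x, y), 3))
    print("adjusted for the true z:    ", round(adjusted_correlation(x, y, z), 3))
    print("adjusted for noisy measure w:", round(adjusted_correlation(x, y, w), 3))
```

Under these made-up assumptions the theoretical values are 0.5 for the raw correlation, 0 after adjusting for the true confounder, and 1/3 after adjusting for the noisy measurement. So a lingering correlation shows up even though x has no effect on y, which is exactly the "you didn’t adjust for confounders well enough" branch.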

So I tried to route my argument around Glymour’s objection. I said that even assuming Glymour had discovered something terrible and shameful about the way correlations and regressions were done in the social sciences, this doesn’t come close to debunking all research on IQ. My particular argument was:

The example I gave of good IQ research, which you said you’re not convinced is actually being done, is the connection between lead poisoning and poor life outcomes, mostly proven through IQ. Let me discuss what this research looks like and why it’s not just one guy running a correlation through SPSS without any awareness of possible confounders.

First of all, there’s a LOT of evidence that growing up in neighborhoods with high lead concentration is correlated with lower IQ as an adult. This is all regressed for the usual things like socioeconomic status. Fine. That seems vulnerable to exactly the problems you describe. For example, maybe rotting houses expose people to more lead, and poor people are more likely to live in rotting houses, and poor people’s kids go to poorly funded schools that don’t teach them test-taking skills, so their IQ looks low.

Then they found a dose-dependent effect – i.e. the more lead you were exposed to, the worse the IQ drop was. Still pretty confoundable – if for some reason poor people used more lead (for example), poorer people might use even more lead (and have factors causing lower IQ test scores).

Then they found that when different states removed lead from gasoline, childhood outcomes rose in a very predictable pattern. There was a dramatic improvement a certain number of years after the lead was banned – for example, maybe California banned lead in 1960, and in 1965 outcomes started to rise dramatically; Oregon banned lead in 1965, and in 1970 outcomes started to rise dramatically; Washington banned lead in 1970, and in 1975 outcomes started to rise dramatically. Once again, this could be confounded. Maybe liberal states were more likely to ban lead first, and also more likely to increase school funding first.

Then they found that levels of lead in the air at time t were correlated suspiciously closely with crime at time t+1 – like if you line up the two graphs, every tiny little uptick and downtick match perfectly.

Then they found that lead exposure during pregnancy decreases the head circumference of infants, which seems a little less malleable by things like poor school funding than IQ is.

Then they found like thirty other things.

I admit every one of those pieces of evidence is a correlation. But even though the correlation between lead levels in a neighborhood and crime in that neighborhood could be confounded by unobserved factors, lead levels in an era and crime in that era could be confounded by unobserved factors, lead regulatory regimes and crime in the area covered by that regulatory regime could be confounded by unobserved factors, and lead exposure during pregnancy and head circumference could be confounded by unobserved factors – at some point you have to say that we’re starting to rack up a lot of coincidences, and maybe we should just admit the theory has a point.

And once you come up with some solid result, like the one with lead – then that becomes your basis for other results. Childhood lead poisoning causes brain damage thus lowering IQ? That lends credence to the idea that IQ is a useful measuring tool for some kind of brain health. Lead both decreases IQ and increases crime in a dose-dependent way? That lends credence to causal interpretations of the observation that IQ and crime are closely correlated. Once you’ve gone through this process enough times and you find that all of your results kind of fit together, you have what’s starting to look like a pretty impressive scientific edifice.

So I think the criticism that IQ research (and social science in general) is just based on drive-by correlations and regressions, then accepting whatever they say, is a big oversimplification.

Obviously this tactic would not have worked if the point I had wanted to defend was that the particular statistical practice of correlation and regression used in the social sciences was valid.

But the whole point of this Eulering issue is that I am not a statistician. I should not be in the business of trying to defend regression unless I know enough about it to do so coherently and intelligently.

The problem here only occurs when sophisticated math is used to attack nonmathematical ideas, like the existence of God, or lead causing increases in crime. And presumably these ideas should be complicated and diverse enough that no single mathematical argument knocks down the entire edifice. True things should usually reveal their truth through multiple different arguments, and it would be very odd if math could demolish all of them at the same time.

I admit this is not a very satisfying solution to worries about Eulering. I don’t think there will be any general solution, but rather a toolkit of different useful tricks, some of which I will try to go into further in the future.