My search for the Quantum Hitler Scale (QHS) began after positing the Quantum Hitler Principle (QHP): Any article not explicitly about the development of modern physics during the mid-twentieth century that contains the words “Hitler” and “quantum” is probably bullshit.

This by itself is pretty unassailable, though rife with obvious exceptions. Science fiction reviews and wikipedia entries about a sufficient chunk of the Marvel universe could be laudable academic work yet fail the test. In a nod to Gödel, this very article fails the test spectacularly, and I admire the snideness of anyone who stopped reading at the end of the first sentence. If this ends up on reddit I’ll upvote every comment to that effect.

I picked these two words after several years of delving into the depressing online world of pseudoscience, astrology, snake oil, anti-vaccination, and similar nonsense. I don’t recommend it. It’s a dark, dark place, and you have to be careful to periodically track down legitimate science, or your head starts to crack as you see words misappropriated for fallacy after fallacy and logic unravels upon the frequent left turns into tangential minutiae. It’s not just Alex Jones spouting Illuminati nonsense; nice, thoughtful people write things that sound an awful lot like science, and they can sound like voices of reason when real scientists get so fed up they start sounding like the forum trolls whose favorite word is “sheeple.”

“Hitler” is an obvious choice, following from Godwin’s law: “As an online discussion grows longer, the probability of a comparison involving Nazis or Hitler approaches 1.” It’s a standard rule of thumb that once you draw a parallel to Hitler, you’ve lost the argument, or you’re crazy, although when I told a friend no one is like Hitler, he mentioned a few post-Holocaust genocides, so I had to amend the statement. More accurate: nobody is like Hitler unless they are the primary populist leader of a state-sanctioned, active, successful, and not-at-all secret effort to eliminate another race.

“Quantum” is a good choice because a surprisingly small number of people actually know what it means. Roughly, a quantum is the smallest possible amount of something, usually energy. Quantum mechanics is about studying tiny transactions. A bunch of extremely weird things happen when you’re looking at very small amounts of energy, and these things are difficult to communicate to people in a sensible way, so the general public hears “anything can happen!” and, well, sort of, but no, not exactly, and even the cool anythings that very improbably might happen someday, if the math is right, aren’t going to happen to anybody we know anytime soon.

I stand by these selections and QHP, but a principle is not enough. No, after reading thousands of articles of alluring bullshit promising me oneness with Earth, good health, eternal life, and infinite persecution by secret cabals bent on world domination and keeping me from knowing the truth about vitamin C, I thought, “Why can’t my computer sort this out for me?” I wanted a program that I could plug in that would tell me if the author of an article was bulldozing bullshit, and I wanted to know how much and how fast. I wanted a Quantum Hitler Scale.

Early analysis

I cobbled together a really bad python script to start analyzing articles via the time-honored blindfold and shotgun method. Initially I just measured the frequency of punctuation, connective words, pronouns, and a handful of other things. I used two control texts: the original bitcoin paper, which is a paper about how you can use cryptography and computers to build a proof-of-transaction system, and the first page of timecube, which is a schizophrenic rant about something involving rotation, corners, and racism. I figured any text analysis program that couldn’t tell these two things apart was a completely useless program.
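For a sense of what that first pass involves, here is a minimal sketch of a surface-feature counter. It is not the original script. The word lists and the name surface_rates are made up for illustration, and rates are reported per 1,000 words so that long and short texts can be compared.

```python
import re
from collections import Counter

# Hypothetical word lists; the lists used in the original script are not given.
CONNECTIVES = {"and", "but", "or", "so", "because", "however", "therefore"}
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they",
            "me", "him", "her", "us", "them"}
PUNCTUATION = set(".,;:!?")


def surface_rates(text):
    """Return each feature's count per 1,000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = max(len(words), 1)  # avoid dividing by zero on empty input
    counts = Counter()
    for word in words:
        if word in CONNECTIVES:
            counts["connectives"] += 1
        if word in PRONOUNS:
            counts["pronouns"] += 1
    counts["punctuation"] = sum(1 for ch in text if ch in PUNCTUATION)
    return {name: 1000.0 * counts[name] / total
            for name in ("punctuation", "connectives", "pronouns")}
```

Running this over two texts and comparing the three numbers is the whole experiment. If the numbers come out about the same for a cryptography paper and a rant, the features are too neutral to be useful.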

My program was completely useless. My chosen measures produced surprisingly consistent ratios between the two. If I plugged in “evil” for one of my judgement call words, the difference was clear, but I felt that was cheating. I think I must have been assuming timecube was closer to Lyndon LaRouche’s rhetoric than it actually is, since “Hitler” and even “Nazi” came up blank. I found 7 instances of “Jew,” but once you’re blaming Jews, you can be talking about anything, since it’s well established that Jews are responsible for everything from 9/11 to male-pattern baldness. You could probably find someone who thinks Jews killed the dinosaurs.

I needed some more articles. I added some fear-mongering articles from Natural News, and searched (hard) for some dry science reporting. I knew this was going to be rough going, as the scale I was looking for ran from “honest science or social and scientific reporting” to “not honest or just misinformed reporting about something,” and you could argue—correctly—that’s not a one-dimensional scale, but I spent a thousand dollars on this computer and I’m going to do something with it.

Results were still inconclusive. No matter how ugly I made my script, it kept reporting no statistical difference between any of these papers. At this point, my data set was so small that something should have popped up just by the nature of data analysis: a small number of random samples should produce more variation than a large number of random samples. My choices for things to look for were apparently too neutral to produce statistical variation between racist psychotics off their meds and mathematicians describing cryptographic currency. In a broader view, that might not be surprising, but it was not what I was going for.
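The claim about small samples is easy to check with a quick simulation. This is a sketch under simple assumptions: the function name is made up, and the "measurements" are uniform random numbers, not anything taken from the articles. It draws many samples of a given size and reports how much their averages bounce around.

```python
import random
import statistics


def spread_of_sample_means(sample_size, trials=2000, seed=0):
    """Standard deviation of the averages of many random samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(trials):
        sample = [rng.random() for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return statistics.pstdev(means)


small = spread_of_sample_means(4)
large = spread_of_sample_means(400)
```

The spread for the four-item samples should come out roughly ten times the spread for the 400-item samples, since the spread shrinks with the square root of the sample size. A handful of articles ought to look noisy, which is what makes a flat result suspicious.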

Given my hypothesis (“I’m bored and need an essay topic”), I knew I needed better tools to give me something to look at. I downloaded nltk (natural language toolkit), and did some simple analysis measuring frequency of parts of speech. Here I noticed a small uptick in the use of proper nouns in the rants vs. the science, but not enough to pursue. Clearly, my methods were not up to my needs. I had to take the next step: pestering other people for ideas.
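A part-of-speech tally like that one can be sketched as follows. The helper names are hypothetical, and the input is assumed to be a list of (word, tag) pairs with Penn Treebank tags, which is the shape nltk's pos_tag returns once its tokenizer and tagger data are downloaded. The sample below is tagged by hand so the sketch runs without nltk.

```python
from collections import Counter


def tag_proportions(tagged_tokens):
    """Share of each part-of-speech tag in a list of (word, tag) pairs."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


def proper_noun_share(tagged_tokens):
    """Fraction of tokens tagged as singular or plural proper nouns."""
    props = tag_proportions(tagged_tokens)
    return props.get("NNP", 0.0) + props.get("NNPS", 0.0)


# Hand-tagged stand-in for tagger output. With nltk installed, the pairs
# would come from: nltk.pos_tag(nltk.word_tokenize(text))
sample = [("Deepak", "NNP"), ("writes", "VBZ"), ("about", "IN"),
          ("quantum", "JJ"), ("healing", "NN")]
share = proper_noun_share(sample)
```

Comparing proper_noun_share across the rants and the science articles is one way to put a number on that uptick.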

Bigger guns

I started mentioning what I was trying to do to a few friends, and most of them immediately went home to attempt to write something academically legitimate that had the words quantum and Hitler in it. More helpfully, one friend mentioned a Bayesian classifier. This is what runs a lot of spam filters: you tag a bunch of email as spam, and it analyzes all the text in them, then compares it to the same analysis of the email you don’t tag as spam. I initially balked at this, since even though it was technically machine learning, I was now the teacher inflicting many kilobytes of judgement on my silicon student. Still, I didn’t have much to go on, and when I did a little research, I noticed that the basic python Bayesian classifier had the nice feature of telling you what it’s basing its judgements on.

I’m slightly more willing to share the script I wrote to run some simple classification, so here it is.
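What follows is not that script, just a minimal sketch of the technique it relies on: a multinomial naive Bayes classifier with add-one smoothing, written in plain python. The class name TinyBayes and its methods are made up for illustration. Like the classifier described above, it can report which words it leans on most.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Case-sensitive on purpose, so "More" and "more" are different words.
    return re.findall(r"[A-Za-z0-9']+", text)


class TinyBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()  # label -> number of training texts
        self.word_counts = {}        # label -> Counter of word occurrences
        self.vocab = set()           # every word seen in training

    def train(self, label, text):
        words = tokenize(text)
        self.doc_counts[label] += 1
        self.word_counts.setdefault(label, Counter()).update(words)
        self.vocab.update(words)

    def word_prob(self, word, label):
        # Smoothed estimate of P(word | label)
        counts = self.word_counts[label]
        return (counts[word] + 1) / (sum(counts.values()) + len(self.vocab))

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # prior from folder sizes
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log(self.word_prob(word, label))
            scores[label] = score
        return max(scores, key=scores.get)

    def most_informative(self, label, other, n=5):
        """Words whose probability under label most exceeds that under other."""
        def ratio(word):
            return self.word_prob(word, label) / self.word_prob(word, other)
        return sorted(self.vocab, key=ratio, reverse=True)[:n]
```

Training is one train call per article, with the folder name as the label. The prior in classify is computed from how many articles sit in each folder, so a lopsided training set tilts every verdict toward the fuller folder.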

The most important thing I learned in this process is that nothing is more pythonic than an ascii error. Since I was copying things straight from web pages, these errors were legion and I didn’t want to muck around with encoding and decoding the way I do in every other python project I work on, so I rage-googled for a while and put together this script, which mass converts text files into something python’s pitifully sensitive stomach can digest.
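A cleanup pass of that kind might look like the sketch below. It is an illustration, not the original script: the names to_ascii and convert_folder are made up, and it assumes the saved pages are mostly UTF-8 and live in a folder of .txt files.

```python
import unicodedata
from pathlib import Path

# Typographic characters common on web pages, mapped to ASCII stand-ins
REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u2026": "...",                # ellipsis
    "\u00a0": " ",                  # non-breaking space
}


def to_ascii(raw_bytes):
    """Decode bytes leniently and reduce the result to plain ASCII."""
    text = raw_bytes.decode("utf-8", errors="replace")
    for fancy, plain in REPLACEMENTS.items():
        text = text.replace(fancy, plain)
    # Split accented letters into base letter plus combining mark,
    # then drop anything that still is not ASCII.
    text = unicodedata.normalize("NFKD", text)
    return text.encode("ascii", errors="ignore").decode("ascii")


def convert_folder(source_dir, target_dir):
    """Write an ASCII copy of every .txt file in source_dir to target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for path in Path(source_dir).glob("*.txt"):
        cleaned = to_ascii(path.read_bytes())
        (target / path.name).write_text(cleaned, encoding="ascii")
```

This throws information away on purpose. Anything that cannot be mapped to ASCII is dropped. That is fine for counting words but would be a poor choice for text that someone has to read afterward.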

Then I started filling up on text. Info Wars, Natural News, homeopathy, birther, truther, and global warming conspiracy articles filled up my bullshit folder. The legit folder was mostly dry, mild science reporting about fish migration patterns and similarly unexciting data analysis. After reading these to determine what’s what, my Bayes script then had to correctly identify four articles as either legitimate science reporting or bullshit: the bitcoin paper, timecube, a meta-analysis of acupuncture trials, and something by Deepak Chopra.

Initially, everything came up bullshit. This was troubling, but I figured it was because I was much more invested in filling up my bullshit folder so it was naturally going to get more hits. More tinkering and adding another test text about the problem with alternative medicine gave me some sensible results, given the nature of the data. Timecube stayed bullshit throughout the rest of my trials, and the alt-med condemnation article stayed legit. Fittingly, the bitcoin paper fluctuated wildly.

Deepak stayed legit no matter how much Info Wars I shoved into the bullshit folder. This seemed fair: Deepak doesn’t usually run around saying whoever is in power is setting up death camps. I went the other route, and added a bunch of Steven Pinker papers to the legit folder. This gave the bitcoin paper legitimacy again, but Deepak remained unswayed. I added some Stephen Jay Gould papers and a fair critique of Gould’s work. Nothing. I know I’m picking on Deepak, but he’s just so pick-on-able. He can defend himself.

I did notice that in my grab-random-article methodology, the words the classifier was tagging as indicators of legitimacy versus bullshit were becoming simpler and simpler; at one point the capitalized word “More” was the single most important indicator of bullshit by a margin, followed by “Scientists.” I may someday read the hundreds of thousands of words in these articles to see what was going on with that, but I doubt it.

More nature-will-fix-it-all text went into bullshit, this time biophotons, earthing, and something about magnetic North and chakras. It did nothing but reverse the ruling on bitcoin that Pinker fought so hard for. I pumped multiple, lengthy defenses of astrology into bullshit. Nothing.

You win this one, Chopra.

Conclusions

Though timecube and bitcoin occupy pretty disparate places on the spectrum of rationality, they were probably too different for my purposes, and too similar in that 90 percent of the population would look at either and say, “What?”

I wanted a bullshit detector that would give me an unarguable location on QHS, and the importance of finding such a detector is in its ability to distinguish between things that may be very hard for a human mind to distinguish.

My initial intent to try to keep my judgement out of it was a logically contradictory gut impulse: this entire project was about trying to turn my judgement into a computer program, and if I kept putting articles into a classifier for the next few months, something would eventually emerge that resembles my opinions. I think my judgement has some validity because of course I do, but also because I’m slow to exercise it. I call this critical thinking, but critical thinking is not a set of skills that resides above and apart from the context of what you are trying to apply them to: it’s something you can do when you have a thorough knowledge of the subject.

If you know nothing about linux, you won’t know to question my words if I tell you to type “chown -R mysql /” into your command line. You’ll just do it, because the command has no meaning for you and you’re deferring to my judgement as an expert, and I will laugh at you, because I am cruel. This is why people shouldn’t trust programmers, but most of the time they have no other choice. When they have a choice, they apply the critical thinking tools developed for the environment they live in to vague summaries from fields of study they know nothing about, and the results are mostly disastrous. It’s not that the classifier won’t work, it’s that if it’s trained on nothing but conservative and liberal rhetoric, it will be very good at distinguishing between the two, but totally useless in figuring out whether a research paper got its math right. John Doe’s critical thinking skills, developed over a lifetime of practice in a chemistry lab, aren’t the best skills to apply to economic policy decisions.

Without a decent background in physics and chemistry, you won’t necessarily know that water molecules don’t have a record of the things that have been near them and you can’t get more “natural” electrons. Meanwhile, you have to listen to people say particles do different things depending on whether or not they’re being observed and the universe might be a hologram. You need a context and a body of knowledge to be critical about something, and even when you’re done with that, you can put a lot of things together that sound good and have it lead to Hitler. I once argued that you could do any job between a phone book and google, and it only occurred to me in an embarrassing moment five months later that I was making this argument to an emergency room doctor, who did not have time to ruminate with wikipedia when his first approach didn’t stop the patient from bleeding out. Everybody learns the things they’re required to know on the spot if they’re put on it often enough. Outside that spot, nobody has enough time to become widely versed in the myriad disciplines one would have to know to authoritatively refute bigfoot as a valid hypothesis or Nickelback as music. At some point, we all have to trust somebody else’s explanation, and some things you even have to take on faith to get through an average day with any degree of efficiency. “Scientists” usually have a better chance of getting my attention than “sources,” or Fox’s ominous “they,” but I don’t know these scientists, and even if I did, I don’t know the science.

Though I stand by QHP, I didn’t find a reliable measure of QHS, or, to be honest, really decide what it was measuring. A better and more patient programmer or scientist may find it. It will take a lot of guesses and judgement calls, and maybe the effort will never solve the problem, but it can’t hurt to experiment. More important, it will only help to admit ignorance in the face of difficult subject matter. The implicit corollary of QHP is that if people are screaming at a problem or invoking often misunderstood terms, you should withhold judgement until you’ve done your own research.

For the record, here are final results of the Bayesian classifier:

Words that suggest an article is bullshit, in order of the strength of indication, and case-sensitive, based on my own opinions of articles I found on the internet: entire, truth, No, upon, You, head, required, sources, widely, doesn’t, John, explanation, needs, step, 11, exactly, North, added, defend, completely, word, faith, willing, mentioned, 7, practice, again, thinks, attempt, multiple, meaning, established, dark.

Words that suggest an article is legit, in order of the strength of indication, and case-sensitive, based on my own opinions of articles I found on the internet: there.