In a few hours, on the 18th of January, a lot of sites are going to “go dark” to protest SOPA, the latest in a series of draconian, badly-thought-out laws to regulate the internet and computing in general. This is probably old news to you, and I’m going to spare you yet another explanation of why this particular bill is so bad. Instead, I want to talk about why, in general, programming knowledge is so strongly correlated with getting upset about things like this. It’s kind of a long story, and it starts in ancient Greece.

For what it’s worth, this site won’t go dark. I think it’s more important to leave it up so you can read this post.



Diophantine problems

Sometime between 150 BC and 350 AD (most likely in the third century AD) lived a man named Diophantus. He had a son, who died young, and to distract himself from his grief he turned to mathematics. Ancient Greek mathematics wasn’t as sophisticated as ours, and so the puzzles in the books he wrote had to have a particular property: their solutions had to be whole numbers. Today these are called Diophantine equations. One might be, “find two perfect squares whose sum is also a perfect square”, and a solution would be 9 and 16: their sum is 25, which is the square of 5. There are also problems that do not work, like “find two perfect cubes whose sum is also a perfect cube”. That one has no solutions in the positive integers.
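Diophantus’ first puzzle is easy to explore by brute force today. A minimal sketch (the function name is mine):

```python
import math

def square_pairs(limit: int) -> list[tuple[int, int, int]]:
    """Brute-force search for two perfect squares whose sum is also a
    perfect square, i.e. Pythagorean triples a**2 + b**2 == c**2.
    Returns (a**2, b**2, c**2) tuples for all a, b below `limit`."""
    triples = []
    for a in range(1, limit):
        for b in range(a, limit):
            s = a * a + b * b
            c = math.isqrt(s)          # integer square root
            if c * c == s:             # s is a perfect square
                triples.append((a * a, b * b, s))
    return triples
```

`square_pairs(5)` turns up `(9, 16, 25)`, the example above. Run the same search with cubes instead of squares and it comes back empty, as Fermat’s Last Theorem guarantees.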

In 1900, a mathematician named David Hilbert laid out a famous list of 23 problems: problems that, once solved, would help put mathematics on a complete, logically sound foundation. The tenth problem on the list was to find an algorithm by which a computer (computers didn’t exist yet, but that was the idea) could look at any given Diophantine equation and determine whether or not it had an integer solution. In Hilbert’s mind the existence of such an algorithm wasn’t in question; every mathematical statement had to be either true or false, so the algorithm had to exist, and finding it was just a matter of work, after which it would become a tool for building bigger and better things.

Decidability

Two people settled the question, in the more general form Hilbert posed in 1928 as the Entscheidungsproblem (is there an algorithm that can decide whether any given mathematical statement is provable?): Alonzo Church at Princeton and a British student at Cambridge named Alan Turing, both in 1936. Both of them got the same answer, which was that no such algorithm could possibly exist, but Turing’s proof had an interesting corollary about computing in general. (Hilbert’s tenth problem itself was finally answered, also in the negative, by Yuri Matiyasevich in 1970.)

His proof, which I’m not going to rehash, first invented a machine that he called the “universal machine” (and that everyone else now calls the universal Turing machine), and showed that it could perform any calculation: any algorithm that any computer can do, even the one you’re reading this on eighty years later, a Turing machine can do as well. Then he showed that if the Entscheidungsproblem had a solution, it would be possible to look at a given Turing machine and predict what it will do without actually running it. Then he showed that that was completely impossible.

It’s easy to see why: if you had a program (call it an analyzer) that could look at any other program and predict whether it will halt, then you could write a program that searches for a counterexample to the Collatz conjecture and halts as soon as it finds one, feed that to the analyzer, and settle the conjecture without doing any searching: the search program halts if and only if the conjecture is false. The same trick would knock out a whole class of open questions in mathematics. So Turing’s halting problem has no solution, so some mathematical questions can never be settled by any algorithm, and so we still have to work at Diophantus’ puzzles ourselves. Not a big deal.
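Here is the search program, sketched in runnable form. One subtlety the prose glosses over: a Collatz counterexample could either fall into a repeating cycle (which simulation can detect) or grow without bound (which it can’t), so this particular search only halts on the first kind. The names are mine, and the analyzer itself is, of course, the thing that can’t exist:

```python
def collatz_cycles_without_reaching_one(n: int) -> bool:
    """Simulate the Collatz rule from n. Returns True if the trajectory
    revisits a value without ever hitting 1 (a cycle counterexample),
    False if it reaches 1. If the trajectory grew without bound,
    this function would simply never return."""
    seen = set()
    while n != 1:
        if n in seen:
            return True          # entered a cycle that excludes 1
        seen.add(n)
        n = 3 * n + 1 if n % 2 else n // 2
    return False

def search_for_counterexample() -> int:
    """The program we would hand to the hypothetical analyzer.
    It halts if and only if some number's trajectory falls into a
    non-trivial cycle, so an analyzer that could predict whether it
    halts would answer that question with no searching at all."""
    n = 2
    while not collatz_cycles_without_reaching_one(n):
        n += 1
    return n
```

As far as anyone knows, `search_for_counterexample()` never returns; the point is that only an impossible analyzer could tell us for sure without running it.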

Enter DRM

Let’s say you sell software. Or you sell encrypted data. Or you sell anything, actually, that a program can copy. It would be very nice if you could ensure that no computer anywhere could run any program that could copy your data. If that were the case, then you could price your things based on scarcity: if you only make a thousand copies then only a thousand people will get to have a copy, and you can decide which thousand by charging them money for it. The worst possible case for you is that you sell one copy to one person, who then copies it for free to anyone else who wants it.

Whether or not this worst case would actually happen is still up for debate. Some evidence suggests that people don’t work that way, and that most people will pay you regardless, but that’s beside the point here.

There are, in general, two ways to do this: one is to make a computer that won’t run bad programs (where “bad” means “anything that breaks your encryption”), and the other is to make a computer that will only run good programs. These are not the same thing. The first kind of computer runs any program by default unless it can be shown to do something bad. The second kind needs someone’s permission to run any new program, and not the permission of whoever actually owns the computer. It’s the difference between a blacklist and a whitelist.
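The whole difference is in the default. A sketch (the program names and sets are hypothetical):

```python
BLOCKED  = {"data_copier"}            # blacklist: programs known to be bad
APPROVED = {"notes_app", "chess"}     # whitelist: programs someone blessed

def blacklist_allows(program: str) -> bool:
    # Default allow: anything not proven bad gets to run.
    return program not in BLOCKED

def whitelist_allows(program: str) -> bool:
    # Default deny: nothing runs without prior permission,
    # and not necessarily the permission of the computer's owner.
    return program in APPROVED
```

A brand-new program you wrote yourself this morning passes the blacklist automatically, but fails the whitelist until whoever controls `APPROVED` says otherwise.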

When I was in college, there was a website called goatse.cx, and it was a common prank to trick people into visiting it. See, it contained a really disgusting picture. So one of my roommates suggested that we use our router to blacklist that website so that none of us could be tricked into looking at it. This was so obviously a good idea that we did it immediately. We were willingly giving up part of our freedom, sure, but we knew exactly what we were giving up, and it was something we didn’t want anyway. Most programmers feel that way about blacklists. Antivirus programs are blacklists, and they aren’t offensive the way things like the iOS App Store are.

The iOS App Store is a whitelist: if you want to run any program on an iOS device, even a program you wrote yourself on a device you own, you first have to get Apple’s permission (which costs money, though it would be a problem even if it were free). If you want to distribute your programs to anyone else, or use a program someone else wrote, you have to do it through Apple, and only with programs they’ve approved. Because of this, iOS as a platform has effectively zero malware, but most programmers still don’t like it, because Apple’s decisions about their whitelist are less than perfect: there are dozens of stories of programs users want being rejected, being removed from the store after the fact, or having their authors ordered to strip out features.

A computer that would run anything except for illegal algorithms wouldn’t be as good as a computer that could run anything, but it would be less scary than a whitelist. Trouble is, it’s impossible: Turing’s proof shows that, in general, you can’t tell what a program is going to do without running it and finding out. A reliable blacklist can’t be built, so when people try to police computing, they end up making whitelists.

Unintended consequences

Here’s the thing about whitelists: Apple may have made theirs to prevent malware and protect users, but given that power, their goals inevitably shifted. Anyone’s would. That’s why we don’t trust whitelists: they’re oppressive by their core nature.

And so we don’t trust DRM or laws like the DMCA, or SOPA: whitelists are oppressive, and blacklists are impossible. The people drafting these things may start from the purest of goals, and actually probably do, but they only have one method for enforcing their rules, and that method inevitably corrupts. It’s something that programmers and computer scientists have known since 1931. We aren’t willing to make that trade, and never will be.