I figure the first topic on my blog should be my favorite. It concerns a question that is fundamental to computer science theory called the halting problem. It’s really not a fact about computers or computer programs at all; it’s just best formulated in that way. It was first solved in 1936 by Alan Turing, and one can argue that the basic question goes back at least to David Hilbert’s 23 problems in 1900 (although Hilbert initially assumed something that was false in his question). It was introduced to me by Professor Eugene Charniak a few years ago in his class on models of computation, and it’s a relatively standard problem that undergrads in computer science learn about at some point.

Most importantly, it’s really beautiful.

I’ll introduce the problem and prove the solution in this post. In a future post, I’ll discuss the wide-ranging consequences.

Anyone who’s used a computer knows that they don’t always behave as we’d like, and there are of course many reasons for that. For example, some programs that are expected to run for a short period and then stop end up running forever by accident. That can end up draining your computer’s resources or simply freezing it.

So, suppose you work at Microsoft, and your boss tells you that he wants you to write something into the next version of Windows that fixes this problem: Any time a program is run on Windows 8, the operating system should figure out whether it will stop eventually. (Maybe if it doesn’t stop eventually, it will give the user a warning.) He tells you that he doesn’t care how you do it, and you’ll have access to any of Microsoft’s resources.

Naturally, you start out by simply thinking about ways that a program might fail to stop. Sometimes this happens for a trivial reason. Maybe the programmer accidentally wrote code that looks something like this:

GO TO LINE 2 GO TO LINE 1

Obvously, that code will just loop forever. But, it’s also really easy to notice and fix. So, so far this task doesn’t look too bad.

But, what about something more complicated? What about this?

SET i = 1 DO THIS 43,112,609 TIMES: SET i = i * 2 SET i = i - 1 SET j = 2 DO UNTIL j = i - 1 : IF i IS DIVISIBLE BY j, GO TO LINE 1 SET j = j + 1

As it turns out, there isn’t any number between 2 and that divides . Indeed, is the largest publicly known prime number. So, this program actually doesn’t run forever. But, people only figured that out about 3.5 years ago. If I replace 43,112,609 with 179,424,673 (the 10 millionth prime, which happens to have a twin), no person currently knows whether the resulting program would ever stop running.

Maybe that’s a bit of an unfair example, though. The program contains a very large number in its source, and it doesn’t actually do anything useful. Plus, the question of whether that program ever terminates is obviously solvable by a computer in theory. The current largest prime was of course found by computer, so it’s not much of a stretch to ask a computer whether is prime. The progarm itself could even be thought of as a (naive) test for primality. In particular, your boss didn’t put any constraints on how long it would take to check whether or not a program runs forever, so maybe it’s ok if it takes years on some programs, especially on those with very large numbers in their source code.

Even worse, the potential loop is hard-coded into the program. If the program does loop, it only does so because it contains a statement that explicitly says “If ____, go back and do everything all over again without changing anything.” Of course programs like that might loop. Even ignoring that, when it does loop, it repeats the same state. If it loops, it will at some point be running a line of code that it had read before with all its variables set to the same values that they were set to some other time that it had read that line. Certainly, a loop checker could figure that out. (Of course, I could get around this issue by simply introducing some new variable, n, and incrementing n on each loop. That seems like cheating, though.)

But, what about this next program? (Admittedly, these programs are getting progressively more hand-wavy. That’s an attempt to make them understandable to laypeople. Even if you can’t understand what I’m trying to express in this code, the description below it should be manageable.)

SET n = 0 SET n = n + 1 SET i = n REMEMBER i IF i IS EVEN, SET i = i / 2 IF i IS ODD, SET i = (3 * i) + 1 IF i = 1, FORGET ALL NUMBERS AND GO TO LINE 2 OTHERWISE, IF i IS IN THE LIST OF REMEMBERED NUMBERS, OUTPUT n and STOP OTHERWISE, GO TO LINE 4

In words, what this program does is fairly simple: Given an even number n, it divides it by two. Given an odd number, it multiplies it by three and adds one. It repeats this process until it arrives at one or gets into a cycle (i.e. finds that this process will eventually take some number back to itself). If it arrives at one, it starts again with a higher n. If it finds a cycle (other than the obvious cycle of 1 -> 4 -> 2), it stops and outputs the number that got it into the cycle.

You can try some basic examples in your head. Starting at three:

(3 * 3) + 1 = 10 10 / 2 = 5 (5 * 3) + 1 = 16 16 / 2 = 8 8 / 2 = 4 4 / 2 = 2 2 / 1 = 1

If you keep trying examples, you’ll keep getting to one eventually, provided that you don’t make repeated mistakes. (Avoiding mistakes will be pretty hard, though. Starting at 27, for example. eventually gets you down to one, but it takes 111 steps to do so and climbs all the way up to 9,232 on the way.)

This doesn’t suffer from any of the draw-backs of my previous example. In particular, it doesn’t make reference to any number higher than three; it doesn’t have a loop that’s clearly hard-coded into it; it serves a very clear purpose; and it never repeats the same state twice.

So, does this program ever stop? If it does, then there must be some cycle, e.g. some starting point (other than 1, 2, or 4) such that repeated iterations of halving evens and multipying odds by three and adding one eventually gets you back to where you started. If it doesn’t, then either there is no such cycle or the program finds a number that leads to an unboundedly long sequence (i.e. some starting point such that repeated iterations of that function produces progressively larger numbers).

It turns out that nobody knows whether or not this very simple progarm will ever stop running. That’s not for lack of trying: Variants of this program have been run for every starting value up to 5,764,607,523,034,234,880, and so far the sequence always finds its way (slowly and clumsily) back to one. Indeed, the question of whether or not this program ever stops is (a slightly modified version of) the Collatz conjecture, a problem that mathematicians have struggled with (with almost no progress) for 75 years.

Once you encounter a seventy-five-year-old open problem in mathematics, you might consider whether you’re trying to do something that’s too hard.

In fact, what your boss has asked you to do is impossible.

Proving that it’s impossible is remarkably simple. I only need to point out three relatively simple facts first:

Computer programs that take some user-provided input (e.g. a program similar to my Collatz conjecture example except it lets the user specify the staring value) are trivially equivalent to programs that don’t take any input. Given any program that doesn’t take any input, I can make a program that does take input by simply having it ask the user for a number (or a word or a picture or a mouse movement, etc) and then promptly ignore it. Given any program that takes input together with some specific input from the user, I can create a new program that simply has the user’s input hard-coded in. (In other words, if a program contains a line “SET n = USER INPUT,” running that program with user input of seven is the same as replacing that line with “SET n = 7.”) Because of this equivalence, I’ll switch freely between programs that take input and programs that don’t take input. Computer programs themselves can be input for other programs. For example, the program that your boss asked you to run is a computer program that takes computer programs as input. More generally, any computer program is just a sequence of letters and/or numbers (0’s and 1’s if you like), and that’s a perfectly reasonable form of input. It’s this sort of behavior–computer programs talking about computer programs–that is at the heart of the halting problem. Computer programs can run other computer programs as subroutines. So, for example, computer program A might take some input and give some output. Computer program B might, as part of some longer process, simulate computer program A on some input and respond differently according to A’s output. There are many ways to show this, but perhaps the simplest is just to point out that a computer program can contain the code of another computer program as part of its own code. (In fact, much much more than this is true. A computer program can actually simulate another computer program that it’s taken as input. But, I don’t need that fact.)

So, suppose for the sake of argument that you have written a computer program that, given any other program and its input, will tell you if it runs forever. Call it H (for halting). Let’s say that H either outputs “Eventually stops” or “Runs forever” and stops after doing so. I’ll call using H to test whether program P halts on an input w “running H(P,w).” And, I’ll use the similar notation “running P(w)” to describe running some machine P on some input w. (As is standard in computer science, I’m going to abuse notation a bit.)

I’ll use H to create a new program, S (for stubborn). S takes some program P as input. Then, S runs H(P, P). (Remember that computer programs are perfectly valid input and that one program can run another as a subroutine.) If running H(P, P) outputs “Eventually stops,” then S stubbornly enters an infinite loop. (It might do this by simply mimicking the two-line program above, where each line simply leads to the other.) If H(P, P) outputs “Runs forever,” then S outputs the same and stops.

Now comes the obnoxious part: What happens when you feed S to itself as input?

Suppose S(S) outputs “Runs forever.” That only happens if H(S, S) does the same, and H(S, S) only outputs “Runs forever” if S(S) actually does run forever. But, we just assumed that S(S) outputs “Runs forever,” and when it does that, it stops.

So, S(S) doesn’t output “Runs forever.” S only ever does two things on any input, so S(S) must go into an infinite loop. But, if S(S) goes into an infinite loop, then H(S, S) must output “Runs forever” because, well, that’s what H does. But if H(S, S) outputs “Runs forever,” then by the definition of S, S(S) has to output that as well.

Annnndd, we’ve reached a contradiction.

So, it turns out that H simply can’t exist. In other words, there is no computer program that decides the halting problem–no program that will tell you decisively whether any computer program will stop or run forever.

At first, it might be hard to understand exactly what this means. Obviously, it’s sometimes possible to tell if a program will stop or not. Sometimes, you can tell easily by looking at the code (like in my first example). Sometimes, you can tell by looking it up on Wikipedia (like in my second example). If the program does eventually stop, then you can figure that out simply by running it and then waiting (though you might have to wait longer than the age of the universe or something similarly ridiculous in some cases).

Even in my Collatz conjecture example, it seems like there’s a big difference between not knowing the answer and not being able to know the answer. Even there, and in every example, there is an answer. Programs either run forever or they don’t; the two are mutually exclusive. In fact, if I want to write a program that solves the halting problem for simply one program on one input, that’s trivially simple. All I need to do is write one computer program that always says “Eventually stops” and another that always says “Runs forever.” One of them will be right, so one of them will solve the halting problem for that instance.

The problem is that no program can decide the halting problem in ALL instances. The above proof essentially points out that there are infinitely many kinds of loops that a program can find itself in. In some sense, there are infinitely many problems like the Collatz conjecture, and they don’t follow any real pattern–There’s no general solution to all of them. While you may in principle be able to solve each one of them separately, you can’t solve all together.

In some sense, this is pretty disappointing. The halting problem seems so incredibly simple, but it’s provably impossible to solve. This has huge practical implications as well: Microsoft really would like it if Windows 8 could figure out whether or not a program was going to run forever.

At the same time, though, this is a very nice proof that the abstract world–a world that many people consider to be extremely simple and mechanical–is wonderfully strange. Humans have built a world of abstract symbol manipulation and discovered that it, in some sense, it’s not just a world of abstract symbol manipulation. In fact, you can prove that inside that world!

As it turns out, the fact that the haling problem is unsolvable has wide-ranging consequences. In particular, the existence of just one problem that’s impossible to solve implies that lots of problems are impossible to solve. I’ll address this in a later post.