In computation, universality simply means that a process can simulate all processes, including itself. By simulation, we mean copying the behavior of a process with as much fidelity as we would like. At some point, if it looks like a duck, quacks like a duck, and walks like a duck, we stop and consider it a duck for all practical purposes. (There, I put the Turing test for artificial general intelligence in a nutshell for you.) Replace “processes” with “machines,” and you roughly see how computers work: a universal machine is a machine that can simulate all machines, including itself. You can think of a machine simply as a process that transforms an input to an output following a fixed set of rules.

So, this is why iPhones and Androids, or laptops and supercomputers, can essentially run the same software, despite their superficial differences in hardware. This is also why you can simulate Windows inside a virtual machine on your Mac, play old Atari video games on an Intel PC, or mine Bitcoin on a half-a-century-old IBM mainframe. (Do you see which part of the definition of universality gives rise to virtual machines?)

Alan Turing hit upon this critical observation when he designed his machines, which are the precursors of the computer you are reading this on today. When he first devised what we now call Turing machines, he started with the simplest thing to do: one machine to compute one specific thing, say, a particular real number you cared about, and that is all that machine could ever do. (Real numbers include integers, fractions, and irrational numbers.) The problem is that there are countably infinitely many such machines, one for every computable real number. (Countably infinite just means as many as all of the counting numbers you learned in kindergarten.) He quickly realized that if we needed a different smartphone for every app we wanted to run, they wouldn’t be, well, so smart at all, and Apple certainly wouldn’t be as rich as they are now. (I remember when I was a kid, we had different machines for watching movies, listening to music, and playing video games. Now, we have machines that can do all of that, and make espresso for you. Try traveling back in time and telling that to kids in the 80s.) So, being the genius that he was, he also saw the “simple” solution to the problem: one machine that can do what any other machine, including itself, can do, so long as it has all the space and time in the world. This is a really great idea in so many ways, which is why I believe computers wouldn’t be in production today if it weren’t for how practical Turing was: he fixed his own bicycles, you know.
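To make the idea of one machine that can pretend to be any other a bit more concrete, here is a minimal Python sketch of my own (not Turing's original construction). The function `run_turing_machine` plays the universal machine: it knows nothing about flipping bits or adding one, yet it behaves like either machine once it is handed that machine's rule table. The names `FLIP_BITS` and `INCREMENT`, the `"halt"` state, and the `max_steps` cap (a stand-in for "all the space and time in the world") are assumptions made for illustration.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Simulate a single-tape Turing machine described by `rules`.

    `rules` maps (state, symbol) -> (symbol_to_write, move, next_state),
    where move is "L" or "R". The machine stops when it enters "halt".
    """
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted before the machine halted")
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Machine 1: flip every bit, then halt at the first blank cell.
FLIP_BITS = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

# Machine 2: add one to a binary number.
# Walk to the right end, then carry back toward the left.
INCREMENT = {
    ("start", "0"): ("0", "R", "start"),
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("_", "L", "carry"),
    ("carry", "1"): ("0", "L", "carry"),
    ("carry", "0"): ("1", "L", "halt"),
    ("carry", "_"): ("1", "L", "halt"),
}

print(run_turing_machine(FLIP_BITS, "1011"))  # 0100
print(run_turing_machine(INCREMENT, "1011"))  # 1100
```

The same simulator runs both machines; only the rule table, which is just data, changes.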

Really, you can think of a universal machine as this guy:

Basically how programming works, as anyone in the industry can attest: “keep talking and nobody explodes.”

This is basically what we mean by programming a computer. We are giving a really dumb but universal machine a tedious and explicit series of instructions, called software, by which it can now pretend to be another machine. (You see how software really just encodes some hardware now?) The guy, or equivalently, the computer, may not know how to, say, defuse a bomb, but you just tell him precisely what to do. The great Richard Feynman has a brilliant analogy for how universality works: fast but dumb clerks who only know how to, say, add numbers can still simulate slow but smart clerks who know how to both add and multiply them. (No offense to clerks everywhere.) An interesting exercise is to find the smallest set of instructions (e.g., move data from one place to another) that is Turing-complete, i.e., that you can use to build a universal machine.

Feynman’s lektchur to a bunch of hippies on how computers work.
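One well-known answer to the exercise above is a machine with a single instruction, usually called SUBLEQ ("subtract and branch if less than or equal to zero"), which is Turing-complete as long as you grant it unbounded memory. Below is a hedged sketch in Python. The memory layout, the convention that a negative address means "halt", and the name `run_subleq` are my assumptions for illustration, not a standard. In the spirit of Feynman's clerks, the program adds two numbers on a machine that only knows how to subtract.

```python
def run_subleq(memory, max_steps=10_000):
    """Run a SUBLEQ program stored in `memory` and return the final memory.

    Every instruction is three cells a, b, c meaning:
        mem[b] -= mem[a]; if mem[b] <= 0, jump to c, else go to the next instruction.
    A negative program counter halts the machine (an assumed convention).
    """
    mem = list(memory)
    pc = 0
    steps = 0
    while pc >= 0:
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted before the program halted")
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return mem


# Cells 0-8 hold three instructions; cells 9-11 hold the data X, Y, Z.
X, Y, Z = 9, 10, 11
program = [
    X, Z, 3,    # Z = Z - X, so Z becomes -X
    Z, Y, 6,    # Y = Y - Z, so Y becomes Y + X
    Z, Z, -1,   # Z = 0, which is <= 0, so jump to -1 and halt
    7, 35, 0,   # X = 7, Y = 35, Z = 0
]

result = run_subleq(program)
print(result[Y])  # 42
```

Addition falls out of two subtractions and a scratch cell. Multiplication, loops, and everything else can be built the same tedious way, which is the point of the exercise.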

You can also think of universality in terms of translation between different languages. You might have heard of various programming languages such as C, Python, JavaScript, Go, and so on. On the surface, they all seem very different. However, while some programming languages are nicer than others for expressing some things (for roughly the same reason some types of cursing feel more satisfying in some human languages than others), none of them is computationally more powerful than another. In other words, there is no programming language that can express a computation that another cannot. Practically every general-purpose programming language is universal. Why? Well, think of the universal machine as a translator or interpreter. You can always take a Python program and translate it into an equivalent Go program, and vice versa. This is a useful exercise for the programmers among you.
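Translating Python into Go by hand is a big job, so here is the same idea in miniature: a hedged Python sketch that translates programs written in Brainfuck, a deliberately tiny language with only eight instructions, into equivalent Python source and then runs the result. The function names `translate_to_python` and `run_translated` are mine, the 30,000-cell tape and wrap-around at 256 are common conventions rather than requirements, and the input instruction `,` is left out to keep things short.

```python
def translate_to_python(source):
    """Translate a Brainfuck program (without ',') into Python source text."""
    templates = {
        ">": "ptr += 1",
        "<": "ptr -= 1",
        "+": "tape[ptr] = (tape[ptr] + 1) % 256",
        "-": "tape[ptr] = (tape[ptr] - 1) % 256",
        ".": "out.append(chr(tape[ptr]))",
    }
    lines = ["tape = [0] * 30000", "ptr = 0", "out = []"]
    indent = 0
    for ch in source:
        if ch in templates:
            lines.append("    " * indent + templates[ch])
        elif ch == "[":
            lines.append("    " * indent + "while tape[ptr] != 0:")
            indent += 1
            lines.append("    " * indent + "pass")  # keeps empty loops valid
        elif ch == "]":
            if indent == 0:
                raise ValueError("unmatched ']'")
            indent -= 1
        # any other character is treated as a comment and skipped
    if indent != 0:
        raise ValueError("unmatched '['")
    return "\n".join(lines)


def run_translated(source):
    """Run the translated program and return whatever it printed."""
    namespace = {}
    exec(translate_to_python(source), namespace)
    return "".join(namespace["out"])


# 8 * 8 + 1 = 65, the character code for "A".
print(run_translated("++++++++[>++++++++<-]>+."))  # A
```

Each instruction in one language maps to a line in the other, and loops map to loops. Going the other way (Python into Brainfuck) is possible too, just far more painful.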

Universality may sound simple and obvious in hindsight, yet it is anything but. Roughly stated, the Church-Turing thesis postulates that any type of computer you can come up with is computationally no more powerful than those slow, clunky, and deterministic Turing machines. That is, the Turing machine can do everything the other computer can do. (This is what Turing and Alonzo Church found out about the lambda calculus, an independently developed and much less intuitive model of computation that inspired the Lisp family of programming languages.) Note that we don’t care about efficiency here, or how long the Turing machine takes to compute the same thing, although there are variants of the thesis that do care about it.
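To get a taste of why the lambda calculus feels less intuitive, here is a small sketch in Python, whose `lambda` keyword is borrowed from it. Numbers are represented as functions (so-called Church numerals): the number n is "apply a function n times". The names `zero`, `succ`, `add`, `mul`, and the helper `to_int` are just labels I chose, and `to_int` steps outside the pure calculus only so we can read the answer.

```python
# A Church numeral n takes a function f and returns "f applied n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

# Addition: apply f n times, then m more times.
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

# Multiplication: repeat "apply f n times" m times.
mul = lambda m: lambda n: lambda f: m(n(f))


def to_int(n):
    """Convert a Church numeral to a Python int by counting applications."""
    return n(lambda k: k + 1)(0)


two = succ(succ(zero))
three = succ(two)

print(to_int(add(two)(three)))  # 5
print(to_int(mul(two)(three)))  # 6
```

There are no tapes, states, or even built-in numbers here, only functions applied to functions, and yet arithmetic comes out the other end.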

The Church-Turing thesis does not have a proof, and may not even be provable, but it could certainly be disproved: all it would take is one machine that computes something no Turing machine can. Nevertheless, it has been nearly a century since it was formulated, and a working counterexample has yet to be found. Even though quantum computers are suspected to be faster for some problems (e.g., factoring large numbers, which would break some types of cryptography), they are not more powerful in the sense above: a Turing machine can simulate a quantum computer, just (as far as we know) much more slowly. I don’t think anybody really knows why we can’t seem to do any better right now. It could be a temporary limitation due to gaps in knowledge or technology, or, more intriguingly, a fundamental limit imposed by Nature.