
You remember prime numbers, right? Those whole numbers greater than 1 that can’t be divided evenly by anything except themselves and 1? Right. Here is a 3000-year-old question:

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, p. What is p? 31. What is the next p? It’s 37. The p after that? 41. And then? 43. How, but… how do you know what comes next?

Present an argument or formula which (even barely) predicts what the next prime number will be (in any given sequence of numbers), and your name will be forever linked to one of the greatest achievements of the human mind, alongside those of Newton, Einstein and Gödel. Figure out why the primes act as they do, and you will never have to do anything else, ever again.

Introduction

The properties of the prime numbers have been studied by many of history’s mathematical giants. Their work ranges from Euclid’s first proof of the infinitude of the primes to Euler’s product formula, which connected the prime numbers to the zeta function, and from Gauss and Legendre’s formulation of the prime number theorem to its proof by Hadamard and de la Vallée Poussin. Even so, Bernhard Riemann still reigns as the mathematician who made the single biggest breakthrough in prime number theory. His work, all contained in an 8-page paper published in 1859, made new and previously unknown discoveries about the distribution of the primes and is to this day considered one of the most important papers in number theory.

Since its publication, Riemann’s paper has been the main focus of prime number theory, and it was indeed the main reason the prime number theorem was proved in 1896. Since then several new proofs have been found, including elementary proofs by Selberg and Erdős. Riemann’s hypothesis about the roots of the zeta function, however, remains a mystery.

How many primes are there?

Let’s start off easy. We all know that a whole number greater than 1 is either prime or composite. All composite numbers are made up of, and can be broken down (factorized) into, a product (a x b) of prime numbers. Prime numbers are in this way the “building blocks” or “fundamental elements” of numbers. They were proven to be infinite in number by Euclid, around 300 BCE. His elegant proof goes as follows:

Euclid’s theorem

Assume that the set of prime numbers is not infinite. Make a list of all the primes. Next, let P be the product of all the primes in the list (multiply all the primes in the list). Add 1 to the resulting number: Q = P + 1. Like every whole number greater than 1, this number Q has to be either prime or composite:

- If Q is prime, you’ve found a prime that was not in your “list of all the primes”, since Q is larger than every number on the list.
- If Q is not prime, it is composite, i.e. made up of prime numbers, one of which, p, divides Q (since all composite numbers are products of prime numbers). Every prime on your list divides P. If p were on the list, it would divide both P and Q, and so it would also have to divide the difference between the two, which is 1. No prime number divides 1, so p cannot be on your list, which again contradicts the assumption that your list contains all the prime numbers.

Either way, there will always be another prime that is not on the list. Therefore there must be infinitely many prime numbers.
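The construction in the proof can be tried out directly. The following is a minimal sketch, not part of Euclid’s argument itself: the function names `smallest_prime_factor` and `euclid_witness` are made up for illustration, and trial division is simply the easiest way to find a prime factor of Q for small inputs.

```python
from math import prod


def smallest_prime_factor(n: int) -> int:
    """Return the smallest prime factor of n (n >= 2) by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor found, so n itself is prime


def euclid_witness(primes: list[int]) -> int:
    """Given a finite list of primes, return a prime that is not in the list."""
    q = prod(primes) + 1  # Q = P + 1, as in the proof
    return smallest_prime_factor(q)


print(euclid_witness([2, 3, 5]))             # 31: here Q is itself prime
print(euclid_witness([2, 3, 5, 7, 11, 13]))  # 59: here Q = 30031 = 59 x 509
```

Note that Q is not always prime. In the second call Q is composite, but its prime factors are still missing from the list, which is all the proof needs.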

Why are primes so hard to understand?

The mere fact that any novice understands the problem I laid out above speaks volumes about how difficult it is. Even the arithmetic properties of primes, while heavily studied, are still poorly understood. The scientific community is so confident in our inability to understand how prime numbers behave that the difficulty of factoring large numbers (figuring out which two primes multiply together to make a number) is one of the very foundations of encryption theory. Here’s one way of looking at it:

We understand composite numbers well. Those are all the non-primes. They are made up of primes, but you can easily write a formula that predicts and/or generates composites. Such a “composite filter” is called a sieve. The most famous example is the “Sieve of Eratosthenes” from c. 200 BCE. What it does is simply mark the multiples of each prime up to a set limit. So, take the prime 2, and mark 4, 6, 8, 10 and so on. Next, take 3, and mark 6, 9, 12, 15 and so on. The numbers left unmarked are the primes. Although very simple to understand, the sieve of Eratosthenes is, as you can imagine, not very efficient.
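The marking procedure described above can be sketched in a few lines of Python. This is an illustrative version that follows the text literally (it starts marking at 2p for each prime p); the function name `sieve_of_eratosthenes` and the chosen limit are assumptions for the example.

```python
def sieve_of_eratosthenes(limit: int) -> list[int]:
    """Return all primes up to and including limit."""
    if limit < 2:
        return []
    # unmarked[k] stays True until k is marked as a multiple of a prime
    unmarked = [True] * (limit + 1)
    unmarked[0] = unmarked[1] = False  # 0 and 1 are not prime
    for p in range(2, limit + 1):
        if unmarked[p]:  # p was never marked, so it is prime
            for multiple in range(2 * p, limit + 1, p):
                unmarked[multiple] = False  # mark 2p, 3p, 4p, ...
    return [k for k, flag in enumerate(unmarked) if flag]


print(sieve_of_eratosthenes(43))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43]
```

A common refinement is to start marking at p x p instead of 2p, since every smaller multiple of p has already been marked by a smaller prime.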

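One function simplifying your work significantly is 6n +/- 1. Every whole number can be written as 6n, 6n + 1, 6n + 2, 6n + 3, 6n + 4 or 6n + 5; the forms 6n, 6n + 2 and 6n + 4 are even, and 6n + 3 is a multiple of 3, so any prime other than 2 and 3 must be of the form 6n + 1 or 6n + 5 (which is the same as 6n - 1). This simple function therefore spits out all primes except 2 and 3, and removes all multiples of 3 and all even numbers. Plug in n = 1, 2, 3, 4, 5, 6, 7 and behold the result: 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37, 41, 43. The only non-prime numbers generated by the function are 25 and 35, which can be factorized into 5 x 5 and 5 x 7, respectively. The next non-primes are, as you can imagine, 49 = 7 x 7, 55 = 5 x 11 and so on. Simple, right?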
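To check the numbers above, here is a small sketch that generates the 6n +/- 1 candidates and picks out the ones that are not prime. The helper names `candidates` and `is_prime` are illustrative, and the primality check is plain trial division.

```python
def candidates(n_max: int) -> list[int]:
    """Return 6n - 1 and 6n + 1 for n = 1, 2, ..., n_max, in increasing order."""
    out = []
    for n in range(1, n_max + 1):
        out.extend((6 * n - 1, 6 * n + 1))
    return out


def is_prime(k: int) -> bool:
    """Trial-division primality check, adequate for small k."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


print(candidates(7))
# [5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37, 41, 43]
print([c for c in candidates(7) if not is_prime(c)])  # [25, 35]
print([c for c in candidates(9) if not is_prime(c)])  # [25, 35, 49, 55]
```

The function does not predict primes; it only narrows the search to one third of all whole numbers, and the composites that slip through are products of primes larger than 3.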

Illustrating this visually, I’ve used something that I’m calling “composite ladders”, a simple way to see how the composite numbers generated by the function are laid out for each prime, and combined. In the first three columns of the image below, you neatly see the prime numbers 5, 7 and 11 with each respective composite ladder up to and including 91. The chaos of the fourth column, showing how the sieve has removed all but the prime numbers, is a fair illustration of why prime numbers are so hard to understand.
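The ladders can also be approximated in code. This is only a sketch of one possible reading of the image: it assumes that the ladder for a prime p consists of every multiple of p (beyond p itself) that has the form 6n +/- 1, and the name `ladder` is made up for the example.

```python
def ladder(p: int, limit: int) -> list[int]:
    """Multiples of p, beyond p itself, that have the form 6n - 1 or 6n + 1."""
    return [k for k in range(2 * p, limit + 1, p) if k % 6 in (1, 5)]


LIMIT = 91
# All numbers of the form 6n +/- 1 up to the limit
candidates = [k for k in range(5, LIMIT + 1) if k % 6 in (1, 5)]
# One ladder per prime, matching the first three columns
ladders = {p: ladder(p, LIMIT) for p in (5, 7, 11)}
# Combine the ladders and keep whatever was never marked
marked = set().union(*ladders.values())
survivors = [k for k in candidates if k not in marked]

for p, rungs in ladders.items():
    print(p, rungs)
# 5 [25, 35, 55, 65, 85]
# 7 [35, 49, 77, 91]
# 11 [55, 77]
print(survivors)
```

Each ladder on its own is perfectly regular. The irregular pattern only shows up in `survivors`, once the ladders are overlaid and removed.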