This morning on Twitter, Alexander Bogomolny posted a link to his article that gives examples of words that are prime numbers when interpreted as numbers in base 36. Some examples are “Brooklyn”, “paleontologist”, and “deodorant.” (Numbers in base 36 are written using 0, 1, 2, …, 9, A, B, C, …, Z as “digits.” )

Tim Hopper replied with a snippet of Mathematica code that lists all words with up to four letters that correspond to base 36 primes.

Rest[ Flatten[ Union[ DictionaryLookup /@ IntegerString[ Table[Prime[n], {n, 1, 300000}], 36]]]]

That made me wonder whether you could estimate how many such words there are without doing an exhaustive search.

The Prime Number Theorem says that the probability of a number less than N being prime is approximately 1/log(N). If we knew how many English words there were of a certain length, then we could guess that 1/log(N) of that those words would be prime when interpreted as base 36 numbers. This assumes that forming an English word and being prime have independent probabilities, which may be approximately true.

How well would our guess have worked on Tim’s example? He prints out all the words corresponding to the first 300,000 primes. The last of these primes is 4,256,233. The exact probability that a number less than that upper limit is prime is then

300,000 / 4,256,233 ≈ 0.07.

There are about 4200 English words with four or fewer letters. (I found this out by running

grep -ciE '^[a-z]{1,4}$'

on the words file on a Linux box. See similar tricks here.) If we estimate that 7% of these are prime, we’d expect 294 words from Tim’s program. His program produces 275 words, so our prediction is pretty good.

If we didn’t know the exact probability of a number in our range being prime, we could have estimated the probability at

1/log(4,256,233) ≈ 0.0655

using the Prime Number Theorem. Using this approximation we’d estimate 4200*0.0655 = 275.1 words; our estimate would be exactly correct! There’s good reason to believe our estimate would be reasonably close, but we got lucky to get this close.

More prime number posts