There has been a lot of recent interest in the abc conjecture, since the release a few weeks ago of the last of a series of papers by Shinichi Mochizuki which, as one of its major applications, claims to establish this conjecture. It’s still far too early to judge whether this proof is likely to be correct or not (the entire argument runs to several hundred pages, mostly in the area of anabelian geometry, in which very few mathematicians are expert, to the extent that we do not yet have even a full outline of the proof strategy), and I don’t have anything substantial to add to the existing discussion around that conjecture. (But, for those that are interested, the Polymath wiki page on the ABC conjecture has collected most of the links to that discussion, and to various background materials.)

In the meantime, though, I thought I might give the standard probabilistic heuristic argument that explains why we expect the ABC conjecture to be true. The underlying heuristic is a common one, used throughout number theory, and it can be summarised as follows:

Heuristic 1 (Probabilistic heuristic) Even though number theory is a deterministic subject (one does not need to roll any dice to factorise a number, or figure out if a number is prime), one expects to get a good asymptotic prediction for the answers to many number-theoretic questions by pretending that various number-theoretic assertions $E$ (e.g. that a given number $n$ is prime) are probabilistic events (with a probability $\mathbf{P}(E)$ that can vary between $0$ and $1$) rather than deterministic events (that are either always true or always false). Furthermore:

(Basic heuristic) If two or more of these heuristically probabilistic events have no obvious reason to be strongly correlated to each other, then we should expect them to behave as if they were (jointly) independent.

(Advanced heuristic) If two or more of these heuristically probabilistic events have some obvious correlation between them, but no further correlations are suspected, then we should expect them to behave as if they were conditionally independent, relative to whatever data is causing the correlation.

This is, of course, an extremely vague and completely non-rigorous heuristic, requiring (among other things) a subjective and ad hoc determination of what an “obvious reason” is, but in practice it tends to give remarkably plausible predictions, some fraction of which can in fact be backed up by rigorous argument (although in many cases, the actual argument has almost nothing in common with the probabilistic heuristic). A famous special case of this heuristic is the Cramér random model for the primes, but this is not the only such instance for that heuristic.

To give the most precise predictions, one should use the advanced heuristic in Heuristic 1, but this can be somewhat complicated to execute, and so we shall focus instead on the predictions given by the basic heuristic (thus ignoring the presence of some number-theoretic correlations), which tends to give predictions that are quantitatively inaccurate but still reasonably good at the qualitative level.

Here is a basic “corollary” of Heuristic 1:

Heuristic 2 (Heuristic Borel-Cantelli) Suppose one has a sequence $E_1, E_2, \ldots$ of number-theoretic statements, which we heuristically interpret as probabilistic events with probabilities $\mathbf{P}(E_1), \mathbf{P}(E_2), \ldots$. Suppose also that we know of no obvious reason for these events to have much of a correlation with each other. Then:

If $\sum_{i=1}^\infty \mathbf{P}(E_i) < \infty$, we expect only finitely many of the statements $E_i$ to be true. (And if $\sum_{i=1}^\infty \mathbf{P}(E_i)$ is much smaller than $1$, we in fact expect none of the $E_i$ to be true.)

If $\sum_{i=1}^\infty \mathbf{P}(E_i) = \infty$, we expect infinitely many of the statements $E_i$ to be true.

This heuristic is motivated both by the Borel-Cantelli lemma, and by the standard probabilistic computation that if one is given jointly independent, and genuinely probabilistic, events $E_1, E_2, \ldots$ with $\sum_{i=1}^\infty \mathbf{P}(E_i) = \infty$, then one almost surely has an infinite number of the $E_i$ occurring.
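To see the dichotomy in the genuinely probabilistic setting, here is a minimal simulation sketch. The function names, the fixed seed, and the two test sequences (probability $1/i^2$ for the summable case and $1/i$ for the divergent case) are illustrative choices made here, not part of the heuristic itself.

```python
import random


def expected_count(prob, n_events):
    """Sum of P(E_i) for i = 1..n_events: the expected number of events that occur."""
    return sum(prob(i) for i in range(1, n_events + 1))


def simulate_count(prob, n_events, seed=0):
    """Simulate independent events E_1..E_n with P(E_i) = prob(i); count how many occur."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    return sum(1 for i in range(1, n_events + 1) if rng.random() < prob(i))


if __name__ == "__main__":
    for n_events in (10**2, 10**3, 10**4, 10**5):
        summable = simulate_count(lambda i: 1.0 / i**2, n_events)
        divergent = simulate_count(lambda i: 1.0 / i, n_events)
        print(n_events, summable, divergent)
```

Because the same seed is reused, each run extends the previous one, so the count in the summable case can only stay the same or increase as the number of events grows; one would expect it to stabilise quickly, while the count in the divergent case should keep creeping upwards at a roughly logarithmic rate.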

Before we get to the ABC conjecture, let us give two simpler (and well known) demonstrations of these heuristics in action:

Example 1 (Twin prime conjecture) One can heuristically justify the twin prime conjecture as follows. Using the prime number theorem, one can heuristically assign a probability of $1/\log n$ to the event that any given large integer $n$ is prime. In particular, the probability that $n+2$ is prime will then be $1/\log(n+2)$. Making the assumption that there are no strong correlations between these events, we are led to the prediction that the probability that $n$ and $n+2$ are simultaneously prime is $\frac{1}{(\log n)(\log(n+2))}$. Since $\sum_{n=2}^\infty \frac{1}{(\log n)(\log(n+2))} = \infty$, the Borel-Cantelli heuristic then predicts that there should be infinitely many twin primes.

Note that the above argument is a bit too naive, because there are some non-trivial correlations between the primality of $n$ and the primality of $n+2$. Most obviously, if $n$ is prime, this greatly increases the probability that $n$ is odd, which implies that $n+2$ is odd, which then elevates the probability that $n+2$ is prime. A bit more subtly, if $n$ is prime, then $n$ is likely to avoid the residue class $0 \hbox{ mod } 3$, which means that $n+2$ avoids the residue class $2 \hbox{ mod } 3$, which ends up decreasing the probability that $n+2$ is prime. However, there is a standard way to correct for these local correlations; see for instance this previous blog post. As it turns out, these local correlations ultimately alter the prediction for the asymptotic density of twin primes by a constant factor (the twin prime constant), but do not affect the qualitative prediction of there being infinitely many twin primes.
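As a sanity check on the twin prime heuristic, the following sketch compares the actual number of twin prime pairs up to a limit with the naive prediction from the basic heuristic. The helper names are hypothetical, and the corrected prediction assumes the standard Hardy-Littlewood normalisation, in which the naive count is multiplied by twice the twin prime constant (approximately $0.6601618$); that numerical value is supplied here and is not taken from the discussion above.

```python
from math import log


def prime_sieve(limit):
    """Return a list is_prime[0..limit] via the sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit**0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return is_prime


def count_twin_primes(limit):
    """Count pairs (n, n + 2) of primes with n + 2 <= limit."""
    is_prime = prime_sieve(limit)
    return sum(1 for n in range(2, limit - 1) if is_prime[n] and is_prime[n + 2])


def naive_prediction(limit):
    """Sum of 1 / (log n * log(n + 2)) over 3 <= n <= limit - 2 (basic heuristic)."""
    return sum(1.0 / (log(n) * log(n + 2)) for n in range(3, limit - 1))


# Assumed approximate value of the twin prime constant; the corrected
# prediction multiplies the naive one by 2 * TWIN_PRIME_CONSTANT.
TWIN_PRIME_CONSTANT = 0.6601618

if __name__ == "__main__":
    for limit in (10**3, 10**4, 10**5):
        naive = naive_prediction(limit)
        corrected = 2 * TWIN_PRIME_CONSTANT * naive
        print(limit, count_twin_primes(limit), round(naive), round(corrected))
```

The naive sum already has the right order of magnitude; the constant factor only adjusts the quantitative fit, in line with the remark that local correlations do not change the qualitative prediction.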

Example 2 (Fermat’s last theorem) Let us now heuristically count the number of solutions to $x^n + y^n = z^n$ for various $n$ and natural numbers $x, y, z$ (which we can reduce to be coprime if desired). We recast this (in the spirit of the ABC conjecture) as $a + b = c$, where $a, b, c$ are $n^{th}$ powers. The number of $n^{th}$ powers up to any given number $N$ is about $N^{1/n}$, so heuristically any given natural number $a$ has a probability about $a^{\frac{1}{n}-1}$ of being an $n^{th}$ power. If we make the naive assumption that (in the coprime case at least) there is no strong correlation between the events that $a$ is an $n^{th}$ power, $b$ is an $n^{th}$ power, and $a+b$ being an $n^{th}$ power, then for typical $a, b$, the probability that $a, b, a+b$ are all simultaneously $n^{th}$ powers would then be $a^{\frac{1}{n}-1} b^{\frac{1}{n}-1} (a+b)^{\frac{1}{n}-1}$. For fixed $n$, the total number of solutions to the Fermat equation would then be predicted to be

$$\sum_{a=1}^\infty \sum_{b=1}^\infty a^{\frac{1}{n}-1} b^{\frac{1}{n}-1} (a+b)^{\frac{1}{n}-1}.$$

(Strictly speaking, we need to restrict to the coprime case, but given that a positive density of pairs of integers are coprime, it should not affect the qualitative conclusion significantly if we now omit this restriction.) It might not be immediately obvious as to whether this sum converges or diverges, but (as is often the case with these sorts of unsigned sums) one can clarify the situation by dyadic decomposition. Suppose for instance that we consider the portion of the sum where $c = a+b$ lies between $2^k$ and $2^{k+1}$. Then this portion of the sum can be controlled by

$$\sum_{a \leq 2^{k+1}} \sum_{b \leq 2^{k+1}} a^{\frac{1}{n}-1} b^{\frac{1}{n}-1} O\left( (2^k)^{\frac{1}{n}-1} \right)$$

which simplifies to

$$O\left( 2^{(\frac{3}{n}-1)k} \right).$$

Summing in $k$, one thus expects infinitely many solutions for $n=2$, only finitely many solutions for $n>3$ (indeed, a refinement of this argument shows that one expects only finitely many solutions even if one considers all $n>3$ at once), and a borderline prediction of there being a barely infinite number of solutions when $n=3$. Here is of course a place where a naive application of the probabilistic heuristic breaks down; there is enough arithmetic structure in the equation $x^3+y^3=z^3$ that the naive probabilistic prediction ends up being an inaccurate model. Indeed, while this heuristic suggests that a typical homogeneous cubic should have a logarithmic number of integer solutions of a given height $N$, it turns out that some homogeneous cubics (namely, those associated to elliptic curves of positive rank) end up with the bulk of these solutions, while other homogeneous cubics (including those associated to elliptic curves of zero rank, including the Fermat curve $x^3+y^3=z^3$) only get finitely many solutions. The reasons for this are subtle, but certainly the high degree of arithmetic structure present in an elliptic curve (starting with the elliptic curve group law which allows one to generate new solutions from old ones, and which also can be used to exclude solutions to $x^3+y^3=z^3$ via the method of descent) is a major contributing factor.
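The three regimes in the Fermat example can be illustrated with a small brute-force count of coprime solutions, alongside the exponent of the dyadic block estimate. This is only a sketch: the function names are hypothetical, the search range is tiny, and the float-based root extraction is only intended for the small inputs used here.

```python
from math import gcd


def integer_root(m, n):
    """Return r with r**n == m, or None if m is not a perfect n-th power."""
    r = round(m ** (1.0 / n))
    for candidate in (r - 1, r, r + 1):  # guard against floating point rounding
        if candidate >= 0 and candidate**n == m:
            return candidate
    return None


def count_coprime_solutions(n, z_max):
    """Count coprime pairs x < y with x**n + y**n == z**n for some z <= z_max."""
    count = 0
    for x in range(1, z_max):
        for y in range(x + 1, z_max):
            if gcd(x, y) != 1:
                continue
            z = integer_root(x**n + y**n, n)
            if z is not None and z <= z_max:
                count += 1
    return count


def heuristic_block_exponent(n):
    """Exponent e such that the dyadic block c ~ 2**k contributes about 2**(e * k)."""
    return 3.0 / n - 1


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, heuristic_block_exponent(n), count_coprime_solutions(n, 200))
```

A positive exponent (the case of squares) corresponds to a growing supply of solutions, a negative exponent to a convergent sum, and the zero exponent for cubes is the borderline case where the naive prediction turns out to be misleading.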

Below the fold, we apply similar heuristics to suggest the truth of the ABC conjecture.

— 1. The ABC conjecture —

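The ABC conjecture asserts that for every $\varepsilon > 0$, one has

$$\operatorname{rad}(abc) \gg_\varepsilon c^{1-\varepsilon}$$

whenever $a, b, c$ are coprime natural numbers with $a + b = c$, and $\operatorname{rad}(abc)$ is the product of all the primes dividing $abc$. Since $a, b, c$ are coprime, $\operatorname{rad}(abc) = \operatorname{rad}(a) \operatorname{rad}(b) \operatorname{rad}(c)$, and an equivalent formulation of the conjecture is as follows: if $\alpha, \beta, \gamma > 0$ are real numbers with $\alpha + \beta + \gamma < 1$, then for sufficiently large $N$, there does not exist any solution to the equation $a + b = c$ with $c \sim N$ (say), $a, b, c$ coprime, and

$$\operatorname{rad}(a) \sim N^\alpha; \quad \operatorname{rad}(b) \sim N^\beta; \quad \operatorname{rad}(c) \sim N^\gamma.$$

Indeed, one can deduce the former version of the ABC conjecture from the latter by using a finite mesh of triples $(\alpha, \beta, \gamma)$ with spacing equal to a small multiple of $\varepsilon$; we leave the details to the interested reader.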

We can now try to randomly find counterexamples to the abc conjecture as follows:

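1. Pick a large $N$ (say, a power of two).
2. Pick coprime squarefree numbers $A \sim N^\alpha$, $B \sim N^\beta$, $C \sim N^\gamma$.
3. Pick numbers $a, b, c \ll N$ with $\operatorname{rad}(a) = A$, $\operatorname{rad}(b) = B$, $\operatorname{rad}(c) = C$, with $c$ comparable to $N$.
4. Check if $a + b = c$.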

Steps 1, 2, and 4 are easy to apply probabilistic heuristics to. For Step 3, we need the following lemma, reminiscent of the classical divisor bound:

Lemma 3 Let $A$ be a square-free integer. Then there are at most $N^{o(1)}$ integers less than $N$ with radical $A$.

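Proof: Factorise $A = p_1 \cdots p_k$. We are trying to establish the bound $S \leq N^{o(1)}$, where $S$ is the number of numbers less than $N$ which are divisible by $p_1, \ldots, p_k$ but by no other primes, that is to say they lie in the set $\{ p_1^{a_1} \cdots p_k^{a_k} : a_1, \ldots, a_k \geq 1 \}$. To do this, it is convenient to work with Dirichlet series, and specifically with the sum

$$\sum_{a_1, \ldots, a_k \geq 1} \frac{1}{(p_1^{a_1} \cdots p_k^{a_k})^{\varepsilon}}$$

for some parameter $\varepsilon > 0$ to be optimised in later. On the one hand, this expression is at least $S / N^{\varepsilon}$. On the other hand, we have the factorisation

$$\sum_{a_1, \ldots, a_k \geq 1} \frac{1}{(p_1^{a_1} \cdots p_k^{a_k})^{\varepsilon}} = \prod_{i=1}^k \frac{1}{p_i^{\varepsilon} - 1}.$$

We crudely bound

$$\frac{1}{p_i^{\varepsilon} - 1} \leq C_\varepsilon$$

for some $C_\varepsilon$ depending only on $\varepsilon$, using the trivial bound $p_i \geq 2$ and the geometric series formula, leading to

$$S \leq C_\varepsilon^k N^{\varepsilon}.$$

On the other hand, since

$$p_1 \cdots p_k = A \leq N$$

(we may assume $A \leq N$, as otherwise $S = 0$), we see from the prime number theorem (which gives a lower bound on the sum of the logarithms of the first $k$ primes, which in turn lower bounds $\log(p_1 \cdots p_k)$) that

$$k \ll \frac{\log N}{\log \log N}$$

and so

$$S \leq N^{\varepsilon + o(1)}$$

for any fixed $\varepsilon > 0$; as $\varepsilon$ can be arbitrarily small, the claim follows.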
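To get a feel for how few integers share a given radical, here is a brute-force sketch (with hypothetical function names) that counts the integers below a bound having a prescribed radical. It uses only the fact that any number with a given radical must be a multiple of that radical.

```python
def radical(m):
    """Product of the distinct primes dividing m (with radical(1) == 1)."""
    rad, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            rad *= p
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:  # whatever remains is a single prime factor
        rad *= m
    return rad


def count_with_radical(A, N):
    """Number of positive integers less than N whose radical is A (brute force)."""
    # Any integer with radical A is a multiple of A, so only multiples are tested.
    return sum(1 for m in range(A, N, A) if radical(m) == A)


if __name__ == "__main__":
    for N in (10**2, 10**4, 10**6):
        print(N, count_with_radical(6, N), count_with_radical(30, N))
```

For instance, the integers below 100 with radical 6 are 6, 12, 18, 24, 36, 48, 54, 72 and 96, so the count is 9; the counts should grow only like a power of the logarithm of the bound, which is consistent with the lemma.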

Now we can run the probabilistic heuristics. After selecting $N$ as a large power of two in Step 1, we have $O(N^{\alpha+\beta+\gamma})$ choices for $A, B, C$ in Step 2. By the above lemma, for each $A, B, C$, there are $N^{o(1)}$ choices for $a, b, c$ that are of magnitude $O(N)$, and assuming (in the case when $a, b, c$ are coprime) that there is no prior correlation between $a+b$ and $c$, which we think of as being randomly distributed amongst the numbers of size $O(N)$, the probability that $a + b = c$ should be of order $1/N$. This leads to a total probability of $N^{\alpha+\beta+\gamma-1+o(1)}$; since $\alpha+\beta+\gamma < 1$, this sum is convergent (for $N$ ranging over powers of $2$), so we expect only finitely many counterexamples, thus supporting the ABC conjecture.
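One can also look for near-counterexamples directly. The sketch below (hypothetical function names, exhaustive search rather than the random sampling described above) lists the coprime triples with $a + b = c$ below a bound for which the radical of $abc$ is smaller than $c$, together with the ratio $\log c / \log \operatorname{rad}(abc)$, which is a convenient way to measure how close a triple comes to violating the conjecture.

```python
from math import gcd, log


def radical_table(limit):
    """rad[m] for 0 <= m < limit, computed with a sieve (rad[0] is unused)."""
    rad = [1] * limit
    for p in range(2, limit):
        if rad[p] == 1:  # no smaller prime divides p, so p is prime
            for multiple in range(p, limit, p):
                rad[multiple] *= p
    return rad


def abc_hits(limit):
    """Coprime triples (a, b, c) with a < b, a + b = c < limit and rad(abc) < c."""
    rad = radical_table(limit)
    hits = []
    for c in range(3, limit):
        for a in range(1, (c + 1) // 2):  # enforces a < b = c - a
            b = c - a
            # For coprime a, b, c one has rad(abc) = rad(a) * rad(b) * rad(c).
            if gcd(a, b) == 1 and rad[a] * rad[b] * rad[c] < c:
                hits.append((a, b, c))
    return hits


def quality(a, b, c, rad):
    """The ratio log(c) / log(rad(abc)) for a coprime triple with a + b = c."""
    return log(c) / log(rad[a] * rad[b] * rad[c])


if __name__ == "__main__":
    limit = 1000
    rad = radical_table(limit)
    for a, b, c in abc_hits(limit):
        print(a, b, c, round(quality(a, b, c, rad), 4))
```

Such triples do occur (for example $1 + 8 = 9$ and $5 + 27 = 32$), but they are sparse, and one would expect the ratio to exceed any fixed threshold above $1$ only finitely often if the conjecture holds.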

Remark 1 In the case of $x^3+y^3=z^3$, the naive heuristic prediction was incorrect, basically because of the algebraic structure in the Fermat curve $\{ (x,y,z) : x^3+y^3=z^3 \}$ (which, among other things, is an elliptic curve and thus enjoys a group law). In the case of the general $a+b=c$ equation, though, no analogous algebraic structure appears to be present. So there is no obvious correlation here. (Of course, this does not rule out the possibility of a much less obvious correlation; this is one of the reasons why the above arguments are only heuristic, and fall well short of a rigorous proof of anything.)