Experienced mathematicians warn up-and-comers to stay away from the Collatz conjecture. It’s a siren song, they say: Fall under its trance and you may never do meaningful work again.

The Collatz conjecture is quite possibly the simplest unsolved problem in mathematics — which is exactly what makes it so treacherously alluring.

“This is a really dangerous problem. People become obsessed with it and it really is impossible,” said Jeffrey Lagarias, a mathematician at the University of Michigan and an expert on the Collatz conjecture.

Earlier this year one of the top mathematicians in the world dared to confront the problem — and came away with one of the most significant results on the Collatz conjecture in decades.

On September 8, Terence Tao posted a proof showing that — at the very least — the Collatz conjecture is “almost” true for “almost” all numbers. While Tao’s result is not a full proof of the conjecture, it is a major advance on a problem that doesn’t give up its secrets easily.

“I wasn’t expecting to solve this problem completely,” said Tao, a mathematician at the University of California, Los Angeles. “But what I did was more than I expected.”

The Collatz Conundrum

Lothar Collatz likely posed the eponymous conjecture in the 1930s. The problem sounds like a party trick. Pick a number, any number. If it’s odd, multiply it by 3 and add 1. If it’s even, divide it by 2. Now you have a new number. Apply the same rules to the new number. The conjecture is about what happens as you keep repeating the process.

Intuition might suggest that the number you start with affects the number you end up with. Maybe some numbers eventually spiral all the way down to 1. Maybe others go marching off to infinity.

But Collatz predicted that’s not the case. He conjectured that if you start with a positive whole number and run this process long enough, all starting values will lead to 1. And once you hit 1, the rules of the Collatz conjecture confine you to a loop: 1, 4, 2, 1, 4, 2, 1, on and on forever.

Over the years, many problem solvers have been drawn to the beguiling simplicity of the Collatz conjecture, or the “3x + 1 problem,” as it’s also known. Mathematicians have tested quintillions of examples (that’s 18 zeros) without finding a single exception to Collatz’s prediction. You can even try a few examples yourself with any of the many “Collatz calculators” online. The internet is awash in unfounded amateur proofs that claim to have resolved the problem one way or the other.

“You just need to know multiplying by 3 and dividing by 2 and you can start playing around with it right away. It’s very tempting to try,” said Marc Chamberland, a mathematician at Grinnell College who produced a popular YouTube video on the problem called “The Simplest Impossible Problem.”

But legitimate proofs are rare.

In the 1970s, mathematicians showed that almost all Collatz sequences — the list of numbers you get as you repeat the process — eventually reach a number that’s smaller than where you started — weak evidence, but evidence nonetheless, that almost all Collatz sequences incline toward 1. From 1994 until Tao’s result this year, Ivan Korec held the record for showing just how much smaller these numbers get. Other results have similarly picked at the problem without coming close to addressing the core concern.

“We really don’t understand the Collatz question well at all, so there hasn’t been much significant work on it,” said Kannan Soundararajan, a mathematician at Stanford University who has worked on the conjecture.

The futility of these efforts has led many mathematicians to conclude that the conjecture is simply beyond the reach of current understanding — and that they’re better off spending their research time elsewhere.

“Collatz is a notoriously difficult problem — so much so that mathematicians tend to preface every discussion of it with a warning not to waste time working on it,” said Joshua Cooper of the University of South Carolina in an email.

An Unexpected Tip

Lagarias first became intrigued by the conjecture as a student at least 40 years ago. For decades he has served as the unofficial curator of all things Collatz. He’s amassed a library of papers related to the problem, and in 2010 he published some of them as a book titled The Ultimate Challenge: The 3x + 1 Problem.

“Now I know lots more about the problem, and I’d say it’s still impossible,” Lagarias said.

Tao doesn’t normally spend time on impossible problems. In 2006 he won the Fields Medal, math’s highest honor, and he is widely regarded as one of the top mathematicians of his generation. He’s used to solving problems, not chasing pipe dreams.

“It’s actually an occupational hazard when you’re a mathematician,” he said. “You could get obsessed with these big famous problems that are way beyond anyone’s ability to touch, and you can waste a lot of time.”

But Tao doesn’t entirely resist the great temptations of his field. Every year, he tries his luck for a day or two on one of math’s famous unsolved problems. Over the years, he’s made a few attempts at solving the Collatz conjecture, to no avail.

Then this past August an anonymous reader left a comment on Tao’s blog. The commenter suggested trying to solve the Collatz conjecture for “almost all” numbers, rather than trying to solve it completely.

“I didn’t reply, but it did get me thinking about the problem again,” Tao said.

And what he realized was that the Collatz conjecture was similar, in a way, to the types of equations — called partial differential equations — that have featured in some of the most significant results of his career.

Inputs and Outputs

Partial differential equations, or PDEs, can be used to model many of the most fundamental physical processes in the universe, like the evolution of a fluid or the ripple of gravity through space-time. They arise in situations where the future position of a system — like the state of a pond five seconds after you’ve thrown a rock into it — depends on contributions from two or more factors, like the water’s viscosity and velocity.

Complicated PDEs wouldn’t seem to have much to do with a simple question about arithmetic like the Collatz conjecture.

But Tao realized there was something similar about them. With a PDE, you plug in some values, get other values out, and repeat the process — all to understand that future state of the system. For any given PDE, mathematicians want to know if some starting values eventually lead to infinite values as an output or whether an equation always yields finite values, regardless of the values you start with.

For Tao, this goal had the same flavor as investigating whether you always eventually get the same number (1) from the Collatz process no matter what number you feed in. As a result, he recognized that techniques for studying PDEs could apply to the Collatz conjecture.

One particularly useful technique involves a statistical way of studying the long-term behavior of a small number of starting values (like a small number of initial configurations of the water in a pond) and extrapolating from there to the long-term behavior of all possible starting configurations of the pond.

In the context of the Collatz conjecture, imagine starting with a large sample of numbers. Your goal is to study how these numbers behave when you apply the Collatz process. If close to 100% of the numbers in the sample end up either exactly at 1 or very close to 1, you might conclude that almost all numbers behave the same way.

But for the conclusion to be valid, you’d have to construct your sample very carefully. The challenge is akin to generating a sample of voters in a presidential poll. To extrapolate accurately from the poll to the population as a whole, you’d need to weight the sample with the correct proportion of Republicans and Democrats, women and men, and so on.

Numbers have their own “demographic” characteristics. There are odd and even numbers, of course, and numbers that are multiples of 3, and numbers that differ from each other in even subtler ways. When you construct a sample of numbers, you can weight it toward containing certain kinds of numbers and not others — and the better you choose your weights, the more accurately you’ll be able to draw conclusions about numbers as a whole.

Weighty Choices

Tao’s challenge was much harder than just figuring out how to create an initial sample of numbers with the proper weights. At each step in the Collatz process, the numbers you’re working with change. One obvious change is that almost all numbers in the sample get smaller.

Another, maybe less obvious change is that the numbers might start to clump together. For example, you could begin with a nice, uniform distribution like the numbers from 1 to 1 million. But five Collatz iterations later, the numbers are likely to be concentrated in a few small intervals on the number line. In other words, you may start out with a good sample, but five steps later it’s hopelessly skewed.

“Ordinarily one would expect the distribution after the iteration to be completely different from the one you started with,” said Tao in an email.

Tao’s key insight was figuring out how to choose a sample of numbers that largely retains its original weights throughout the Collatz process.

For example, Tao’s starting sample is weighted to contain no multiples of 3, since the Collatz process quickly weeds out multiples of 3 anyway. Some of the other weights Tao came up with are more complicated. He weights his starting sample toward numbers that have a remainder of 1 after being divided by 3, and away from numbers that have a remainder of 2 after being divided by 3.

The result is that the sample Tao starts with maintains its character even as the Collatz process proceeds.

“He found some way to continue this process further, so that after some number of steps you still know what’s going on,” Soundararajan said. “When I first saw the paper, I was very excited and thought that it was very striking.”

Tao used this weighting technique to prove that almost all Collatz starting values — 99% or more — eventually reach a value that is quite close to 1. This allowed him to draw conclusions along the lines of 99% of starting values greater than 1 quadrillion eventually reach a value below 200.

It is arguably the strongest result in the long history of the conjecture.

“It’s a great advance in our knowledge of what’s happening on this problem,” said Lagarias. “It’s certainly the best result in a very long time.”

Tao’s method is almost certainly incapable of getting all the way to a full proof of the Collatz conjecture. The reason is that his starting sample still skews a little after each step in the process. The skewing is minimal as long as the sample still contains many different values that are far from 1. But as the Collatz process continues and the numbers in the sample draw closer to 1, the small skewing effect becomes more and more pronounced — the same way that a slight miscalculation in a poll doesn’t matter much when the sample size is large but has an outsize effect when the sample size is small.

Any proof of the full conjecture would likely depend on a different approach. As a result, Tao’s work is both a triumph and a warning to the Collatz curious: Just when you think you might have cornered the problem, it slips away.

“You can get as close as you want to the Collatz conjecture, but it’s still out of reach,” Tao said.

This article was reprinted on Wired.com and in Spanish at Investigacionyciencia.es.