This semester I’m teaching a Discrete Mathematics course. Recently, I assigned my students a homework problem from the textbook that asked them to prove that the binary $\min$ operator on the real numbers is associative, that is, for all real numbers $a$, $b$, and $c$,

$$\min(a, \min(b, c)) = \min(\min(a, b), c).$$

You might like to pause for a minute to think about how you would prove this! Of course, how you prove it depends on how you define $\min$, so you might like to think about that too.

The book expected them to do a proof by cases, with some sort of case split on the order of $a$, $b$, and $c$. What they turned in was mostly pretty good, actually, but while grading it I became disgusted with the whole thing and thought there has to be a better way.

I was reminded of an example of Dijkstra’s that I remember reading. So I asked myself—what would Dijkstra do? The thing I remember reading may have, in fact, been this exact proof, but I couldn’t remember any details and I still can’t find it now, so I had to (re-)work out the details, guided only by some vague intuitions.

Dijkstra would certainly advocate proving associativity of $\min$ using a calculational approach. Dijkstra would also advocate using a symmetric infix operator symbol for a commutative and associative operation, so let’s adopt the symbol $\downarrow$ for $\min$. ($\sqcap$ would also be a reasonable choice, though I find it less mnemonic.)

How can we calculate with $\downarrow$? We have to come up with some way to characterize it that allows us to transform expressions involving $\downarrow$ into something else more fundamental. The most obvious definition would be “$a \downarrow b = a$ if $a \leq b$, and $b$ otherwise”. However, although this is a fantastic implementation of $\downarrow$ if you actually want to run it, it is not so great for reasoning about $\downarrow$, precisely because it involves doing a case split on whether $a \leq b$. This is the definition that leads to the ugly proof by cases.
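To make the case-split definition concrete, here is a minimal Python sketch of it. The name `min2` and the sample values are my own, not from the textbook. Since the definition only ever compares its arguments, trying every triple drawn from three distinct values covers every relative ordering of the three arguments, ties included. That is a finite check mirroring the case analysis, not a substitute for the proof.

```python
from itertools import product

def min2(a, b):
    """The case-split definition: a if a <= b, and b otherwise."""
    return a if a <= b else b

# Hypothetical sample values: three distinct numbers are enough to
# realise every relative ordering of three arguments, ties included.
values = [1, 2, 3]
triples = list(product(values, repeat=3))

# Check associativity on every triple.
assoc_holds = all(
    min2(a, min2(b, c)) == min2(min2(a, b), c)
    for a, b, c in triples
)
print(len(triples), assoc_holds)
```

Each triple stands in for one branch of the proof by cases, which gives a feel for how much bookkeeping that proof has to carry.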

How else could we define it? The usual more mathematically sophisticated way to define it would be as a greatest lower bound, that is, “$x = a \downarrow b$ if and only if $x \leq a$ and $x \leq b$ and $x$ is the greatest such number, that is, for any other $y$ such that $y \leq a$ and $y \leq b$, we have $y \leq x$.” However, this is a bit roundabout and also not so conducive to calculation.

My first epiphany was that the best way to characterize $\downarrow$ is by its relationship to $\leq$. After one or two abortive attempts, I hit upon the right idea:

$$z \leq a \downarrow b \iff z \leq a \land z \leq b$$

That is, an arbitrary $z$ is less than or equal to the minimum of $a$ and $b$ precisely when it is less than or equal to both. In fact, this completely characterizes $\downarrow$, and is equivalent to the second definition given above. (You should try convincing yourself of this!)
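As a sanity check on this characterization, the following sketch tests it against the case-split definition on a small grid of rationals. The grid and the helper name `min2` are assumptions of mine, chosen only for illustration. A finite grid obviously proves nothing about all real numbers, but it is a quick way to catch a misstated property.

```python
from fractions import Fraction
from itertools import product

def min2(a, b):
    """Case-split definition of the minimum of two numbers."""
    return a if a <= b else b

# Hypothetical sample grid: the quarters from -1 to 1.
grid = [Fraction(n, 4) for n in range(-4, 5)]

# z is below the minimum of a and b exactly when it is below both.
characterization_holds = all(
    (z <= min2(a, b)) == (z <= a and z <= b)
    for z, a, b in product(grid, repeat=3)
)
print(characterization_holds)
```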

But how do we get anywhere from $a \downarrow (b \downarrow c)$ by itself? We need to somehow introduce a thing which is less than or equal to it, so we can apply our characterization. My second epiphany was that equality of real numbers can also be characterized by having the same “downsets”, i.e. two real numbers are equal if and only if the sets of real numbers less than or equal to them are the same. That is,

$$x = y \iff (\forall z.\; z \leq x \iff z \leq y)$$

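Now the proof almost writes itself. Let $z$ be arbitrary; we calculate as follows:

$$\begin{array}{cl}
 & z \leq a \downarrow (b \downarrow c) \\
\iff & z \leq a \land z \leq b \downarrow c \\
\iff & z \leq a \land (z \leq b \land z \leq c) \\
\iff & (z \leq a \land z \leq b) \land z \leq c \\
\iff & z \leq a \downarrow b \land z \leq c \\
\iff & z \leq (a \downarrow b) \downarrow c
\end{array}$$

Of course this uses our characterization of $\downarrow$ via its relationship to $\leq$, along with the fact that $\land$ is associative. Since we have proven that $z \leq a \downarrow (b \downarrow c)$ if and only if $z \leq (a \downarrow b) \downarrow c$ for arbitrary $z$, we may conclude that $a \downarrow (b \downarrow c) = (a \downarrow b) \downarrow c$.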
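For readers who like to see a calculation run, here is a rough sketch that evaluates each line of the chain of equivalences as a boolean and confirms that all six lines agree on a small integer grid. It also checks the downset characterization of equality on the same grid. The names `min2`, `steps`, and `downset` and the grid itself are hypothetical, and this is only a finite spot check of the steps, not a replacement for the proof.

```python
from itertools import product

def min2(a, b):
    """Case-split definition of the minimum of two numbers."""
    return a if a <= b else b

def steps(z, a, b, c):
    """Truth value of each line of the calculation, top to bottom."""
    return [
        z <= min2(a, min2(b, c)),
        z <= a and z <= min2(b, c),
        z <= a and (z <= b and z <= c),
        (z <= a and z <= b) and z <= c,
        z <= min2(a, b) and z <= c,
        z <= min2(min2(a, b), c),
    ]

# Hypothetical sample grid: the integers from -2 to 2.
grid = range(-2, 3)

# Every line of the chain has the same truth value.
chain_ok = all(
    len(set(steps(z, a, b, c))) == 1
    for z, a, b, c in product(grid, repeat=4)
)

def downset(x):
    """Grid elements less than or equal to x."""
    return {z for z in grid if z <= x}

# Two grid elements are equal exactly when their downsets coincide.
downsets_ok = all(
    (downset(x) == downset(y)) == (x == y)
    for x, y in product(grid, repeat=2)
)
print(chain_ok, downsets_ok)
```

Writing each line as its own expression keeps the code in step with the calculation: the first and fifth transitions use the characterization of the minimum, and the middle one is just associativity of `and`.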