Give Me A Lever



The role of levers in finding mathematical proofs



Archimedes of Syracuse (c. 287 BC to c. 212 BC) was a Greek mathematician, whose insights led him to many inventions and practical ideas, and to work in physics and astronomy.

Today I wish to talk about the mathematical equivalent of the lever.



Archimedes invented many things, but not the lever, which may have been invented by Archytas of Tarentum. Archimedes is famous for the quote:

Give me a place to stand on, and I will move the Earth.

Of course what he meant was that, in principle, with a large enough lever and a place to stand, one would have the strength to move even something as heavy as the Earth. This is pretty impressive given that the Earth’s mass is about $5.97 \times 10^{24}$ kg. His standing place would have needed to be somewhere beyond the Andromeda Galaxy (see here for some simple estimates), but his mathematical proof of possibility needed no galactic figures.
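As a rough check of the Andromeda claim, here is a minimal sketch of the lever arithmetic. It takes the Earth's mass as about 5.97e24 kg and treats the Earth as a load in uniform gravity, which is only a thought experiment. The 70 kg person, the 1 m load arm, and the function name are my assumptions for illustration, not figures from the estimates linked above.

```python
# Law of the lever: load_mass * load_arm = effort_mass * effort_arm.
EARTH_MASS_KG = 5.97e24          # approximate mass of the Earth
PERSON_MASS_KG = 70.0            # assumed: the weight Archimedes can apply
LOAD_ARM_M = 1.0                 # assumed: the Earth hangs 1 m from the fulcrum
METERS_PER_LIGHT_YEAR = 9.461e15
ANDROMEDA_DISTANCE_LY = 2.5e6    # approximate distance to the Andromeda Galaxy


def effort_arm_m(load_mass, load_arm, effort_mass):
    """Length of the effort arm that balances the load on the other side."""
    return load_mass * load_arm / effort_mass


arm_m = effort_arm_m(EARTH_MASS_KG, LOAD_ARM_M, PERSON_MASS_KG)
arm_ly = arm_m / METERS_PER_LIGHT_YEAR
print(f"effort arm: {arm_m:.2e} m, about {arm_ly:.1e} light-years")
print("beyond Andromeda:", arm_ly > ANDROMEDA_DISTANCE_LY)
```

Under these assumptions the arm comes out at several million light-years; a different load arm or applied weight scales the answer proportionally.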

Levers

I was just recently in Ann Arbor at the Coding, Complexity and Sparsity Workshop, which was organized by Anna Gilbert, Martin Strauss, Atri Rudra, Hung Ngo, Ely Porat, and Muthu Muthukrishnan. It was a wonderful experience, and I am planning shortly to discuss some of the great talks that were given there. The workshop website should soon have the talks available so that you can at least see the slides, although there is nothing like being in the room listening to a good talk.

I realized during the workshop that several of the talks, though not all, had essentially used a lever to solve their particular problem. In this sense a lever is some trick, insight, or idea that allows one to make a start on solving a problem. It is not the full solution; it is not an essential idea. There may be ways to solve the problem without the lever, but the lever does allow one to say, paraphrasing Archimedes: Give me a place to start, and I will prove the theorem.

One thing that I realized too was that papers and even talks often gloss over any lever. For some reason the writers and speakers do not think it is worth making a big deal about it. One possible reason is that to them, who are likely experts in the area, the lever seems so simple—why state it explicitly? Another reason is that the lever often is a pretty simple idea, so why make a big deal out of it? Often other parts of the proof are much harder and more technical.

An example of a trivial lever is a transformation that is used often in analysis. Suppose one needs to prove something about a function $f(x)$. Replace the function by

$$g(x) = f(x) - f(0).$$

Often this change does not affect the theorem being proved, but now $g(0) = 0$, which could dramatically reduce the number of cases that are needed later in the proof. It is a lever in my sense.

Archimedes’ Lever

It seems that Archimedes himself actually used the lever as a proof lever in my sense. Or rather he used the lever to convince himself something was true, and then generated a proof by other means.

Archimedes is known for estimating or computing areas and volumes by the older method of exhaustion, which sandwiches the object being analyzed between simpler shapes, computes their areas or volumes, and uses them for upper and lower bounds or to demonstrate convergence. For example, by wrapping a sphere of radius $r$ in progressively tighter polygonal bounds, one can prove the value $\frac{4}{3}\pi r^3$ for the volume. However, it seems this was not his idea of first resort.

According to scholarship summarized here, Archimedes instead considered a cone of height $2r$ and base radius $2r$. He found that for each $x$, $0 \le x \le 2r$, he could hang a slice of the cone and a slice of the sphere whose areas added up to $2\pi r x$ at distance $2r$ on one side of the lever. These would balance a circle of radius $2r$ hung at distance $x$ on the other side of the lever. The circles formed a cylinder of height $2r$ whose center of gravity was at distance $r$ on the other side. Since all the mass on the near side was at distance $2r$, twice as far from the fulcrum, the near side had to have half the mass, and hence half the volume, of the cylinder. Since Archimedes knew the cylinder had volume $8\pi r^3$ and the cone had volume $\frac{8}{3}\pi r^3$, he obtained volume $4\pi r^3 - \frac{8}{3}\pi r^3 = \frac{4}{3}\pi r^3$ for the sphere.
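The balance argument can be checked numerically. The sketch below assumes the standard setup: a sphere of radius r, a cone of height 2r and base radius 2r with its apex at the fulcrum, and a cylinder of radius 2r and height 2r. The function names are mine, chosen only for illustration.

```python
import math


def slice_moments(r, x):
    """Return (near-side moment, far-side moment) for the slices at position x."""
    sphere_area = math.pi * (2 * r * x - x * x)  # sphere slice: radius^2 = x(2r - x)
    cone_area = math.pi * x * x                  # cone slice has radius x
    cylinder_area = math.pi * (2 * r) ** 2       # cylinder slice has radius 2r
    near = (sphere_area + cone_area) * (2 * r)   # both slices hung at distance 2r
    far = cylinder_area * x                      # cylinder slice hung at distance x
    return near, far


def sphere_volume_by_lever(r):
    """Volume of the sphere obtained from the balance of the three solids."""
    cylinder = math.pi * (2 * r) ** 2 * (2 * r)  # 8 * pi * r^3
    cone = cylinder / 3                          # (8/3) * pi * r^3
    # Cylinder acts at distance r; sphere plus cone act at distance 2r.
    near_total = cylinder * r / (2 * r)
    return near_total - cone


r = 1.5
for x in (0.2, 1.0, 2.9):
    near, far = slice_moments(r, x)
    print(f"x={x}: near={near:.6f}, far={far:.6f}")
print(sphere_volume_by_lever(r), 4 / 3 * math.pi * r ** 3)
```

Each slice balances exactly, and the total gives the familiar formula; the code only verifies the arithmetic and says nothing about how Archimedes justified summing the slices.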

Archimedes was so enchanted by this that he had a cylinder and a sphere placed on his gravestone. We believe he felt that his infinitesimal slices presaged a new way of calculating—which Newton and Leibniz turned into the calculus 1,800 years later. Perhaps he even perceived the singular difference between his foliations of the sphere and cone on one side, and the cylinder on the other side.

Two Other Simple Examples

Maximum-likelihood estimation often involves setting parameters to maximize the probability of a series of independent events. Since the events are independent, this is just the product

$$P = \prod_{i=1}^{n} p_i$$

of the probabilities $p_i$ of the events. With many events, $P$ may be a very small number, so that numerical accuracy becomes a problem, and differentiating this formula to find a maximum may also consume much effort. However, since the logarithm is a continuous strictly increasing function in the range of these probabilities, it is equivalent to maximize the log of this product. If one wishes to keep all quantities non-negative, one can instead minimize $-\log P = -\sum_{i=1}^{n} \log p_i$.

This preserves numerical accuracy and is easier to differentiate.
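Here is a minimal sketch of why the logarithm helps, using only the Python standard library. The coin-flip data, the grid search, and the function names are illustrative assumptions of mine, not taken from any particular estimation package.

```python
import math


def likelihood(probs):
    """Direct product of the probabilities of independent events."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def negative_log_likelihood(probs):
    """Sum of -log p_i; same optimum as the product, but numerically stable."""
    return -sum(math.log(p) for p in probs)


# 500 events of probability 0.01: the product is too small for a double.
probs = [0.01] * 500
print(likelihood(probs))               # underflows to 0.0
print(negative_log_likelihood(probs))  # an ordinary finite number


def coin_nll(theta, heads, flips):
    """Negative log-likelihood of a coin with P(heads) = theta."""
    return -(heads * math.log(theta) + (flips - heads) * math.log(1 - theta))


# Minimizing the negative log-likelihood over a grid recovers heads / flips.
heads, flips = 300, 1000
grid = [i / 1000 for i in range(1, 1000)]
best_theta = min(grid, key=lambda t: coin_nll(t, heads, flips))
print(best_theta)
```

The grid search is only there to show that the minimizer of the sum of logs is the same parameter one would get from the product; in practice one would differentiate the sum or hand it to an optimizer.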

The paper “How Powerful are Random Strings” by Eric Allender, Luke Friedman, and William Gasarch, which we featured here, also starts with a lever. Instead of using the random-string set directly as an oracle, they create a different oracle out of the overgraph of a related entropy function. They show that the two oracles are equivalent for the complexity reductions used in their main theorem, but find the overgraph-oracle easier to analyze. As usual see their paper for details.

A Nontrivial Example

I would like to give a beautiful example of a lever from one of the talks at the workshop. David Woodruff spoke on “$(1+\epsilon)$-Approximate Sparse Recovery,” which is based on joint work with Eric Price and will appear this fall at FOCS 2011.

The problem is to discover a $k$-sparse vector that approximates a given signal. Here $k$-sparse means that the vector has at most $k$ non-zero coordinates. This is a major problem in the area of approximation and compressed sensing. I will not try to even begin to survey this huge body of work.

David at the beginning of his talk said: let’s consider just the special case of $k = 1$. This was his lever. He did not make any fuss about the lever, and he proceeded to use this lever in both upper bound and lower bound results. He did point out that the ability to solve the case where $k = 1$, where there is one large signal, is easily justified. One can take the input vector

$$x = (x_1, \dots, x_n)$$

and use random sampling to divide the signal up into pieces. Likely each piece will have one big signal; that is, it is likely that $k = 1$ will hold for it. Then one could use the $k = 1$ algorithm on each piece to find all the large signals. There is a bit more needed to make this reduction work for the upper bound, but the idea is fairly simple. A great lever.
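The splitting step can be sketched in a few lines. This is only an illustration of the reduction idea, not the algorithm from the talk or the paper: the use of k squared buckets, the function names, and the toy sizes are all my assumptions.

```python
import random


def split_into_buckets(n, num_buckets, rng):
    """Assign each coordinate index 0..n-1 to a uniformly random bucket."""
    buckets = [[] for _ in range(num_buckets)]
    for i in range(n):
        buckets[rng.randrange(num_buckets)].append(i)
    return buckets


def isolated_heavy(buckets, heavy):
    """Return the heavy indices that are the only heavy index in their bucket."""
    isolated = []
    for bucket in buckets:
        hits = [i for i in bucket if i in heavy]
        if len(hits) == 1:
            isolated.append(hits[0])
    return isolated


rng = random.Random(0)
n, k = 10_000, 10
heavy = set(rng.sample(range(n), k))  # hypothetical positions of the large signals
buckets = split_into_buckets(n, num_buckets=k * k, rng=rng)
found = isolated_heavy(buckets, heavy)
print(f"{len(found)} of {k} heavy coordinates are alone in their bucket")
```

With k squared buckets, a union bound over the pairs of heavy coordinates shows that some two of them share a bucket with probability less than 1/2, so most runs leave every piece looking like a one-signal instance. Repeating the split with fresh randomness drives the failure probability down further.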

Once David uses this lever, all his calculations after that are much simpler. There is a fair bit of estimation and employment of inequalities needed to make the recovery algorithm work correctly. These calculations are much easier to follow, and probably were easier to discover in the first place, by restricting the situation to $k = 1$.

This Lever Extends

David, in his talk, also outlined several lower bound theorems. Each was for a different version of the recovery problem, since there were several parameter regimes and models.

Lower bounds are hard to prove in general, and we have so few of them. But in this area of signal recovery a general paradigm is available, since the lower bounds essentially count how many measurements are needed to solve a recovery problem. This sounds like information theory or communication complexity theory, and it is. David shows how to use known, often very deep, results from both areas to prove that too few measurements would contradict those results.

Again the lever comes to the rescue. The proof for the case $k = 1$ is often not too difficult, which is not to say easy. Then to prove the general case one must be careful. Lower bounds on computing one object can change when we are trying to compute several objects. But the lever helps point out that this obstacle must be addressed. David uses some communication product theorems (again, some are quite powerful and deep) to prove his lower bounds.

Open Problems

What are some of your favorite levers? Is this notion helpful to make explicit?