Let’s talk about a small question as a way of introducing a big question.

How thick is the atmosphere?

How far does Earth’s atmosphere extend into space? In other words, how high can you go in altitude before you start to have difficulty breathing, or your bag of chips explodes, or you need to wear extra sunscreen to protect your skin from UV damage?

You probably have a good guess for the answer to these questions: it’s something like a few miles of altitude. I personally notice that my skin burns pretty quickly above ~10,000 feet (about 2 miles or 3 km), and breathing is noticeably difficult above 14,000 feet even when I’m standing still.

Of course, technically the atmosphere extends way past 2-3 miles. There are rare air molecules from Earth extending deep into space, becoming ever more sparse as you move away from the planet. But there’s clearly a “typical thickness” of the atmosphere that is on the order of a few miles. Altitude changes that are much smaller in magnitude aren’t noticeable, and altitude changes that are much larger give you a much thinner atmosphere.

What physical principle determines this few-mile thickness?

At a conceptual level, this is actually a pretty simple problem of balancing kinetic and potential energy. Imagine following the trajectory of a single air molecule (say, an oxygen molecule) for a long time. This molecule moves in a sort of random trajectory, buffeted about by other air molecules, and it rises and falls in altitude. As it does so, it trades some of its kinetic energy for gravitational potential energy when it rises, and then trades that potential back for kinetic energy when it falls. If you average the kinetic and potential energy of the molecule over a long time, you’ll find that they are similar in magnitude, in just the same way that they would be for a ball that bounces up and down over and over again.

There is actually an important and precise statement of this equality, called the virial theorem, which in our case says that

$\langle KE \rangle = \frac{1}{2} \langle PE \rangle$,

where $\langle KE \rangle$ is the average kinetic energy in the vertical direction and $\langle PE \rangle$ is the average potential energy.
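To make the bouncing-ball comparison concrete, here is a small sketch that time-averages the kinetic and potential energy of an ideal, lossless ball falling under constant gravity. The function name, drop height, and mass are my own illustrative choices, not anything from a particular library. For this linear potential the averages come out with kinetic energy equal to half the potential energy.

```python
def bouncing_ball_averages(drop_height=1.0, mass=1.0, g=9.81, steps=100_000):
    """Time-average KE and PE over one fall of an ideal (lossless) bouncing ball.

    The rise is the mirror image of the fall, so averaging over the fall alone
    gives the same result as averaging over many full bounces.
    """
    fall_time = (2.0 * drop_height / g) ** 0.5
    dt = fall_time / steps
    ke_sum = 0.0
    pe_sum = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt                  # midpoint of each time slice
        v = g * t                           # speed after falling for time t
        z = drop_height - 0.5 * g * t * t   # height above the floor
        ke_sum += 0.5 * mass * v * v
        pe_sum += mass * g * z
    return ke_sum / steps, pe_sum / steps

avg_ke, avg_pe = bouncing_ball_averages()
print(f"<KE>/<PE> = {avg_ke / avg_pe:.4f}")   # close to 0.5
```

The ratio does not depend on the drop height or the mass, which is the sense in which the result is a general statement about motion in a constant gravitational field.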

The gravitational potential energy of a particle of mass $m$ at height $h$ is just

$PE = m g h$,

where $g$ is the acceleration due to gravity, and the typical kinetic energy of the air molecule is related to the temperature (this is, in fact, the definition of temperature):

$\langle KE \rangle = \frac{1}{2} k_B T$,

where $k_B$ is Boltzmann’s constant and $T$ is the absolute temperature (i.e., measured from absolute zero). On the earth’s surface, $k_B T$ is about 25 milli-electronvolts, or $4 \times 10^{-21}$ Joules.

Using these equations to solve for $h$ gives $h = k_B T / (m g)$, which is about 5 miles.
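If you want to check the arithmetic, here is a minimal sketch of the estimate. The inputs are my assumptions: a temperature of 290 K (which puts $k_B T$ near 25 meV) and an oxygen molecule of mass 32 atomic mass units. The function name is just for illustration.

```python
# Physical constants (rounded)
K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053907e-27      # atomic mass unit, kg
G = 9.81                  # gravitational acceleration, m/s^2

def scale_height(temperature_k, mass_amu):
    """Return the typical thickness h = k_B T / (m g) in meters."""
    mass_kg = mass_amu * AMU
    return K_B * temperature_k / (mass_kg * G)

# Assumed inputs: T = 290 K, O2 molecule with mass 32 amu.
h_m = scale_height(290.0, 32.0)
h_miles = h_m / 1609.344
print(f"h = {h_m / 1000:.1f} km = {h_miles:.1f} miles")
```

With these inputs the result lands a little under 8 km, which is the "about 5 miles" quoted above. A lighter molecule such as nitrogen (28 amu) gives a slightly larger value.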

Everything makes sense so far, but let’s ask a more interesting question: What is the function that describes how the thickness of the atmosphere decays with altitude? In other words, what is the probability density $p(z)$ for a given air molecule to be at altitude $z$?

Let’s take a God-like perspective on this question [insert joke here about typical physicist arrogance]. Imagine that you could choose some function from the space of all possible functions, and in order to make your choice you must first ask the question: which function is best?

“Best” may seem like a completely subjective word, but in physics we often have optimization principles that let us define the “best solution” in a very specific way. In this case, the best solution is the one with the highest entropy. Remember that saying “this state has maximum entropy” literally means “this state is the one with the most possible ways of happening”. So what we are really searching for is the function that is most probable to appear from a random process.

The entropy $S$ of a probability distribution $p(z)$ is

$S = -k_B \int_0^\infty p(z) \ln p(z) \, dz$.

This is a generalization of the Boltzmann entropy formula (which is a sufficiently big deal that Boltzmann had it engraved on his tombstone).

Now, there are two relevant constraints on the function $p(z)$. First, it must be normalized:

$\int_0^\infty p(z) \, dz = 1$.

Otherwise, it wouldn’t be a proper probability distribution.

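Second, the distribution must correspond to a finite average energy. In particular, the average potential energy of an air molecule must be $k_B T$ (twice the average vertical kinetic energy, by the virial theorem). Since the energy of a molecule with altitude $z$ is $m g z$, we have the second constraint

$\int_0^\infty m g z \, p(z) \, dz = k_B T$.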

Now, for those of you who read the previous post, this kind of problem should start to look familiar. To recap, we want a function $p(z)$ that maximizes some quantity $S$ and is subject to two constraints.

This is a job for Lagrange multipliers!

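To optimize the quantity $S$ using Lagrange multipliers, we start by writing the Lagrange function

$\Lambda = S - \lambda_1 \left[ \int_0^\infty p(z) \, dz - 1 \right] - \lambda_2 \left[ \int_0^\infty m g z \, p(z) \, dz - k_B T \right]$.

Here, the two quantities in brackets represent the constraints. Putting in the expression for $S$ and then taking the derivative with respect to $p(z)$ and setting it equal to zero gives

$-k_B \ln p(z) - k_B - \lambda_1 - \lambda_2 m g z = 0$.

Since $p$ appears only in a logarithm, rearranging and solving for $p$ gives something like

$p(z) = (\text{const.}) \times e^{-(\text{const.}) \times z}$.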

Now we can use the two constraints (normalization and having a fixed expectation value of the energy) to solve for the values of the two constants. This procedure gives

$p(z) = \frac{1}{h} e^{-z/h}$,

where $h = k_B T / (m g)$ is the same “typical thickness” that we estimated at the beginning.
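As a sanity check, the sketch below numerically integrates an exponential density $p(z) = e^{-z/h}/h$ and confirms that it satisfies both constraints: it integrates to 1, and its mean altitude is $h$, so the mean potential energy $m g \langle z \rangle$ equals $m g h = k_B T$. The value of `h`, the integration cutoff, and the helper names are assumptions made for illustration.

```python
import math

def integrate(f, upper, steps=200_000):
    """Midpoint-rule integral of f from 0 to `upper`."""
    dx = upper / steps
    return sum(f((i + 0.5) * dx) for i in range(steps)) * dx

h = 7700.0                      # assumed typical thickness in meters (about 5 miles)

def p(z):
    """Exponential altitude density p(z) = exp(-z/h) / h."""
    return math.exp(-z / h) / h

cutoff = 40.0 * h               # the tail beyond 40 h is negligible
norm = integrate(p, cutoff)
mean_z = integrate(lambda z: z * p(z), cutoff)

print(f"normalization = {norm:.6f}")
print(f"<z> / h       = {mean_z / h:.6f}")   # equals <PE> / (k_B T)
print(f"density at 3 km relative to the ground: {math.exp(-3000.0 / h):.2f}")
```

The last line uses the same formula to estimate how much thinner the air is at 3 km, roughly the altitude where the sunburn mentioned earlier starts to become noticeable.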

Maybe this seems like a funny little exercise in calculus to you, but what we just did is actually a big deal. We started with very little knowledge of the system at hand: we didn’t know anything about the composition of Earth’s atmosphere, or how air molecules collide with each other, or any principles of physics at all except for the high-school level formula for gravitational potential energy and the understanding that temperature is a measure of kinetic energy. But that was enough to figure out the precise formula for atmospheric density, just by demanding that such a formula must be the most likely one, in the sense of having the highest entropy.
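The claim that the exponential has the highest entropy can be spot-checked numerically. The sketch below compares it against two other distributions that I picked arbitrarily (a uniform and a half-normal), both tuned to have the same mean altitude. It is only a comparison against two hand-picked alternatives, not a proof, and entropy is reported in units of $k_B$ with altitude measured in units of the mean.

```python
import math

def entropy(pdf, upper, steps=200_000):
    """Entropy -integral of p ln p from 0 to `upper` (units of k_B), midpoint rule."""
    dz = upper / steps
    total = 0.0
    for i in range(steps):
        p = pdf((i + 0.5) * dz)
        if p > 0.0:                      # p ln p -> 0 as p -> 0
            total -= p * math.log(p) * dz
    return total

h = 1.0   # measure altitude in units of the mean altitude

def exponential(z):
    return math.exp(-z / h) / h

def uniform(z):
    # flat between 0 and 2h, so its mean is also h
    return 1.0 / (2.0 * h) if z <= 2.0 * h else 0.0

sigma = h * math.sqrt(math.pi / 2.0)     # half-normal width that gives mean h

def half_normal(z):
    return math.sqrt(2.0 / math.pi) / sigma * math.exp(-z * z / (2.0 * sigma * sigma))

entropies = {
    "exponential": entropy(exponential, 40.0 * h),
    "uniform": entropy(uniform, 40.0 * h),
    "half-normal": entropy(half_normal, 40.0 * h),
}
for name, s in entropies.items():
    print(f"{name:12s} S/k_B = {s:.4f}")
```

All three candidates satisfy the same two constraints, and the exponential comes out on top, as the Lagrange multiplier argument says it must.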

And, it turns out, our derivation is pretty good. Here’s data from the Naval Research Laboratory:

Notice that the density of the atmosphere looks very much like an exponential decay (a straight line on this plot) up until about 80 km of altitude. At higher altitude there’s a sort of crazy increase in temperature (probably due to direct heating from solar radiation and an absence of equilibration with the thicker atmosphere below it) that slows down the decay of atmosphere density.

The Boltzmann Distribution

With a relatively small amount of work, we figured out how thick Earth’s atmosphere is, and how its density falls off with altitude.

But it turns out that what we really just did is something much bigger. We found a way to relate energy — in that last problem, expressed through altitude — to probability.

So let’s take a step back, and look over what we did while thinking of a much bigger, more general problem. Suppose that some system (it could be a single particle, or it could be a set of many particles) has many different configurations that it can take. Let’s say, generically, that some configuration $i$ has energy $E_i$. Now let’s ask: what is the best probability distribution $p_i$ for describing how likely each configuration is?

Despite knowing literally nothing about the specifics of this problem, we can still approach it in exactly the same way as the last one. We say that the distribution must maximize the entropy:

$S = -k_B \sum_i p_i \ln p_i$,

while it is subject to the normalization constraint

$\sum_i p_i = 1$

and the constraint of having a finite average energy $\langle E \rangle$:

$\sum_i p_i E_i = \langle E \rangle$.

These equations all look identical to the ones we wrote down when talking about the atmosphere. So you can more or less just write down the answer now by looking at the previous one, without doing any work:

$p_i \propto e^{-E_i / k_B T}$.
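Here is a rough sketch of how this works for a handful of discrete configurations. Given a list of energies and a value of $k_B T$, it returns probabilities proportional to $e^{-E_i / k_B T}$, and it can also run the logic backwards: given a target average energy, it searches for the $k_B T$ that produces it. The energy levels, function names, and search bounds are all hypothetical choices for illustration.

```python
import math

def boltzmann_probs(energies, kT):
    """Return p_i = exp(-E_i / kT) / Z for a list of energies."""
    e_min = min(energies)                       # shift for numerical stability
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)                            # normalization constant
    return [w / z for w in weights]

def mean_energy(energies, kT):
    """Average energy under the Boltzmann distribution at this kT."""
    return sum(p * e for p, e in zip(boltzmann_probs(energies, kT), energies))

def solve_kT(energies, target_mean, lo=1e-3, hi=1e3, iterations=200):
    """Bisect for the kT whose Boltzmann distribution has the target mean energy.

    Relies on the mean energy increasing monotonically with kT, and assumes
    the answer lies between `lo` and `hi`.
    """
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)                # bisect on a log scale
        if mean_energy(energies, mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

levels = [0.0, 1.0, 2.0, 3.0]                   # hypothetical energy levels
kT = solve_kT(levels, target_mean=1.0)
probs = boltzmann_probs(levels, kT)
print("kT =", round(kT, 4))
print("p  =", [round(p, 4) for p in probs])
```

The two constraints do the same job here as in the atmosphere problem: normalization fixes the overall prefactor, and the average energy fixes the decay constant in the exponent.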

Now this formula is a really big deal. It is called the Boltzmann distribution.

The Boltzmann distribution allows you, very generically, to say how likely some outcome is based only on its energy. The only real assumption behind it is that the system has time to evolve in a sort of random way that explores many possibilities, and that its average quantities are not changing in time. (This set of conditions is what defines equilibrium.)

It’s a formula that rears its head over and over in physics, turning seemingly impossible problems into easy ones, where all the details don’t matter. I’m pretty confident that, if I had discovered it, I would put it on my tombstone also.

Footnote: