For the past century, an obscure mathematical principle called Zipf's law has predicted the size of mega-cities all over the world. And nobody knows why.



Back in 1949, the linguist George Zipf noticed something odd about how often people use words in a given language. He found that a small number of words are used all the time, while the vast majority are used very rarely. When he ranked the words in order of popularity, a striking pattern emerged. The top-ranked word was used roughly twice as often as the second-ranked word, and three times as often as the third-ranked word. He called this a rank vs. frequency rule, and found that it could also be used to describe income distributions in any given country, with the richest person making twice as much money as the next richest, and so forth.


Later dubbed Zipf's law, the rank vs. frequency rule also works if you apply it to the sizes of cities. The city with the largest population in any country is generally twice as large as the next-biggest, and so on. Incredibly, Zipf's law for cities has held true for every country in the world, for the past century.


Just take a look at the top-ranked cities in the United States by population. In the 2010 census, the biggest city in the U.S., New York, had a population of 8,175,133. Los Angeles, ranked number 2, had a population of 3,792,621. And the cities in the next three ranks, Chicago, Houston and Philadelphia, clock in at 2,695,598, 2,100,263 and 1,526,006 respectively. The numbers obviously aren't exact, but looked at statistically, they are remarkably consistent with Zipf's predictions.
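To make that comparison concrete, here is a minimal sketch that sets the census figures quoted above against an idealized version of Zipf's rule, in which the city at rank r has 1/r the population of the largest city. The function and variable names are illustrative assumptions, not taken from any source, and the strict 1/r rule is the simplified form described in this article rather than a fitted model.

```python
# 2010 census populations as quoted in the article, in rank order.
CITIES = [
    ("New York", 8_175_133),
    ("Los Angeles", 3_792_621),
    ("Chicago", 2_695_598),
    ("Houston", 2_100_263),
    ("Philadelphia", 1_526_006),
]


def zipf_prediction(largest: int, rank: int) -> float:
    """Idealized Zipf size: the largest city's population divided by rank."""
    return largest / rank


def compare(cities):
    """Return (name, rank, actual, predicted, actual/predicted) for each city."""
    largest = cities[0][1]
    rows = []
    for rank, (name, actual) in enumerate(cities, start=1):
        predicted = zipf_prediction(largest, rank)
        rows.append((name, rank, actual, predicted, actual / predicted))
    return rows


if __name__ == "__main__":
    for name, rank, actual, predicted, ratio in compare(CITIES):
        print(f"{rank}. {name:<13} actual={actual:>9,} "
              f"predicted={predicted:>11,.0f} ratio={ratio:.2f}")
```

On these five figures, each actual population lands within about 10% of the idealized value, which is the sense in which the numbers are "remarkably consistent" without being exact.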


Paul Krugman, who wrote about applying Zipf's law to cities back in 1996, remarked famously:

The usual complaint about economic theory is that our models are oversimplified — that they offer excessively neat views of complex, messy reality. [With Zipf's law] the reverse is true: we have complex, messy models, yet reality is startlingly neat and simple.


The Power Law

In 1999, economist Xavier Gabaix wrote a much-cited paper where he described Zipf's law for cities as a power law. He showed that when U.S. cities are plotted on a graph of rank against population, with both axes on logarithmic scales, they fall along a nearly straight line.


Gabaix noted that this structure holds true even if cities are growing at chaotic rates. But he and other economists also noticed that this tidy power law structure tends to break down once you're no longer looking at mega-cities in the top ranks. Smaller cities, with fewer than 100,000 people, seem to obey a different law and show a more normal distribution of sizes.
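One rough way to see what "power law" means here is to take logarithms of rank and population and measure the slope of the line through the points; under Zipf's law that slope should be close to -1. The sketch below does this with an ordinary least-squares fit on the five census figures quoted earlier. Two caveats: five points are far too few for a serious estimate, and a plain least-squares fit is a simple stand-in chosen for illustration, not necessarily the method used in Gabaix's paper. The function name is hypothetical.

```python
import math

# The five 2010 census populations quoted earlier in the article.
POPULATIONS = [8_175_133, 3_792_621, 2_695_598, 2_100_263, 1_526_006]


def loglog_slope(populations):
    """Least-squares slope of ln(rank) against ln(population).

    Cities are ranked from largest (rank 1) to smallest.
    A slope near -1 is what the idealized Zipf's law predicts.
    """
    sizes = sorted(populations, reverse=True)
    xs = [math.log(size) for size in sizes]
    ys = [math.log(rank) for rank in range(1, len(sizes) + 1)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    print(f"log-log slope: {loglog_slope(POPULATIONS):.2f}")
```

For the five cities above the slope comes out close to -1. Extending the list to smaller cities is where, as the economists noticed, the straight line would be expected to bend.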

At this point, you might be asking: But how exactly are you defining "city," anyway? When you're doing this kind of calculation, it seems arbitrary to say that Boston and Cambridge count as two cities, or that San Francisco and Oakland are separate entities, just because they are separated by bodies of water. Two Swedish geographers had exactly the same question, so they redefined a bunch of regions as "natural cities," based on connectivity of roads and populations rather than political boundaries. And what they found was that even these "natural cities" obeyed Zipf's law.


Why Does Zipf's Law Work on Cities?

So what is it about big cities that makes them show such a predictable distribution of population? As I said earlier, nobody is really sure. We know that city size expands via immigration, and that immigrants tend to flock to the biggest cities because they offer more opportunities. But immigration isn't enough to explain the power law that produces that perfect slope in Gabaix's graph above.


The reasons are also clearly economic, as large cities tend to produce the most wealth. And Zipf's law applies to income distribution. But again, we're left wondering why this power law might appear in those top-rank cities.



There are also exceptions to Zipf's law, as a group of researchers reported in Nature last year. They found that the power law only applies if the group of cities is integrated economically, which would explain why Zipf's law works if you look at cities in a given European nation, but not at the EU as a whole. They write:

In fact, historically, the geographic level for Europe, at which an integrated evolution is observed, is the national state, while in the US, the whole confederation, not each independent state, has collectively and organically evolved towards a distribution of cities that follows Zipf's Law. From this perspective, the US is an organic, integrated economic federation, while the EU has not yet become so, and shows little convergence to such an economic unit . . . It implies that any system which obeys this law must have internal consistency in its size distribution or its sample.


This would seem to support the idea that Zipf's law is a response to economic conditions, since it only works if you compare cities that are connected economically the way cities in a country are.

How Cities Grow

There's another odd rule that applies to cities, too. You could call it the 3/4 power law, and it has to do with the way cities use resources more efficiently, and so become more sustainable, as they grow. For example, if a city doubles in size, the number of gas stations it requires does not double. Instead, the number of gas stations grows in proportion to the population raised to roughly the 0.77 power, so the doubled city runs efficiently with only about 70% more gas stations. While Zipf's law seems to follow other social laws, the 3/4 power law imitates a natural law: one that governs how animals use energy as they get larger.


The mathematician Steven Strogatz puts it like this:

For example, suppose you measure how many calories a mouse burns per day, compared to an elephant. Both are mammals, so at the cellular level you might expect they shouldn't be too different. And indeed, when the cells of 10 different mammalian species were grown outside their host organisms, in a laboratory tissue culture, they all displayed the same metabolic rate. It was as if they didn't know where they'd come from; they had no genetic memory of how big their donor was. But now consider the elephant or the mouse as an intact animal, a functioning agglomeration of billions of cells. Then, on a pound for pound basis, the cells of an elephant consume far less energy than those of a mouse. The relevant law of metabolism, called Kleiber's law, states that the metabolic needs of a mammal grow in proportion to its body weight raised to the 0.74 power. This 0.74 power is uncannily close to the 0.77 observed for the law governing gas stations in cities. Coincidence? Maybe, but probably not. There are theoretical grounds to expect a power close to 3/4. Geoffrey West of the Santa Fe Institute and his colleagues Jim Brown and Brian Enquist have argued that a 3/4-power law is exactly what you'd expect if natural selection has evolved a transport system for conveying energy and nutrients as efficiently and rapidly as possible to all points of a three-dimensional body, using a fractal network built from a series of branching tubes — precisely the architecture seen in the circulatory system and the airways of the lung, and not too different from the roads and cables and pipes that keep a city alive.
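The arithmetic behind these exponents is short enough to sketch. Assuming a quantity that grows as size raised to a fixed power (0.77 for gas stations and 0.74 for metabolism, the two figures quoted above), the snippet below computes how much the total grows when size doubles, and how much the amount per resident or per pound shrinks. The function names are illustrative, and the pure power-law form is a simplifying assumption.

```python
def growth_factor(size_ratio: float, exponent: float) -> float:
    """Factor by which the total grows when size is multiplied by size_ratio."""
    return size_ratio ** exponent


def per_unit_factor(size_ratio: float, exponent: float) -> float:
    """Factor by which the amount per unit of size (per resident, per pound) changes."""
    return size_ratio ** (exponent - 1)


if __name__ == "__main__":
    cases = [("gas stations", 0.77), ("metabolism", 0.74)]
    for label, exponent in cases:
        total = growth_factor(2, exponent)
        per_unit = per_unit_factor(2, exponent)
        print(f"{label}: doubling size multiplies the total by {total:.2f} "
              f"and the per-unit amount by {per_unit:.2f}")
```

With an exponent of 0.77, doubling the size multiplies the total by 2 to the power 0.77, or about 1.7, so the per-unit amount falls. Any exponent below 1 gives the same qualitative result: bigger cities and bigger animals need less per head or per pound.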


This is terrifically fascinating, but is ultimately less mysterious than Zipf's law. It's not difficult to understand why a city — which is essentially an ecosystem, albeit one built by humans — should follow natural laws. But Zipf's law is something that seems to have no natural analogue. It's social, and as I mentioned earlier, it only holds true for cities over the past 100 years.

All we know is that Zipf's law applies to a lot of other social systems, including economic and linguistic ones. So it's possible that there are general social rules at work driving this odd rank vs. size rule, which one day we may understand. Whoever can puzzle it out may find that they have the key to predicting a lot more than urban growth. Zipf's law may be just one aspect of a fundamental rule of social dynamics that underwrites how we communicate, trade, and form communities with each other.


Annalee Newitz is the editor-in-chief of io9. She is the author of Scatter, Adapt and Remember: How Humans Will Survive a Mass Extinction.


Thanks to Mikolaj Szabó for discussing power laws and lognormal distributions!