Testing Benford’s Law

Imagine a large dataset, such as a list of every country and its population.

Country      Population
Afghanistan  29,117,000
Albania      3,195,000
Algeria      35,423,000
Andorra      84,082
Angola       18,993,000
             ↑ Leading digit

Chances are, the leading digit will be a 1 more often than a 2, a 2 more often than a 3, and so on.

This odd phenomenon is Benford's Law. If the leading digits 1 through 9 were equally likely, each would appear about 11% of the time, but Benford's Law predicts a logarithmic distribution in which 1 leads about 30% of the time and 9 less than 5%. It occurs so regularly that it is even used to detect accounting fraud.
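For readers who want the numbers, the logarithmic distribution is usually written as P(d) = log10(1 + 1/d) for a leading digit d from 1 to 9. The short Python sketch below simply evaluates that formula; the function name `benford_expected` is made up for this illustration and is not taken from this project's code.

```python
import math


def benford_expected(digit: int) -> float:
    """Predicted share of values whose leading digit is `digit` (1-9)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


# Hypothetical helper table: digit -> predicted share.
expected = {d: benford_expected(d) for d in range(1, 10)}

for d, share in expected.items():
    print(f"{d}: {share:.1%}")
```

The nine shares add up to 1, and each digit is predicted to lead less often than the one before it.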

See the Wikipedia article for a more thorough discussion.

This is a simple experiment to see how many large, publicly accessible datasets satisfy Benford’s Law.
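One way such a check might look in Python is sketched below. It is not necessarily how this site does it: the function names are invented for the example, and the mean absolute deviation is just one simple choice of measure. The five populations come from the table above and are far too few to say anything about Benford's Law; they only show the mechanics.

```python
import math
from collections import Counter


def leading_digit(n: int) -> int:
    """Return the first decimal digit of a nonzero integer."""
    if n == 0:
        raise ValueError("zero has no leading digit")
    return int(str(abs(n))[0])


def leading_digit_shares(values):
    """Map each digit 1-9 to the fraction of values that start with it."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one value")
    return {d: counts[d] / total for d in range(1, 10)}


def mean_absolute_deviation(values):
    """Average gap between observed shares and Benford's predicted shares.

    Assumption: this measure is chosen for simplicity; a real test might
    prefer a chi-squared statistic or something similar.
    """
    observed = leading_digit_shares(values)
    gaps = (abs(observed[d] - math.log10(1 + 1 / d)) for d in range(1, 10))
    return sum(gaps) / 9


# Sample rows from the table above (illustration only).
populations = [29_117_000, 3_195_000, 35_423_000, 84_082, 18_993_000]

print([leading_digit(p) for p in populations])  # [2, 3, 3, 8, 1]
print(f"{mean_absolute_deviation(populations):.3f}")
```

The smaller the deviation, the closer the dataset's leading digits sit to the predicted distribution. With a full dataset in place of the five sample rows, the same functions apply unchanged.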

This site is on GitHub. Please help out by forking the project and adding more datasets.