Have you ever found yourself looking up John Smith on Wikipedia, only to discover that there are 205 different John Smiths with Wikipedia pages? It’s a testament to the breadth of knowledge on Wikipedia, but it can also be kind of annoying: what if you just want to know the real deal about the English explorer John Smith’s encounter with Pocahontas?

I found myself in this situation recently, and decided it’d be interesting to find the longest disambiguation page on all of Wikipedia. John Smith has 205 entries, which seems like a lot, but maybe other generic terms have even more?

Lots of John Smiths!

Luckily, Wikipedia provides an alphabetical list of all ~250,000 disambiguation pages. I modified the Rap Genius Trackback Scraper to iterate through every disambiguation page, count the list items in each page’s “may refer to” section, and store the results in a database.
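For the curious, here’s roughly what the counting step could look like. This is a minimal Python sketch, not the scraper I actually used: `ListItemCounter`, `count_list_items`, and the sample HTML are all made up for illustration, and it assumes you’ve already fetched a page and trimmed it down to its “may refer to” section (a full page would also contain navigation lists that you’d want to skip).

```python
from html.parser import HTMLParser


class ListItemCounter(HTMLParser):
    """Counts <li> start tags seen while parsing an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "li" also matches <LI>.
        if tag == "li":
            self.count += 1


def count_list_items(html):
    """Return the number of list items in the given HTML fragment."""
    parser = ListItemCounter()
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical fragment standing in for a page's "may refer to" section.
sample = """
<p><b>John Smith</b> may refer to:</p>
<ul>
  <li>John Smith (explorer)</li>
  <li>John Smith (politician)</li>
  <li>John Smith (wrestler)</li>
</ul>
"""

print(count_list_items(sample))
```

Run that over every page in the list, save each title with its count, and sorting by count gives you the ranking.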

Without further ado, the top 10 longest Wikipedia disambiguation pages:

St. Mary’s Church is the most ambiguous term on Wikipedia, followed by Communist Party, and Aliabad, which is apparently a common Persian town name. Now if only we could get one of the many Communist Parties to hold a group meeting at a St. Mary’s Church in an Aliabad…

Other tidbits:

It’s a bit surprising to see so many Persian town names at the top of the list. Closer investigation reveals that a single Wikipedia user, Carlossuarez46, seems to have contributed most of the edits to those pages.

William Smith just beats out John Smith as the most ambiguous person, by a score of 211 to 205

The top scientific term is the species abbreviation C. elegans, with 223 “may refer to” links

Church names are heavily represented. The longest St. [name] Church formulations are: Mary (584), John (211), Peter (197), George (164), and Michael (159)

And the longest First [branch] Church formulations: Lutheran (279), Presbyterian (230), Baptist (218), Congregational (94), and Church of Christ, Scientist (70)

The distribution of disambiguation page lengths shows a heavy right skew: the median length is 4 “may refer to” links, the mean length is 7.1, and the most common length is 2, which accounts for 25% of all disambiguation pages
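If you want to compute the same kind of summary yourself, here’s a small Python sketch using the standard library’s `statistics` module. The `lengths` list is made-up toy data, not the real dataset, so the numbers it produces won’t match the ones above. It just shows the pattern to look for: a handful of very long pages pulls the mean above the median, which is what a right skew looks like.

```python
import statistics

# Made-up page lengths for illustration only (NOT the real dataset).
lengths = [2, 2, 2, 3, 4, 4, 5, 7, 12, 40]

median_length = statistics.median(lengths)
mean_length = statistics.mean(lengths)
most_common_length = statistics.mode(lengths)

# Fraction of pages whose length equals the most common length.
share_at_mode = lengths.count(most_common_length) / len(lengths)

print("median:", median_length)
print("mean:", mean_length)
print("most common length:", most_common_length)
print("share at most common length:", share_at_mode)

# In a right-skewed distribution the mean sits above the median.
print("right-skewed:", mean_length > median_length)
```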



Here’s a Google Spreadsheet with the top 1,000 longest pages, and you can download the full dataset as a .csv from GitHub.