What happens if you map every geotagged Wikipedia article - and then analyse it for language use? A team of Oxford University researchers has found out.

What kind of global project is Wikipedia really? Do people just write about things close to home, or is information truly from around the world?

Mark Graham and the team at the Oxford Internet Institute (who've mapped zombies and every geotagged picture on Flickr) decided to find out as part of their research into the state of the internet - and then break it down by different languages.

Graham, who also runs the blogs floatingsheep.org and zerogeography.net, looked at Wikipedia in the Middle East, North Africa and East Africa in the November 2011 versions of the Arabic, Egyptian Arabic, English, French, Hebrew, Persian and Swahili Wikipedias.

Interestingly, the different languages don't work well on one map - largely because articles about the same geotagged place are often reproduced in other languages, too. Instead, the world is split in different ways.

So, first, they took on English Wikipedia. "This encyclopedia is by far the largest, and currently hosts almost 700,000 geotagged articles," says Graham.

Each one of the yellow dots represents the "human effort that has gone into describing some aspect of a place". Says Graham:

The density of this layer of information over some parts of the world is astounding. Some of our future posts will look more closely at measures of inequality in Wikipedia, but it is still hard not to be awed by this cloud of information about hundreds of thousands of events and places around the globe.

Then they looked at other languages too - not the most populous, but still interesting examples of the spread of Wikipedia. Click on the images below to see them full-size.

French

Arabic

Egyptian Arabic

Hebrew

Persian

Swahili

These are admittedly relatively small: Arabic has 24,000 geotagged entries, Hebrew has 15,000, Persian has 21,000, and Egyptian Arabic has only slightly more than 1,000.

Says Graham, there are some:

strange patterns on parts of these maps. If you look closely at the Arabic or Persian maps you might see some interesting patterns (for instance look closely at the patterns in the US). You see a similar sort of unexpected spatial distribution of articles in the map of Swahili Wikipedia (i.e. why are there so many articles in Turkey?). The answer is simply a few dedicated editors creating stub articles about relatively structured topics such as cities in Turkey (in the Swahili Wikipedia) or every county in the US state of Georgia (in the Arabic Wikipedia).

What do you think it says about Wikipedia?
