10 July 2014

Spatial distribution of geotagged Wikipedia articles (map)

Map of spatial distribution of geotagged Wikipedia articles

This map shows the highly uneven spatial distribution of geotagged Wikipedia articles across 44 language versions of the encyclopaedia. Slightly more than half of the global total of 3,336,473 articles are about places, events and people inside the red circle on the map, which covers only about 2.5% of the world’s land area.

Data
The map is based on Wikipedia data dumps from November 2012 covering 44 languages. We excluded articles with more than four geotags, because such articles are typically lists of geographic features. For each remaining article, we chose its most frequent geotag. If all geotags occurred only once, the first geotag (typically the most important one) was chosen as representative of the article. Additionally, we gathered article metrics: the number of characters and words in the article, the number of links to other Wikipedia articles, the number of external links and the number of in-article references. We mapped the article locations on top of a dataset obtained from Natural Earth, using Buckminster Fuller’s Dymaxion map projection, which has little distortion of shape and area and highlights that there is no ‘right way up’.
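
The geotag selection rule described above can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the map: the function name, the (latitude, longitude) tuple representation and the handling of edge cases are assumptions. In particular, the sketch assumes that the limit of four applies to the total number of geotags in an article, and that any tie for the most frequent geotag is resolved in favour of the one that appears first.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# Hypothetical representation of a geotag: (latitude, longitude).
Geotag = Tuple[float, float]

# Articles with more than four geotags are excluded (typically lists).
MAX_GEOTAGS = 4


def representative_geotag(geotags: Sequence[Geotag]) -> Optional[Geotag]:
    """Return one representative geotag for an article, or None if excluded."""
    # Assumption: articles without geotags are skipped as well.
    if not geotags or len(geotags) > MAX_GEOTAGS:
        return None

    counts = Counter(geotags)
    # Counter keeps first-seen order and max() returns the first maximal
    # element, so ties (including "every geotag occurs once") fall back
    # to the first geotag in the article.
    return max(counts, key=counts.get)


if __name__ == "__main__":
    oxford = (51.75, -1.26)
    london = (51.51, -0.13)
    print(representative_geotag([oxford, london, london]))
    print(representative_geotag([oxford, london]))
```

In the first call the repeated geotag is chosen; in the second, where each geotag occurs once, the first one is used.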

Findings
The map highlights the fact that a majority of the content produced in Wikipedia is about a relatively small part of our planet. This finding supports previous work on the geographical biases of Wikipedia; consider, for example, this visualization of the state of Wikipedia in 2010. We know that different language versions have varying shares of geocoded articles. English, Polish, German, Dutch and French are the Wikipedias with the largest numbers of geotagged articles. Since all of these languages are spoken in Europe, they may contribute significantly to the continent’s dominant position in the map above.

By contrast, other continents are far less represented in the world’s most prominent digital repository of human knowledge. As we pointed out in the post about Africa on Wikipedia, the whole continent of Africa contains only about 2.6% of the world’s geotagged Wikipedia articles, despite having 14% of the world’s population and 20% of the world’s land area.

Further exploring the two groups represented in the map above (the inside and the outside of the red circle), we find that Wikipedia articles inside the circle have had a head start: they are on average a bit older than those outside. Especially in 2005 and 2006, editing activity about this European area picked up much faster than in the rest of the world.


