Geotagged Wikipedia Articles
April 2, 2008
We have updated our database of geotagged Wikipedia articles, increasing the total number of articles to 1.2 million, up from 800′000. The most popular language is still English with 170′000 articles (up from 137′000), followed by Dutch with 107′000 (up from 67′000). Fifth is “Volapük”, a language I have to admit I had never heard of before. It is a constructed language derived from English and German. Most articles in Volapük, which literally translates to ‘world speak’, are stubs created by Wikipedia bots.
The number of entries for German would have decreased had we not merged the previous parse result with the newest one. The decrease is mainly caused by Wikipedians who develop bots that convert established templates into new ones. The new templates are used for only a minuscule fraction of articles. This trend suggests that while the Wikipedia approach works well for unstructured textual data, it works less well for structured data.
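The merge mentioned above can be sketched roughly as follows. This is only a minimal illustration, not the actual GeoNames loader: the function name `merge_parses`, the `(language, title)` key and the sample coordinates are all assumptions made for the example.

```python
def merge_parses(previous, newest):
    """Combine two parse results keyed by (language, title).

    Hypothetical helper: articles found in the newest parse win, and
    articles that dropped out of it keep their previous coordinates.
    """
    merged = dict(previous)  # start from the older parse
    merged.update(newest)    # newer coordinates override older ones
    return merged


# Made-up sample data: "Bern" is missing from the newest parse,
# for example because its template was changed.
previous = {("de", "Zürich"): (47.37, 8.54), ("de", "Bern"): (46.95, 7.45)}
newest = {("de", "Zürich"): (47.3769, 8.5417)}
merged = merge_parses(previous, newest)
```

One trade-off of this approach is that articles which were genuinely deleted, or whose coordinates were removed on purpose, also survive the merge.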
Wikinear
An application quite popular in the blogosphere these days is Wikinear. It is a very simple application for mobile phones that makes use of some interesting new technologies and web services: OAuth, Fire Eagle, GeoNames and the Google Static Maps API.

April 25, 2008 at 2:16 pm
We’ve implemented this great feature in Fresh Logic Studios – Atlas. It hasn’t made it into the menu system yet, but you can access it from this URL: http://atlas.freshlogicstudios.com/?Wikipedia
More info here: http://blogs.freshlogicstudios.com/Posts/View.aspx?Id=2afe58af-ae92-44f3-a326-0781bda8d76e
June 27, 2008 at 8:21 pm
I’m curious about what tag you scan for when you collect the geotagged Wikipedia articles. I know there are a few types out there. My guess is that you’re scanning for the coor template: {{coor}}.
I’ve noticed that there are a number of articles with other types of geotags that don’t appear to be in the geonames database.
Also, I’m curious about how frequently the database is updated. I’m in the process of converting the geotags on some articles from older styles to {{coor}} style in the hope that geonames will pick them up, but I don’t know how long I’ll have to wait to see them appear.
I’m building a little app that is very similar to wikinear.com and that’s why I’m asking.
Thanks!
June 29, 2008 at 8:46 am
Hi Dave
Yes, this is correct: we are parsing the ‘coor’ template together with some others. We would like to see consensus in the Wikipedia community on a common template. One problem we run into with every dump is that some articles disappear from our parse because their template has been changed to something new and fancy. It is a real pity that many Wikipedians don’t appreciate the value of the data that a common template would capture.
How often we load it depends on the problems we are facing. Sometimes a lot of articles are missing for a language. Then we stop and wait another month, hoping that other Wikipedia users will revert to a template our parser understands.
Marc
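To illustrate why template changes make articles drop out of a parse, here is a minimal sketch of a ‘coor’ parser. It is not the GeoNames parser. It assumes the {{coor d|…}}, {{coor dm|…}} and {{coor dms|…}} parameter order of degrees (then optional minutes and seconds), hemisphere letter, and the same again for longitude. The names `COOR_RE` and `parse_coor` are made up for the example.

```python
import re

# Matches {{coor d|...}}, {{coor dm|...}} and {{coor dms|...}} only.
COOR_RE = re.compile(r"\{\{\s*coor\s+(d|dm|dms)\s*\|([^{}]*)\}\}", re.IGNORECASE)


def _to_decimal(parts, hemisphere, allowed):
    """Convert degree/minute/second strings plus a hemisphere to decimal degrees."""
    hemisphere = hemisphere.upper()
    if hemisphere not in allowed:
        raise ValueError("unexpected hemisphere: %r" % hemisphere)
    degrees = sum(float(p) / 60 ** i for i, p in enumerate(parts))
    return -degrees if hemisphere in ("S", "W") else degrees


def parse_coor(wikitext):
    """Return (lat, lon) from the first coor template, or None if none parses."""
    match = COOR_RE.search(wikitext)
    if match is None:
        return None
    n = len(match.group(1))  # "d" -> 1, "dm" -> 2, "dms" -> 3 numbers per axis
    fields = [f.strip() for f in match.group(2).split("|")]
    if len(fields) < 2 * (n + 1):
        return None
    try:
        lat = _to_decimal(fields[:n], fields[n], ("N", "S"))
        lon = _to_decimal(fields[n + 1:2 * n + 1], fields[2 * n + 1], ("E", "W"))
    except ValueError:
        return None
    return lat, lon
```

An article that has been switched to a template this pattern does not know, such as {{coord|…}}, returns None here and would silently vanish from the result until the parser is extended.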
July 1, 2008 at 12:59 am
Marc, thanks for your answer and keep up the good work!
November 14, 2009 at 11:27 am
Easiest way to get past the parsing template issue is if Wikipedia defines a standard tag for all Wikipedia pages and incorporates that as a feature on Wikipedia so even non-authors can contribute by quickly geotagging all articles that are relevant to their location that they know.
An example would be something like what Flickr offers for geotagging a photo after it has been uploaded, or what Panoramio has. But in Wikipedia’s case we may want any contributor to be able to geotag a page.
This would dramatically speed up and accurately place the geotagging through crowdsourcing (sort of like Wikipedia intended in the first place).
I was very frustrated when I used Layar the other day and knew there were places with Wikipedia pages right in front of me but nothing appeared in Layar, and I am clueless as to how to go and tag those articles myself now.
November 14, 2009 at 12:12 pm
Update to my comment above: it seems that most pages are in fact geotagged now, but many are just not appearing in the Layar Wikipedia layer. I have managed to update one incorrect coordinate on Wikipedia itself without too much difficulty, but it would still be great if that interactive map also allowed interactive updates (though it would still have to prompt for a reason for the change).