Geotagged Wikipedia Articles
April 2, 2008
We have updated our database of geotagged Wikipedia articles, increasing the total from 800’000 to 1.2 million articles. The most popular language is still English with 170’000 articles (up from 137’000), followed by Dutch with 107’000 (up from 67’000). Fifth is “Volapük”, a language I have to admit I had never heard of before. It is a constructed language whose vocabulary is derived largely from English, with some German and French. Most articles in Volapük, a name that literally translates to ‘world speak’, are stubs created by Wikipedia bots.
The number of entries for German would have decreased had we not merged the previous parse result with the newest one. The decrease is mainly caused by Wikipedians who develop bots that convert established templates into new templates. The new templates are used in only a minuscule fraction of articles. This trend suggests that while the Wikipedia approach works well for unstructured textual data, it works less well for structured data.
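To illustrate the kind of merge mentioned above, here is a minimal sketch in Python. It is not the actual import code: the function name merge_parses, the (language, title) keys and the sample coordinates are all assumptions made up for this example. The idea is simply that entries from the newest parse win, while entries that only appear in the previous parse are kept instead of being dropped.

```python
def merge_parses(previous, latest):
    """Merge two parse results keyed by (language, title).

    Entries from `latest` take precedence; entries found only in
    `previous` are kept, so an article whose coordinates could not be
    parsed this time is not lost. (Hypothetical helper for illustration.)
    """
    merged = dict(previous)   # start with everything we already had
    merged.update(latest)     # newer coordinates overwrite older ones
    return merged


# Made-up sample data: values are (latitude, longitude).
previous = {
    ("de", "Berlin"): (52.50, 13.40),
    ("de", "Zürich"): (47.37, 8.54),
}
latest = {
    ("de", "Berlin"): (52.52, 13.41),   # re-parsed with updated coordinates
    ("de", "Hamburg"): (53.55, 9.99),   # newly geotagged article
    # ("de", "Zürich") is missing, e.g. because its template changed
}

merged = merge_parses(previous, latest)
```

With this approach the merged result still contains the Zürich entry from the previous parse. The trade-off is that articles whose geotags were removed on purpose also stay in the database until they are cleaned up some other way.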
An application quite popular in the blogosphere these days is Wikinear. It is a very simple application for mobile phones that makes use of some interesting new technologies and web services: OAuth, Fire Eagle, GeoNames and the Google Static Maps API.