Geotagged Wikipedia Articles
April 2, 2008
We have updated our database of geotagged Wikipedia articles and increased the total number of articles from 800,000 to 1.2 million. The most popular language is still English with 170,000 articles (up from 137,000), followed by Dutch with 107,000 (up from 67,000). Fifth is “Volapük”, a language I have to admit I had never heard of before. It is a constructed language derived from English and German. Most articles in Volapük, which literally translates to ‘world speak’, are stubs created by Wikipedia bots.
The number of entries for German would have decreased had we not merged the previous parse result with the newest parse. The decrease is mainly caused by Wikipedians who develop bots to convert established templates into new templates. The new templates are used for only a minuscule fraction of articles. This trend seems to show that while the Wikipedia approach works well for unstructured textual data, it does not work so well for structured data.
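The merge mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data layout (a dict keyed by article title, mapping to coordinates) and the name `merge_parses` are hypothetical and not taken from the actual GeoNames loader.

```python
from typing import Dict, Tuple

Coords = Tuple[float, float]  # (latitude, longitude)


def merge_parses(previous: Dict[str, Coords], newest: Dict[str, Coords]) -> Dict[str, Coords]:
    """Keep every article from the previous parse; the newest parse wins on conflicts."""
    merged = dict(previous)  # start from the older result so no article is lost
    merged.update(newest)    # newer coordinates override older ones
    return merged


# Hypothetical sample data: "Berlin" dropped out of the newest parse,
# for example because its coordinate template was changed by a bot.
previous = {"Zürich": (47.37, 8.54), "Berlin": (52.52, 13.40)}
newest = {"Zürich": (47.38, 8.54), "Hamburg": (53.55, 9.99)}

merged = merge_parses(previous, newest)
```

With this approach an article that silently disappears from the newest parse keeps its older coordinates. The price is that genuinely deleted articles linger until they are cleaned up some other way.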
Wikinear
An application quite popular in the blogosphere these days is Wikinear. It is a very simple application for mobile phones that makes use of some interesting new technologies and web services: OAuth, Fire Eagle, GeoNames and the Google Static Maps API.
Wikipedia Thumbnail Images
February 15, 2007
The first Wikipedia load this year has brought the total number of georeferenced Wikipedia articles available on GeoNames to 611,758. English will soon cross the magic number of 100,000 (currently 99,333, up from 81,282 in November).
Thumbnail images for Wikipedia articles are a new experimental addition to the GeoNames web services, the full-text search and the maps mashup. Around a third of all articles on GeoNames have thumbnail images. A simple algorithm determines which image to use as the thumbnail if more than one image could be parsed from the original article.
Wikipedia Load
November 5, 2006
With the newest load we have added French to the languages for which GeoNames supports Wikipedia full-text search and text blurbs. These features are now available in English, German, French, Spanish and Polish.
Since July the total number of entries has increased from 230,000 to 500,000, an increase of 117%. The number of entries in English has increased from 53,000 to 81,000 (52%) and the number of entries in German from 38,000 to 51,000 (26%). Other languages are catching up: Dutch, French and Italian also have around 50,000 geolocated entries each.
You can find detailed numbers on the GeoNames Wikipedia page:
The numbers for previous months are available here:
Wikipedia Load
July 16, 2006
With the newest load we have added Polish to the languages for which GeoNames supports Wikipedia full-text search and text blurbs. These features are now available in English, German, Spanish and Polish.
The total number of entries has increased from 180,000 to 230,000, an increase of 27%. The number of entries in English has increased from 44,000 to 53,000 (20%) and the number of entries in German from 32,000 to 38,000 (18%).
You can find detailed numbers on the GeoNames Wikipedia page:
The numbers for previous months are available here:
Wikipedia Load
April 23, 2006
Today I loaded GeoNames with the newest Wikipedia dumps for English, German and Spanish.
Here are some numbers comparing today's load with that of 4 March 2006. (I didn’t have time to generate numbers for the load of 2 April 2006.)
The total number of entries has increased from 155,000 to 180,000, an increase of 17%. The number of entries in English has increased from 40,000 to 44,000 (10%) and the number of entries in German from 28,000 to 32,000 (17%). The number of languages is still around 190.
You can find detailed numbers on the GeoNames Wikipedia page:
Numbers for previous months are backed up here:
By the way, the GeoNames web service “findNearbyWikipedia” has recently become the service most often called per day among all our web services. A close second is the “full text search”.
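For readers who have not used it, a call to “findNearbyWikipedia” might look roughly like the sketch below. It assumes the present-day JSON endpoint (`api.geonames.org/findNearbyWikipediaJSON`) with its `lat`, `lng`, `radius`, `maxRows` and `username` parameters, which may differ from the endpoint in use when this post was written. The helper names and the sample payload are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the current JSON variant of the service.
BASE_URL = "http://api.geonames.org/findNearbyWikipediaJSON"


def build_url(lat, lng, username, radius_km=10, max_rows=5):
    """Build the request URL; username is your own GeoNames account name."""
    params = {
        "lat": lat,
        "lng": lng,
        "radius": radius_km,
        "maxRows": max_rows,
        "username": username,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def parse_titles(payload):
    """Extract (title, distance) pairs from a JSON response body."""
    data = json.loads(payload)
    return [(entry["title"], entry.get("distance")) for entry in data.get("geonames", [])]


def find_nearby(lat, lng, username):
    """Perform the actual HTTP request (needs network access and a valid account)."""
    with urlopen(build_url(lat, lng, username), timeout=10) as response:
        return parse_titles(response.read().decode("utf-8"))


# Made-up sample payload in the general shape of a response, for offline use.
sample = '{"geonames": [{"title": "Example Place", "distance": "0.42", "lang": "en"}]}'
titles = parse_titles(sample)
url = build_url(47.0, 9.0, "demo")
```

Building the URL and parsing the response are kept separate from the network call, so both can be tried without an account. Only `find_nearby` actually contacts the service.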
