Wikipedia Web Services

March 30, 2011

Over the last couple of weeks, a new data extract for the Wikipedia web services was implemented and deployed. The major change is the dramatically increased number of geolocated Wikipedia articles.

A new attribute ‘rank’ has been added to the XML and JSON responses. It gives an indication of the popularity or relevancy of an article. The rank is an integer from ‘1’ for the least popular articles to ‘100’ for the most popular articles. It is calculated from the number of links pointing to an article and the article length. The articles are more or less evenly distributed over the 100 ranks.
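As a rough illustration of how a client might use the new attribute, the sketch below keeps only the more popular articles from a JSON response and sorts them by rank. The sample payload, the ‘geonames’ wrapper key, the article titles, and the helper name `top_articles` are assumptions made for this example; only the ‘rank’ attribute and its 1 to 100 range come from the description above.

```python
import json

# Hypothetical sample response: the "geonames" wrapper and the titles are
# assumptions for illustration; only the "rank" attribute is from the post.
SAMPLE_RESPONSE = """
{"geonames": [
  {"title": "Article A", "rank": 97},
  {"title": "Article B", "rank": 12},
  {"title": "Article C", "rank": 55}
]}
"""


def top_articles(response_text, min_rank=50):
    """Return articles whose rank is at least min_rank, most popular first."""
    articles = json.loads(response_text).get("geonames", [])
    # Articles without a rank are treated as rank 0 and therefore dropped
    # for any positive min_rank.
    ranked = [a for a in articles if int(a.get("rank", 0)) >= min_rank]
    return sorted(ranked, key=lambda a: int(a.get("rank", 0)), reverse=True)


titles = [a["title"] for a in top_articles(SAMPLE_RESPONSE)]
```

Because the articles are spread more or less evenly over the 100 ranks, a threshold such as `min_rank=50` would be expected to keep roughly the more popular half of the results.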

The ‘elevation’ field is now filled for nearly all articles: where no elevation could be parsed from the article itself, it was filled in with a reverse-geocoded value from SRTM3 or ASTER. The ‘countryCode’ coverage has also been improved. The attributes ‘population’ and ‘elevation’ are no longer set to ‘0’ for unknown values; they are left empty instead.
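Client code that previously treated ‘0’ as "unknown" may need a small adjustment, since a missing value and a genuine zero are now distinct. The following is a minimal sketch of one way to read these fields; the helper name `optional_int` and the sample articles are hypothetical, and it assumes an unknown value shows up either as an absent key or as an empty string.

```python
from typing import Optional


def optional_int(article: dict, field: str) -> Optional[int]:
    """Read an integer field, returning None when the value is unknown.

    Assumption: an unknown value appears as a missing key or an empty string.
    """
    value = article.get(field)
    if value is None or value == "":
        return None
    return int(value)


# Hypothetical articles, for illustration only.
with_elevation = {"title": "Article A", "elevation": 412}
without_elevation = {"title": "Article B"}
empty_elevation = {"title": "Article C", "elevation": ""}
sea_level = {"title": "Article D", "elevation": 0}

known = optional_int(with_elevation, "elevation")
missing = optional_int(without_elevation, "elevation")
empty = optional_int(empty_elevation, "elevation")
zero = optional_int(sea_level, "elevation")
```

With this approach, an elevation of 0 is kept as a real value, while unknown values come back as `None` and can be skipped or displayed as "n/a".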

3 Responses to “Wikipedia Web Services”


  1. [...] updated the web service for access to articles from [...]


  2. Hello,

    I noticed there are still a lot of geolocated Wikipedia articles missing. Has the data been loaded yet, or is it still in testing?

    Andrew.

    • marc Says:

      The data was loaded weeks ago. Articles with unusual geo templates are not included, as the parser will miss them. It is a pity that people continue to invent new templates instead of using the existing ones.

