Visualizations

Stephen Von Worley from datapointed.net has published some beautiful visualizations of the GeoNames database. Each of the 7.5-million geographic features is represented by a single dot, colored additively: blue for water, green for land, and red for manmade structures.

US places
datapointed.net

Besides their pure beauty, the visualizations also reveal some interesting artifacts (for instance a band in Ireland with lower feature density). Some of them (like the US-Canadian border) are caused by different data providers (USGS vs GeoBase) and therefore different feature density or feature type assignment. Others are caused by the data entry process itself. A lot of GeoNames data comes from paper maps, and in some areas you can see the outline of the original map sheets used as the data source.
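The additive coloring of the dots can be sketched in a few lines. This is not the code used by datapointed.net; the mapping from GeoNames feature classes to colors, the equirectangular projection and the `step` intensity are all assumptions made for illustration.

```python
# Assumed mapping from GeoNames feature class to an RGB contribution.
CLASS_COLOR = {
    "H": (0, 0, 1),  # hydrographic features -> blue (water)
    "T": (0, 1, 0),  # terrain -> green (land)
    "V": (0, 1, 0),  # vegetation -> green (land)
    "S": (1, 0, 0),  # spots and buildings -> red (manmade)
    "P": (1, 0, 0),  # populated places -> red (manmade)
    "R": (1, 0, 0),  # roads and railroads -> red (manmade)
}


def render(features, width, height, step=64):
    """Add one dot per (lat, lng, feature_class) tuple to an RGB grid.

    Dots falling on the same pixel add up per channel, clamped at 255.
    """
    grid = [[[0, 0, 0] for _ in range(width)] for _ in range(height)]
    for lat, lng, fclass in features:
        color = CLASS_COLOR.get(fclass)
        if color is None:
            continue  # classes without an assumed color are skipped
        # Simple equirectangular projection onto the pixel grid.
        x = min(int((lng + 180.0) / 360.0 * width), width - 1)
        y = min(int((90.0 - lat) / 180.0 * height), height - 1)
        pixel = grid[y][x]
        for i in range(3):
            pixel[i] = min(255, pixel[i] + color[i] * step)
    return grid
```

A lake and a building on the same pixel then mix to magenta, which is why dense areas with many feature types tend towards white.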

New Countries: Curaçao, Sint Maarten and “Bonaire, Saint Eustatius and Saba”

Three new countries came into being after the dissolution of the Netherlands Antilles in October 2010. Curaçao and Sint Maarten became countries within the Kingdom of the Netherlands whereas Bonaire, Saba, and Sint Eustatius became special municipalities of the Netherlands proper.  ISO assigned the code BQ to the three BES islands, which results in the following table:

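Name | ISO2 | ISO3 | ISO# | FIPS | CURR
-----------------------------------+------+------+------+------+---------
Bonaire, Saint Eustatius and Saba | BQ | BES | 535 | NL | USD
Curaçao | CW | CUW | 531 | UC | ANG/CMG
Sint Maarten | SX | SXM | 534 | NN | ANG/CMG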

The island of Saint Martin is divided into the French northern part (MF) and the Dutch southern part (SX). The French northern part seceded from Guadeloupe in 2007.

Note: The newly assigned ISO code ‘BQ’ may cause some issues as it was already in use in the past. Until 1979 it referred to the British Antarctic Territory (codes BQ, ATB), before the territory was merged with Antarctica (AQ).

GeoNames Ontology 2.2

Version 2.2.1 of the GeoNames Ontology has been released. Navigation within the administrative hierarchy has been made easier with the addition of the GeoNames URIs for the parentCountry and the parentADM(1-4) divisions. The proprietary names are now subproperties of the standard rdfs:label and skos:altLabel.

The properties ‘nearby’ and ‘neighbour’ have been implemented inline within the feature document for the Premium Data Subscription.

Changes:

  • add and use ‘gn’ namespace gn="http://www.geonames.org/ontology#" for GeoNames Ontology properties
  • add namespace rdfs="http://www.w3.org/2000/01/rdf-schema#"
  • add new property rdfs:isDefinedBy
  • change proprietary property gn:name to be subproperty of rdfs:label
  • change proprietary property gn:alternateName to be subproperty of skos:altLabel
  • new gn:officialName for alternate names with ‘isPreferred’ flag
  • new gn:shortName for alternate names with ‘isShort’ flag
  • new parentCountry
  • new parentADM(1-4)
  • implemented ‘nearby’ (within the document, only available with the Premium Data Subscription)
  • implemented ‘neighbour’ (within the document, only available with the Premium Data Subscription)

7. Oct Update: The current version is 2.2.1 and fixes some issues with name properties in v2.2. Version 2.2 is deprecated.
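To illustrate how the new properties might be read, here is a minimal sketch using only the Python standard library. The RDF/XML sample is hand-written for this example and is not actual GeoNames output; the feature URIs in it are illustrative. Only the namespaces and the property names gn:name, gn:parentCountry and rdfs:isDefinedBy come from the change list above.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "gn": "http://www.geonames.org/ontology#",
}

# Hand-written sample document (an assumption, not real service output).
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:gn="http://www.geonames.org/ontology#">
  <gn:Feature rdf:about="http://sws.geonames.org/2657896/">
    <rdfs:isDefinedBy rdf:resource="http://sws.geonames.org/2657896/about.rdf"/>
    <gn:name>Zürich</gn:name>
    <gn:parentCountry rdf:resource="http://sws.geonames.org/2658434/"/>
  </gn:Feature>
</rdf:RDF>
"""

RDF_RESOURCE = "{%s}resource" % NS["rdf"]
RDF_ABOUT = "{%s}about" % NS["rdf"]


def read_feature(rdf_xml):
    """Return URI, name and parent country URI of the first gn:Feature."""
    root = ET.fromstring(rdf_xml.encode("utf-8"))
    feature = root.find("gn:Feature", NS)
    parent = feature.find("gn:parentCountry", NS)
    return {
        "uri": feature.get(RDF_ABOUT),
        "name": feature.findtext("gn:name", namespaces=NS),
        # parentCountry points to another feature by URI.
        "parentCountry": parent.get(RDF_RESOURCE) if parent is not None else None,
    }


feature = read_feature(SAMPLE)
```

Because parentCountry is a URI rather than a literal, a client can follow it to fetch the document of the parent feature and walk up the hierarchy.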

Premium Data Subscription

GeoNames is launching a Premium Data Subscription as a quality offering for professional users. The Premium Data is released monthly on an annual subscription basis. The first release ‘September 2010’ is now available.

Two approaches are used to find and eliminate errors and inconsistencies. On the one hand, all modifications are monitored and sorted by relevance, and all relevant modifications are verified and confirmed by a member of the GeoNames team. On the other hand, over a hundred consistency checks have been implemented to spot quality problems.
The Premium Data also includes release notes, documentation and additional data files. It is available in csv and rdf format.

Sponsoring GeoNames

GeoNames is looking for sponsors as the cost for servers and bandwidth is increasing. We have added an additional server for the download/dump and we had to reduce the maximum number of allowed credits on the free services to cope with the growing number of requests.

GeoNames Sponsors are listed on the high ranking donation page with logo and link.

Benefits for Sponsors:

  • Logo and link on sponsors/donation page
  • Helps cover costs for running the project
  • Ensures the sustainability and health of the project
  • Provides funding to improve consistency, fix errors and add missing data

MyTrip

UK open public data

Ordnance Survey, the British national mapping agency, released open data at the beginning of this month. This is a 180-degree change for Britain. It used to have one of the most restrictive public data policies and hardly any official geo data was available. The new British open data policy is one of the most liberal policies worldwide and we can hope that other countries, in particular in Europe, will also understand the importance of unrestricted access to public data and adopt similar open policies.

The data is not only called ‘open’, it really is open. Unlike other ‘open data’ projects that use restrictive share-alike licenses, the OS OpenData is available under a so-called ‘OS OpenData License’, which is aligned and interoperable with the Creative Commons Attribution license.

Of interest for GeoNames are several datasets:

  • 1.7 million postal codes (code-point)
  • 1:50’000 Gazetteer (260’000 toponyms)
  • admin boundaries

It will take some time and work until we can make optimal use of the data. The existing ‘outcode’ based postal codes have already been replaced with the new data. On the free ws.geonames.org server the postal code web services (reverse and search) are now returning the full 1.6 million postal codes. The dataset does not include postal codes for NIR, IM, GY and JE. For those regions we continue using the previous data.

The postal code dump directory now contains two files for GB: the default file with the outcodes and an additional file with the full postal codes. It is not yet clear whether we should continue with the outcodes or replace them entirely with the full codes. What do you think? Please share your ideas and requirements in the comments below.
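For readers unfamiliar with the distinction: the outcode is the first part of a UK postal code, and the remaining inward part is always three characters long. The following sketch shows how a full code could be reduced to its outcode; the function name and the length check are assumptions for illustration and not part of the GeoNames import.

```python
def split_postcode(postcode):
    """Split a full UK postal code into (outcode, incode).

    The inward code is always the last three characters; everything
    before it is the outcode. Whitespace and case are normalized.
    """
    compact = "".join(postcode.split()).upper()
    # Full UK postal codes have 5 to 7 characters without the space.
    if not 5 <= len(compact) <= 7:
        raise ValueError("not a full UK postal code: %r" % postcode)
    return compact[:-3], compact[-3:]


outcode, incode = split_postcode("sw1a 1aa")
```

Grouping the full codes by outcode in this way gives back the coarser dataset, so one file can in principle be derived from the other.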

In order to use the admin boundaries we will have to clean up the existing admin divisions and align them with the Ordnance Survey divisions.

The gazetteer unfortunately only has a very high-level ‘feature code’, and for about half of the toponyms the feature code is missing entirely (X).

fcode | count
-------+--------
X | 128662 (all other features)
O | 41228 (other)
FM | 34723 (Farm)
W | 24425 (Water)
H | 14524 (Hill or mountain)
F | 8708 (Forest or wood)
A | 5252 (antiquity non-roman)
T | 1259 (town)
R | 237 (roman antiquity)
C | 62 (city)
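The share of each feature code follows directly from the counts in the table. A small sketch of the arithmetic, with the counts copied from the table above:

```python
# Counts per feature code, copied from the table above.
COUNTS = {
    "X": 128662,
    "O": 41228,
    "FM": 34723,
    "W": 24425,
    "H": 14524,
    "F": 8708,
    "A": 5252,
    "T": 1259,
    "R": 237,
    "C": 62,
}

total = sum(COUNTS.values())
shares = {code: count / total for code, count in COUNTS.items()}

for code, share in shares.items():
    print(f"{code:>2} {share:6.1%}")
```

The total comes to 259,080 toponyms, of which the uncoded ‘X’ entries make up about 49.7%.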

DDOS part II

The free web services time out for a few seconds at precisely the top of every hour. The reason is a widget that calls the services timezoneJSON and findNearByWeatherJSON at exactly the full hour from a large number of IP addresses. The sudden spike in requests causes many other requests to time out. Around a year ago the free services suffered from a similar effect caused by an iPhone application that had become very popular and was using some GeoNames web services.

A few hours ago we changed the service to throw an exception, hoping that the developer of the widget will see that the application no longer works and change its behavior. It is not very sensible for a distributed application running on a huge number of clients to call the same server at the very same instant.
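One common way for a client to avoid this kind of synchronized spike is to add a random offset to its schedule. The sketch below is only a suggestion; the function name and the five-minute window are assumptions, not something the widget or GeoNames prescribes.

```python
import random


def jittered_delay(seconds_until_hour, max_jitter=300.0, rng=random):
    """Return how long to wait before the next hourly refresh.

    Instead of firing exactly at the full hour, each client waits an
    extra random amount between 0 and max_jitter seconds, so requests
    from many clients are spread over a window.
    """
    return seconds_until_hour + rng.uniform(0.0, max_jitter)


# Example: the full hour is 10 seconds away.
delay = jittered_delay(10.0, rng=random.Random(42))
```

With thousands of clients the requests then arrive spread over five minutes rather than within the same second.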

The exception is thrown on the domain ws.geonames.org for requests to the two JSON services when no username parameter is present. If you happen to be using the service, just add the parameter username=<your geonames username> to avoid the exception. Those using a ‘secret’ domain name are not affected. You can create an account here.
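A request URL with the username parameter could be built as follows. This sketch only constructs the URL and does not call the service; the helper name, the placeholder username ‘demo’ and the lat/lng parameters are assumptions for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://ws.geonames.org"


def build_url(service, username, **params):
    """Build a request URL that always carries the username parameter."""
    query = dict(params)
    query["username"] = username
    return f"{BASE_URL}/{service}?{urlencode(query)}"


# 'demo' is a placeholder; use your own GeoNames username.
url = build_url("timezoneJSON", "demo", lat=47.01, lng=10.2)
print(url)
```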

‘XK’ country code for Kosovo

Some of you have already noticed that we are now using ‘XK’ as a temporary country code for Kosovo. While the US standard ‘FIPS’ has assigned a country code to Kosovo (KV), the International Organization for Standardization, ISO, has yet to assign a code to the former Serbian province. The ISO 3166 country code standard reserves a number of unused codes for user-specific elements: “If users need code elements to represent country names not included in this part of ISO 3166, the series of letters AA, QM to QZ, XA to XZ, and ZZ, and the series AAA to AAZ, QMA to QZZ, XAA to XZZ, and ZZA to ZZZ respectively and the series of numbers 900 to 999 are available.”

The European Commission and many other organisations (Deutsche Bundesbank, Switzerland) are using ‘XK’ as a temporary country code for Kosovo until ISO officially assigns a code.

GeoNames will switch to the official ISO code as soon as it has been released. In the meantime we will use ‘XK’.
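The two-letter ranges quoted from ISO 3166 above can be checked with a few lines of code. This is a small sketch with a hypothetical function name; it covers only the alpha-2 series (AA, QM to QZ, XA to XZ, ZZ), not the three-letter or numeric ones.

```python
def is_user_assigned_alpha2(code):
    """Return True if code lies in the user-assigned alpha-2 ranges."""
    code = code.upper()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        return False
    if code in ("AA", "ZZ"):
        return True
    first, second = code
    if first == "Q":
        return "M" <= second <= "Z"  # QM to QZ
    return first == "X"              # XA to XZ


print(is_user_assigned_alpha2("XK"))
```

Applications that validate country codes against the official ISO list may need a check like this so that ‘XK’ is not rejected.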

Links for Toponyms

A new pseudo language code ‘link’ has been added to the alternate name edit function and the links to the English Wikipedia have been inserted as alternate names. Links to the corresponding Wikipedia articles have often been requested. While they were available on the forum, linked in some threads, they were not included in the normal dump. With this simple change they can now be included in the dump as alternate names and they can easily be maintained using the wiki interface. Other kinds of links, for example hotel websites for hotel entries, can be added in the same manner.

The language codes for the alternate names are normally the 2-character ISO 639 language codes; for more exotic languages that do not have a 2-character ISO code, the 3-character code is used instead.

Pseudo codes

  • ‘post’ for postal codes
  • ‘link’ for a link to a website
  • ‘iata’, ‘icao’ and ‘faac’ for the respective airport codes
  • ‘abbr’ for an abbreviation
  • ‘fr_1793’ for names used during the French Revolution
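When processing the alternate names, the language field therefore needs to be told apart from the pseudo codes. A minimal sketch, assuming the field has already been read from the dump as a plain string; the function name and the category labels are made up for this example.

```python
# Pseudo codes from the list above.
PSEUDO_CODES = {"post", "link", "iata", "icao", "faac", "abbr", "fr_1793"}


def classify_language_code(code):
    """Classify the language field of an alternate name."""
    if code in PSEUDO_CODES:
        return "pseudo"
    if code.isascii() and code.isalpha() and code.islower():
        if len(code) == 2:
            return "iso639-2char"
        if len(code) == 3:
            return "iso639-3char"
    return "unknown"


def wikipedia_links(alternate_names):
    """Pick the URLs out of (code, value) pairs tagged with 'link'."""
    return [value for code, value in alternate_names if code == "link"]
```

Note that the pseudo codes must be checked first: ‘post’, ‘link’ and the airport codes are four letters long, but a future three-letter pseudo code would otherwise be mistaken for a language.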

New York Times data API with GeoNames

The New York Times is adding GeoNames data to their subject headings and making it available as Linked Open Data under a cc-by license.

For ages the NYT has indexed articles with keywords (tags/subject headings) from an extensive vocabulary. Thousands of these keywords have now been mapped to their respective geonameId. This will help, for instance, to enhance the search function with additional information from GeoNames, such as lat/lng for a reverse geocoded article search. The vocabulary is available under a cc-by license in various ways and formats. You can download a huge file or you can browse individual entries on the NYT website, for example the keyword ‘Zurich’ in html or rdf format. The GeoNames data in the rdf format uses the GeoNames ontology. The subject headings can be used by developers to query the NYT API for articles on the topic.