Place Name Disambiguation
May 27, 2006
One of the most interesting parts of the 'natural language geocoder' is Place Name Disambiguation. Depending on the context, the grammatical structure, or the language, a term may have one of several possible meanings. Examples:
- Hayden: the CIA director Michael Hayden or the city in Idaho.
- Java: the island or the programming language.
- Brisbane: city in Australia or city in California, USA.
- Como: city in Italy or a very frequent word in Spanish.
We tackle this problem in several processing steps. First, we identify the language of the text and its context. For example, in an IT context the term 'Java' most likely stands for the programming language.
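To make the context step more concrete, here is a minimal sketch in Python. It is not the geocoder's actual code: the cue-word lists, the SENSE_CUES table and the guess_sense function are all made up for illustration, and a real system would presumably derive such cues from data rather than hard-code them. The idea is simply to pick the sense whose cue words overlap most with the surrounding text.

```python
import re

# Hypothetical cue words per sense (assumption, for illustration only).
SENSE_CUES = {
    "java": {
        "programming language": {"code", "compiler", "class", "software", "jvm"},
        "island": {"indonesia", "jakarta", "volcano", "island", "coast"},
    },
}


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def guess_sense(term, text):
    """Return the sense whose cue words overlap most with the text.

    Returns None if the term is unknown or no cue word occurs at all.
    """
    senses = SENSE_CUES.get(term.lower())
    if not senses:
        return None
    words = set(tokenize(text))
    # Score each sense by the number of its cue words found in the text.
    scores = {sense: len(cues & words) for sense, cues in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_sense("Java", "The Java compiler turns each class into bytecode."))
print(guess_sense("Java", "We flew to Jakarta and toured a volcano on Java."))
```

With these made-up cue lists, the first sentence is resolved to the programming language and the second to the island. A text without any cue word yields None, which leaves the decision to the later steps.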
Then we try to find person names and do some simple grammatical analysis using co-occurrences of left and right neighbours. For example, if the term we are looking at is preceded by the expression 'south of', we can be nearly certain that it has a geographical meaning.
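The neighbour check could be sketched roughly as follows. Again, this is only an illustration under assumptions: the cue lists and the function names (left_context, classify) are hypothetical, only left neighbours are checked, and only the first occurrence of the term is considered. A fuller version would also look at right neighbours and weigh the cues by how often they co-occur with each meaning.

```python
import re

# Hypothetical left-neighbour cues, stored as word tuples (assumption).
GEO_LEFT_CUES = [("south", "of"), ("north", "of"), ("east", "of"),
                 ("west", "of"), ("near",)]
PERSON_LEFT_CUES = [("director",), ("president",), ("mr",), ("mrs",)]


def left_context(text, term, width=2):
    """Return up to `width` lowercase words preceding the first occurrence
    of `term`, or None if the term does not occur in the text."""
    words = re.findall(r"\w+", text.lower())
    try:
        i = words.index(term.lower())
    except ValueError:
        return None
    return words[max(0, i - width):i]


def ends_with(words, cue):
    """True if the word list ends with exactly the words in `cue`."""
    return len(words) >= len(cue) and tuple(words[-len(cue):]) == cue


def classify(text, term):
    """Classify a term as 'place', 'person' or 'unknown' from its left neighbours."""
    left = left_context(text, term)
    if left is None:
        return "unknown"
    if any(ends_with(left, cue) for cue in GEO_LEFT_CUES):
        return "place"
    if any(ends_with(left, cue) for cue in PERSON_LEFT_CUES):
        return "person"
    return "unknown"


print(classify("They rented a cabin south of Hayden.", "Hayden"))
print(classify("CIA director Hayden testified today.", "Hayden"))
```

Comparing whole words rather than raw substrings avoids false hits such as 'linear' matching the cue 'near'.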