We have updated nearly 60’000 populated places with their Cyrillic name variants, thanks to Валерий Хронусов (Valery Hronusov, Ph.D.), who provided a Russian dataset of over 70’000 populated places. To match the existing GeoNames database against the Russian dataset, we used three criteria: the feature class, the location (geographical distance), and the similarity between the transcribed Cyrillic name and the existing international (English) name.
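As a rough illustration of the first two criteria, the sketch below preselects match candidates by feature class and great-circle distance. The record layout (`feature_class`, `lat`, `lon`), the function names and the 10 km threshold are assumptions made for this example, not the actual matching code or its parameters.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def candidates(record, places, max_km=10.0):
    """Existing places with the same feature class within max_km of the record.

    max_km is a hypothetical threshold chosen for this sketch.
    """
    return [
        p for p in places
        if p["feature_class"] == record["feature_class"]
        and haversine_km(record["lat"], record["lon"], p["lat"], p["lon"]) <= max_km
    ]
```

Only the candidates that survive this filter would then need the more expensive name comparison.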
To measure name similarity we used the Levenshtein distance (edit distance) and the letter-pair similarity. Cyrillic names were transcribed into Latin script using the GOST system; for example, the Cyrillic place name Логиновка becomes Loginovka.
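The following is a minimal sketch of these three building blocks. Two things are assumptions here: the transliteration table is a deliberately partial one that covers only letters with uncontroversial Latin equivalents (it is not the full GOST table), and the letter-pair similarity is implemented as the Dice coefficient over adjacent letter pairs, which is one common definition and may differ from the exact formula we used.

```python
from collections import Counter

# Partial, illustrative table; letters not listed pass through unchanged.
TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "з": "z",
    "и": "i", "к": "k", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p",
    "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f",
}


def transliterate(text):
    """Replace each Cyrillic letter found in TABLE, preserving capitalisation."""
    out = []
    for ch in text:
        latin = TABLE.get(ch.lower())
        if latin is None:
            out.append(ch)
        elif ch.isupper():
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def letter_pairs(text):
    """Multiset of adjacent letter pairs, taken word by word, case-insensitive."""
    pairs = Counter()
    for word in text.upper().split():
        for i in range(len(word) - 1):
            pairs[word[i:i + 2]] += 1
    return pairs


def letter_pair_similarity(a, b):
    """Dice coefficient over letter pairs: 0.0 (nothing shared) to 1.0 (identical)."""
    pa, pb = letter_pairs(a), letter_pairs(b)
    total = sum(pa.values()) + sum(pb.values())
    if total == 0:
        return 0.0
    return 2.0 * sum((pa & pb).values()) / total


print(transliterate("Логиновка"))             # Loginovka
print(levenshtein("Loginovka", "Loginowka"))  # 1
```

The two measures complement each other: the edit distance counts individual character changes, while the letter-pair score is less sensitive to where in the name a difference occurs.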
The same transcription is also used by the GeoNames search engine. This means a search for a Cyrillic place name can return the correct place even if the Cyrillic name is not yet included as an alternate name in the GeoNames database. For example, a query for Логиновка returns not only the two cities in the Omskaya and Saratovskaya provinces but also the city in Bashkortostan, even though we don’t yet have the Cyrillic name for the latter.
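The idea can be sketched with a toy in-memory lookup that tries both the original query and its transliterated form. This is not the search engine's real implementation: the `PLACES` records, the `search` function and the partial character table are made up for the example, and only the three regions are taken from the case described above.

```python
# Partial, illustrative Cyrillic-to-Latin mapping (not the full GOST table).
TO_LATIN = str.maketrans("абвгдезиклмнопрстуф", "abvgdeziklmnoprstuf")

# Hypothetical records: the Bashkortostan entry has no Cyrillic alternate name yet.
PLACES = [
    {"name": "Loginovka", "region": "Omskaya", "alternate_names": ["Логиновка"]},
    {"name": "Loginovka", "region": "Saratovskaya", "alternate_names": ["Логиновка"]},
    {"name": "Loginovka", "region": "Bashkortostan", "alternate_names": []},
]


def search(query, places):
    """Match the query, or its transliteration, against names and alternate names."""
    q = query.casefold()
    q_latin = q.translate(TO_LATIN)
    hits = []
    for place in places:
        known = {n.casefold() for n in [place["name"], *place["alternate_names"]]}
        if q in known or q_latin in known:
            hits.append(place)
    return hits


for place in search("Логиновка", PLACES):
    print(place["name"], place["region"])
```

The first two records match through their stored Cyrillic alternate name, while the third matches only because the transliterated query equals its Latin name.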