alternate names guidelines
marc



Joined: 08/12/2005 07:39:47
Messages: 4501
Offline

I have noticed that some users are adding a lot of more or less identical variations of the same name (with or without a dash, a version with 'Saint' and one with 'St', etc.).
The search engine is able to deal with these differences, so it is not necessary to add all possible variants as alternate names.

I have also updated the documentation accordingly:


http://www.geonames.org/manual.html#alternateNames

Alternate names are used to store and display name variants. In the screenshot below we add the Italian name 'San Gallo' to the first order division 'Sankt Gallen'. We set the 'is short name' flag to indicate that this is a short name for the long name 'Cantone di San Gallo'.

Guidelines:

* the main name for the toponym should be a widely accepted international or English name. Local languages are entered as alternate names.
* the flag 'isPreferred' helps distinguish between several alternate names in the same language. It marks the most commonly used name.
* proper casing should be used. Upper case only for the first character of a term or for abbreviations.
* avoid redundancy. Do not add the same name once with 'minus' and once with blanks between terms; only add the more commonly used spelling variant in this case. (Ex: "La Colle sur Loup" for "La Colle-sur-Loup": it is sufficient to add "La Colle-sur-Loup", as the GeoNames search engine can handle searches with or without minus.)
* use full spelling for 'Sankt' (German), 'Saint' and 'Sainte' (French). The respective abbreviations St. and Ste. are automatically handled by the search engine.  
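
The last two guidelines rest on the search side treating these spellings as equivalent. How the GeoNames search engine actually does this is not described in this thread, so the following is only a minimal sketch of the idea. The helper `normalize` and the `ABBREVIATIONS` table are hypothetical names, and the table covers just the French examples from the guidelines ('St.' can also stand for the German 'Sankt', which this sketch does not try to resolve).

```python
import re

# Assumed abbreviation table, taken only from the examples in the guidelines.
# 'St.' is mapped to the French 'Saint' here; a real system would need
# language context to choose between 'Saint' and 'Sankt'.
ABBREVIATIONS = {"st": "saint", "ste": "sainte"}


def normalize(name: str) -> str:
    """Reduce a place name to a comparison key (hypothetical helper)."""
    # Lowercase, then treat hyphens, periods and whitespace runs as separators.
    tokens = re.split(r"[\s.\-]+", name.lower())
    # Expand known abbreviations token by token and drop empty tokens.
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens if tok)
```

With a key like this, "La Colle sur Loup" and "La Colle-sur-Loup" compare equal, as do "Ste. Maxime" and "Sainte-Maxime", so only one spelling needs to be stored.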

david masclet



Joined: 26/11/2007 11:49:58
Messages: 67
Offline

marc wrote:
The search engine is able to deal with these differences, so it is not necessary to add all possible variants as alternate names.
 


I agree, but alternate names may also be used for purposes other than a search engine with such capabilities.

I think that data must not be designed to suit a particular application.

Do you agree with this?

David
jordanreiter



Joined: 02/06/2010 19:59:56
Messages: 3
Offline

marc wrote:
* use full spelling for 'Sankt' (German), 'Saint' and 'Sainte' (French). The respective abbreviations St. and Ste. are automatically handled by the search engine.  


I'd somewhat disagree with this one. It's very straightforward to handle things like alternate punctuation programmatically. It's less straightforward to handle abbreviations, although I can see the hassle in having thousands of additional records for St. and Ste.
alextorrenegra



Joined: 08/09/2008 19:48:14
Messages: 13
Offline

Thank you, Marc. I agree with David regarding both the usage of abbreviations and the usage of dashes. I think they should be allowed as long as said names are commonly used.

The data, hopefully, should be useful regardless of the application. Otherwise every application that uses the GeoNames data would have to be aware of those rules and be able to handle those exceptions.


Alexander Torrenegra
http://letmego.com
marc



Joined: 08/12/2005 07:39:47
Messages: 4501
Offline

I agree that data must not be designed to suit a particular application, and this is exactly the reason why it would be wrong to start adding all possible permutations of case, punctuation, white space etc. The data in the database should follow some consistency rules. There is no increase in information if the variant without dashes is added explicitly (or if all-lowercase, all-uppercase or all possible mixed-case combinations are added as alternate names). The only reasons I could imagine that would speak for including dashes would be the existence of valid alternate names that have dashes at different positions, or a place name with dashes that can under no circumstances be written without them. I think names either have dashes at well-defined positions or no dashes at all.
Adding all permutations to the raw data really does not make sense. If there is a demand for this, then it would be much easier and more straightforward to write a script to generate an additional data file with all these redundancies. Maintaining these redundancies manually in a database really is not necessary and wouldn't help anybody. If all permutations were added as alternate names, somebody looking for some consistency would have to write code to find the names with dashes etc. It is far easier to just remove the dashes than to loop over all alternate names to find the one with the highest number of dashes.
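
A script of the kind mentioned above could look roughly like the following. This is only a sketch under assumptions: `derived_variants` is a hypothetical helper, not part of any GeoNames tooling, and it covers just the dash and case variants discussed in this thread.

```python
def derived_variants(name: str) -> list[str]:
    """Derive redundant spellings from one canonical name (hypothetical helper)."""
    # Start from the canonical spelling and the same name without dashes.
    spellings = [name, name.replace("-", " ")]
    variants: list[str] = []
    for spelling in spellings:
        # Add the original, all-lowercase and all-uppercase forms.
        for candidate in (spelling, spelling.lower(), spelling.upper()):
            if candidate not in variants:  # keep order, skip duplicates
                variants.append(candidate)
    return variants


if __name__ == "__main__":
    # One output line per variant, e.g. for writing an additional data file.
    for variant in derived_variants("La Colle-sur-Loup"):
        print(variant)
```

Going this way is a pure function of the stored name. Going the other way means guessing which of many stored variants is the canonical one.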

Marc

geotree


Joined: 23/07/2007 18:28:40
Messages: 138
Location: France
Offline

I do agree with Marc. There is no added value in storing many variants in the database (upper/lowercase, with/without dashes, with/without accents, etc.).

This is already handled perfectly well by the search engine and the web services:
view-source:http://ws.geonames.org/searchJSON?formatted=true&q=st%20etienne
returns: Saint-Étienne
etc...
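
For anyone who wants to repeat this kind of check from a script, here is a small sketch that only builds the request URL. It uses the endpoint and the two parameters exactly as quoted above; `search_url` is a hypothetical helper, and whether the service still answers at this address or needs further parameters is an assumption that was not verified here.

```python
from urllib.parse import quote, urlencode

# Endpoint as quoted in this post; it may have changed since.
BASE_URL = "http://ws.geonames.org/searchJSON"


def search_url(query: str) -> str:
    """Build a search URL for a free-text place query (hypothetical helper)."""
    params = {"formatted": "true", "q": query}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(search_url("st etienne"))
```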

I have started to delete those that were added for French places.

Christophe
geotree.geonames.org
geotree.geonames.org/geotree.html