San Francisco, Cuba
rlevering



Joined: 20/07/2009 20:00:19
Messages: 26
Offline

Try searching for San Francisco, Cuba and tell me what in the world went on there. Did some input script mess up or was the original data that dirty?
rlevering

On this note, I just ran a duplicate detector on the DB, where I defined a duplicate to be any pair of entries with the same name, the same feature code, and the same hierarchical breakdown (country, adm1, adm2). There are a very large number of duplicates in the database. I'm sure some of these may actually be different places, but the large majority that I saw were definitely import errors. Is there a strategy to handle these?
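
For anyone who wants to reproduce a check like this, here is a minimal sketch of the duplicate definition described above. It is not the script that was actually run; the function name and the record field names (`id`, `name`, `feature_code`, `country`, `adm1`, `adm2`) are assumptions chosen for illustration, and the records are assumed to be plain dictionaries already loaded from wherever the data lives.

```python
from collections import defaultdict


def find_duplicate_groups(records):
    """Group record ids that share name, feature code, and country/adm1/adm2.

    `records` is an iterable of dicts with the (hypothetical) keys
    id, name, feature_code, country, adm1, adm2.
    Returns a dict mapping each shared key to the list of ids,
    keeping only keys that occur more than once.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (
            rec["name"],
            rec["feature_code"],
            rec["country"],
            rec["adm1"],
            rec["adm2"],
        )
        groups[key].append(rec["id"])
    # Only report candidates; nothing is deleted or merged here.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

The function only reports candidate groups. Entries that match on all five fields can still be different places, so the output is meant for manual review rather than automatic deletion.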
marc



Joined: 08/12/2005 07:39:47
Messages: 4501
Offline

This, for a change, really looks like the same toponym. The toponym refers to an area, and there are a lot of markers covering the entire area. I could imagine that one of the input sources aggregated small subsets (like maps), and the toponym appeared once on each map and so ended up in the database n times. Please feel free to clean it up, but be careful with automated scripts. There are a couple of threads where users complain about duplicates, but usually it is not at all clear whether they really are duplicates, and often they clearly are not. There is simply no law that makes place names unique, even though it would make life easier for application developers.

Best

Marc
