| Author |
Message |
18/11/2011 22:08:05
|
shukuboy
Joined: 15/11/2011 22:27:32
Messages: 2
Offline
|
Hi,
I've been analysing my data and have found some extreme cases of repetition. For example, Krajan in Indonesia has 55 records in cities1000, all almost identical except for the coordinates, which vary slightly. All of these records are marked as PPLA4.
In total there are around 2000 cities with a duplicated combination of ascii_name, admin1 code and country code, with anywhere from 2 to 55 repetitions each.
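For anyone who wants to reproduce this kind of count, here is a minimal Python sketch. It assumes the tab-separated GeoNames dump layout with asciiname, latitude, longitude, country code and admin1 code in columns 2, 4, 5, 8 and 10 (zero-based); check that against the readme of your download. The helper names (`find_repeated_names`, `read_geonames`, `make_row`) and the sample rows are made up for illustration and are not real GeoNames records.

```python
import csv
from collections import defaultdict

# Zero-based column positions in the tab-separated dump (assumed layout).
ASCIINAME, LATITUDE, LONGITUDE, COUNTRY, ADMIN1 = 2, 4, 5, 8, 10


def find_repeated_names(rows):
    """Group rows by (asciiname, admin1 code, country code).

    Returns only the groups that contain more than one record, mapping
    each key to the list of (latitude, longitude) pairs found for it.
    """
    groups = defaultdict(list)
    for row in rows:
        key = (row[ASCIINAME], row[ADMIN1], row[COUNTRY])
        groups[key].append((float(row[LATITUDE]), float(row[LONGITUDE])))
    return {key: coords for key, coords in groups.items() if len(coords) > 1}


def read_geonames(path):
    """Yield rows from a dump file such as cities1000.txt."""
    with open(path, encoding="utf-8", newline="") as handle:
        yield from csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def make_row(asciiname, lat, lon, country, admin1):
    """Build a 19-column sample row (hypothetical data, for the demo only)."""
    row = [""] * 19
    row[ASCIINAME], row[LATITUDE], row[LONGITUDE] = asciiname, lat, lon
    row[COUNTRY], row[ADMIN1] = country, admin1
    return row


sample = [
    make_row("Sampletown", "1.00", "2.00", "ZZ", "01"),
    make_row("Sampletown", "1.05", "2.03", "ZZ", "01"),
    make_row("Othertown", "3.00", "4.00", "ZZ", "01"),
]
repeated = find_repeated_names(sample)
for key, coords in repeated.items():
    print(key, len(coords), coords)
```

On a real file, `find_repeated_names(read_geonames("cities1000.txt"))` gives the same kind of mapping. Keeping the coordinates of each group makes it easier to judge afterwards whether the records are really the same place.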
What's the best way of dealing with such instances? Is there a procedure for reporting them, or do we have to remove them manually?
Cheers,
Shuku
|
|
|
 |
18/12/2011 06:13:52
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4501
Offline
|
Hi Shuku
An identical name does not by itself mean that records are duplicates. As you mention, the coordinates are different, so the records refer to different locations.
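One way to check this for a given pair of records is to compute the distance between their coordinates. The sketch below uses the haversine formula in Python; the 0.5 km threshold and the function names are assumptions chosen for illustration, not a GeoNames rule.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_probable_duplicate(point_a, point_b, threshold_km=0.5):
    """Treat two same-named records as one place only if they are very close.

    threshold_km is an arbitrary, hypothetical cut-off; tune it for your data.
    """
    return haversine_km(*point_a, *point_b) <= threshold_km


print(is_probable_duplicate((0.0, 0.0), (0.0, 0.001)))  # about 0.11 km apart
print(is_probable_duplicate((0.0, 0.0), (0.0, 0.1)))    # about 11 km apart
```

Same-named records that are kilometres apart are most likely separate villages, which matches the point above.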
Best
Marc
|
 |
|
|
 |
30/10/2012 07:59:34
|
Backslider
Joined: 26/10/2012 02:22:56
Messages: 5
Offline
|
The only solution I have found is to use 'GROUP BY asciiname' in your query.
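A rough sketch of what that looks like, using Python's built-in sqlite3 module with an in-memory table. The table name `geoname`, its column names and the sample rows are assumptions for illustration only. Note that grouping on asciiname alone also merges same-named places in different countries or regions, so the second query groups on the name plus country and admin1 code instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE geoname ("
    "asciiname TEXT, country_code TEXT, admin1_code TEXT, "
    "latitude REAL, longitude REAL)"
)
# Hypothetical sample rows, not real GeoNames records.
conn.executemany(
    "INSERT INTO geoname VALUES (?, ?, ?, ?, ?)",
    [
        ("Sampletown", "ZZ", "01", 1.00, 2.00),
        ("Sampletown", "ZZ", "01", 1.05, 2.03),
        ("Sampletown", "YY", "07", 50.0, 8.0),
        ("Othertown", "ZZ", "01", 3.00, 4.00),
    ],
)

# One row per name: collapses every record that shares an asciiname.
by_name = conn.execute(
    "SELECT asciiname, COUNT(*) FROM geoname "
    "GROUP BY asciiname ORDER BY asciiname"
).fetchall()

# One row per name within a country and admin1 region.
by_key = conn.execute(
    "SELECT asciiname, country_code, admin1_code, COUNT(*) FROM geoname "
    "GROUP BY asciiname, country_code, admin1_code "
    "ORDER BY asciiname, country_code"
).fetchall()

print(by_name)
print(by_key)
```

Either way, the grouping discards the individual coordinates, so it hides the repeated names rather than deciding which record is the right one.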
There is lots of nice data, but also lots of duplicates, and little thought seems to have gone into it. For example, if I try to look up cities and towns for the Australian Capital Territory, I get all the SUBURBS of Canberra, which is next to useless for practical applications.
I should be able to get just Canberra and Hall.
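A possible workaround is to filter on the feature code column. GeoNames defines PPLX as "section of populated place", so dropping that code removes suburbs wherever they are tagged that way. The Python sketch below assumes exactly that; the record layout, the function name `towns_only`, the placeholder suburb names and the feature codes assigned to the sample rows are all illustrative assumptions, not actual GeoNames data, and suburbs tagged as plain PPL would still get through.

```python
# Feature codes to leave out; PPLX is "section of populated place".
EXCLUDED_FEATURE_CODES = {"PPLX"}


def towns_only(records):
    """Return the names of records whose feature code is not excluded."""
    return [
        record["asciiname"]
        for record in records
        if record["feature_code"] not in EXCLUDED_FEATURE_CODES
    ]


# Illustrative rows only; the feature codes here are assumed, not looked up.
sample_records = [
    {"asciiname": "Canberra", "feature_code": "PPLC"},
    {"asciiname": "Suburb A", "feature_code": "PPLX"},
    {"asciiname": "Hall", "feature_code": "PPL"},
    {"asciiname": "Suburb B", "feature_code": "PPLX"},
]

print(towns_only(sample_records))
```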
|
|
|
 |
|
|