GeoNames Home | Postal Codes | Download / Webservice | About 

GeoNames Forum
  [Search] Search   [Recent Topics] Recent Topics   [Groups] Back to home page 
[Register] Register / 
[Login] Login 
ADM Duplicates  XML
Forum Index -> Administrative Divisions
Author Message
nicolas.helleringer



Joined: 11/08/2009 13:48:39
Messages: 6
Offline

I have been manipulating geoname administrative data for a few days/weeks.

I did found some duplicates :

select country, admin1, count(*) from geoname where fclass='A' and fcode='ADM1' and admin1 is not null group by country, admin1 having count(*) > 1 ;
country | admin1 | count
---------+--------+-------
GE | 00 | 2
IQ | 00 | 4
KH | 00 | 3
KI | 00 | 2
KZ | 00 | 5
LV | 00 | 4
MD | 00 | 17
ME | 00 | 19
MN | 00 | 2
NG | 00 | 7
NP | 00 | 14
PR | 00 | 13
PW | 00 | 11
SB | 00 | 7
ST | 00 | 2
TJ | 00 | 2
TL | 00 | 13
UG | 00 | 8
(18 rows)

select country, admin1, admin2, count(*) from geoname where fclass='A' and fcode='ADM2' and admin2 is not null group by country, admin1, admin2 having count(*) > 1 ;
country | admin1 | admin2 | count
---------+--------+--------+-------
GB | ENG | 00 | 2
GB | ENG | F2 | 2
(2 rows)

If I understand well '00' code is a non valid one : should it not be NULL as we do not have the information ?

Here is a true duplicate :

select * from geoname where fclass='A' and fcode='ADM2' and country='GB' and admin1='ENG' and admin2='F2';

geonameid | name | asciiname | alternatenames | latitude | longitude | fclass | fcode | country | cc2 | admin1 | a
dmin2 | admin3 | admin4 | population | elevation | gtopo30 | timezone | moddate | geom
-----------+---------------------+---------------------+--------------------------------------------------------------------------------------------------------------+------------+------------+--------+-------+---------+-----+--------+--
------+--------+--------+------------+-----------+---------+---------------+------------+----------------------------------------------------
2647554 | Hampshire | Hampshire | Hampshire,Hampshire County,Hants,Khehmpshir,Suthhamtunscir,Sūþhāmtūnscīr,Хэмпшир,ハンプシャー,ハンプシャー州 | 51 | -1.25 | A | ADM2 | GB | GB | ENG
| F2 | | | 1691000 | | 77 | Europe/London | 2008-07-13 | 0101000020E6100000000000000000F4BF0000000000804940
2647555 | County of Hampshire | County of Hampshire | Hampshire | 51.0833333 | -1.1666667 | A | ADM2 | GB | GB | ENG | F
2 | | | 0 | | 77 | Europe/London | 2008-01-16 | 0101000020E6100000DE509DB3AAAAF2BF791563AAAA8A4940
(2 rows)

Some more on 3rd level :

select country, admin1, admin2, admin3, count(*) from geoname
where fclass='A' and fcode='ADM3' and admin3 is not null group by country, admin1, admin2, admin3 having count(*) > 1 ;
country | admin1 | admin2 | admin3 | count
---------+--------+--------+--------+-------
DE | 14 | | 15171 | 2
ES | 31 | MU | 30030 | 2
ES | 34 | O | 33008 | 2
ES | 34 | O | 33032 | 2
ES | 34 | O | 33049 | 2
ES | 58 | OR | 32068 | 2
ES | 59 | BI | 48086 | 2
NO | 12 | 0301 | 13 | 2
(8 rows)

On level 4 I did not fin any.
marc



Joined: 08/12/2005 07:39:47
Messages: 4416
Offline

Hi Nicolas

Do you also have a solution for the duplicates?

Best

Marc

[WWW]
nicolas.helleringer



Joined: 11/08/2009 13:48:39
Messages: 6
Offline

Uppon import in a SGBD i did update all the records with '00' to revert them to NULL

For true duplicates i dropped the oldest record or the lowest geonameid if there were updated on the same date.

As the duplicates are from regions I do not know I do not have better criterias alas
 
Forum Index -> Administrative Divisions
Go to:   
Powered by JForum 2.1.5 © JForum Team