Author |
Message |
20/12/2007 15:21:08
|
junto
Joined: 20/12/2007 14:56:01
Messages: 3
Offline
|
I have imported the Geonames data and the Admin1Codes. I notice that the Admin1Codes are prefixed with the 2 letter country code first, hence 'GB.Z1', whilst in the Geonames data the Admin1Code column is not prefixed "Z1".
In this case, it becomes rather tricky to join one table to the other. Ideally, you would want:
SELECT Geonames[Name], Admin1Codes.[AsciiName] FROM GeoNames JOIN Admin1Codes ON Admin1Codes.[Name] = Geonames.[Admin1Code] WHERE GeoNames.CountryCode = 'GB'
However, you can't do the join like this because one table contains simply "Z1" and the other "GB.Z1".
Was there any particular reason for this, or am I missing the point? How can I return the actual name of the administrative division in a query joined to the geoname data?
Regards
Ben
UPDATE: You can of course do this:
SELECT * FROM GeoNames JOIN Admin1Codes ON Admin1Codes.[Name] = GeoNames.[CountryCode] + '.' + GeoNames.[Admin1Code] AND GeoNames.[CountryCode] = 'GB'
But, uuuurgh!
|
|
|
21/12/2007 14:31:13
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4412
Offline
|
Hi Ben
We are not using a table 'admincodes' at all. The information is redundant to the geoname table and only included in the dump because users have been asking for it.
Why don't you just add a new column or update the existing column to fit your needs?
In the admincodes.txt file we have to include the country to make the entries unique and in the geoname table we don't want to keep the concatenation of countrycodes and admincodes to make it easier to modify them and work with them.
Cheers
Marc
|
|
|
|
21/12/2007 20:11:43
|
junto
Joined: 20/12/2007 14:56:01
Messages: 3
Offline
|
marc wrote:
Hi Ben
We are not using a table 'admincodes' at all. The information is redundant to the geoname table and only included in the dump because users have been asking for it.
Why don't you just add a new column or update the existing column to fit your needs?
In the admincodes.txt file we have to include the country to make the entries unique and in the geoname table we don't want to keep the concatenation of countrycodes and admincodes to make it easier to modify them and work with them.
Cheers
Marc
Thanks Marc,
Hence, I assume I can get the regions via the Admin1Code column in the Geonames table with a query such as:
CountryCode = 'GB' AND FeatureClass='A' and FeatureCode='ADM1'
Is that right?
Regards
Ben
|
|
|
22/12/2007 12:04:41
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4412
Offline
|
junto wrote:
Hence, I assume I can get the regions via the Admin1Code column in the Geonames table with a query such as:
CountryCode = 'GB' AND FeatureClass='A' and FeatureCode='ADM1'
Is that right?
Exactly. This is the idea of the ADM1 feature code.
Cheers
Marc
|
|
|
|
09/02/2008 05:51:47
|
mikehenderson
Joined: 22/01/2008 04:13:47
Messages: 3
Location: California
Offline
|
I'm trying to structure the geoname data thus:
continent > country > ADM1 > ADM2 > ADM3 > location
I'm bumping into wrinkles in the data integrity of the date sourced from geonames:
I'm stalled on the following:
select * from geoname where country = 'YE' and fcode='ADM1' and admin1 = '00'
yields 4 rows. Shouldn't the admin1 codes be unique for all ADM1 rows in the same country?
Which row is the correct row?
|
|
|
09/02/2008 07:20:11
|
mikehenderson
Joined: 22/01/2008 04:13:47
Messages: 3
Location: California
Offline
|
select * from geoname where country = 'UG' and admin1 = '00' and fcode = 'ADM1'
yields 8 rows, with very different names, I assume that they are all top level administrative districts in the same country.Shouldn't they have distinct codes?
|
|
|
09/02/2008 12:13:37
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4412
Offline
|
Hi Mike
Yes, they should have distinct codes. Would you like to help us fix the problem?
Marc
|
|
|
|
09/02/2008 19:46:13
|
mikehenderson
Joined: 22/01/2008 04:13:47
Messages: 3
Location: California
Offline
|
Nest I can do right now is find out how big the problem is, this sql:
select
country, admin1, count(*)
from
geoname
where
fclass = 'A' and
fcode ='ADM1'
group by
country,
admin1
having count(*) > 1
gives these results
BT 15 2
HU 08 2
HU 16 2
KH 00 3
KH 16 2
KI 00 2
KZ 00 5
LV 00 4
MD 00 17
ME 00 19
MN 00 2
NG 00 7
NP 00 14
PH B2 2
PH C8 2
PH E1 2
PL 00 7
RU 90 2
SB 00 7
SI 00 5
ST 00 2
TJ 00 2
TL 00 13
UG 00 8
YE 00 4
I added an index on the geoname columns: country, fcode, fclass, admin1, admin2, admin3 so the query did not take too long.
The last column is the number of duplicates of the admin1 code in each country.
The best fix I can come up with is for the '00' admin1 codes is to try and write a statement to set them to unique values.
However there is no automatic fix for any ADM2 or ADM3 or place rows that have been put into a duplicated admin1 sub-division these would have to be repaired by hand.
The best fix is to prevent this data getting into the table perhaps a unique index on name, country, fcode, fclass, fcode, admin1, admin2, admin3 would work?
|
|
|
10/02/2008 11:01:38
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4412
Offline
|
Mike
The code '00' stands for 'we don't know the official code'. If we set it to some random number we don't gain anything, at the contrary we destroy the information that there is some thing to be fixed.
The code we are looking is for is the FIPS code and if there is not FIPS code we might use the ISO 3166-2 code.
What you can do is :
1. What are the first order administrative divisions of the country.
2. Do they match with what GeoNames has in the database.
3. What codes do exist?
4. Send us a list with the geonamesid and the fipscode (or ISO code)
You are free to delete the not unique administrative divisions from your database.
We prefer to have missing admin codes instead of missing administrative divisions in our database. It is not perfect but it is better than nothing.
Marc
|
|
|
|
|