Author |
Message |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 20/03/2012 22:39:12
|
sabv
Joined: 10/11/2011 05:35:19
Messages: 4
Offline
|
Hi,
I am trying to extract a list of cities from geonames main table. I can not find a way to have consistent results which will give me real cities or villages as opposed to "populated places".
I tried query with feature_class='P' and feature_code = 'PPL' but this list has more than only cities and villages. It also contains small localities within the city.
I also tried cities1000.txt download but that file does not contain all cities with 1000 or more people. For example, it does not contain Richmond Hill in Ontario which is a city of more than 18,000 people.
Is there a reliable way to get list of cities from geonames? We are simply trying to create a drop down of cities based on province or state selected.
Thanks for your help.
|
|
 |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 21/03/2012 06:57:36
|
Thomas Gutzmann
Joined: 19/03/2012 15:23:39
Messages: 3
Offline
|
Have a look at
http://www.geonames.org/export/codes.html
and scroll down to "P city, village,...".
You can identify cities by feature class "P", feature code "PPLS". You can use population to exclude small villages.
Cheers,
Thomas
|
|
 |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 22/03/2012 07:23:31
|
sabv
Joined: 10/11/2011 05:35:19
Messages: 4
Offline
|
Thanks,
I had examined the codes before but it seems that the population data is not populated for all places. I get only 3 records from geonames when I use the condition
... where feature_class = 'P' and feature_code='PPLS'
when I use feature_class='P' and feature_code='PPL', it includes small localities within the city as well.
In many cases, population column is blank.
|
|
 |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 29/03/2012 06:13:47
|
Thomas Gutzmann
Joined: 19/03/2012 15:23:39
Messages: 3
Offline
|
Hi,
It looks as if the answer is not a simple SQL statement.
Let's have a look:
I created a table "geo_all_cities" from all entries of "all countries" with feature_class = 'A' or 'P'.
Looking for Richmond Hill in Canada, I get:
Code:
select geonameid, name, feature_code, population
from geo_all_cities
where upper(name) like '%RICHMOND%' and
upper(name) like '%HILL%'
and country_code = 'CA';
GEONAMEID NAME FEATUR POPULATION
--------- -------------------- ------ ------------
6122091 Richmond Hill PPL 185,541
For my birth place in Germany it's different:
Code:
select geonameid, name, feature_code, population, substr(alternatenames, 1, 20) as alternate, modification_date as mdate
from geo_all_cities
where upper(name) like '%LÜBECK%'
and country_code = 'DE';
GEONAMEID NAME FEATUR POPULATION ALTERNATE MDATE
--------- ------------------------------ ------ ------------ -------------------- ----------
3249071 Kreisfreie Stadt Lübeck ADM3 209,818 2010-11-24
7602559 Lübeck, Hansestadt ADM4 209,818 2010-11-24
2875601 Lübeck PPLA3 212,207 Limpek,Ljubek,Lubeca 2011-04-27
2957026 Alt Lübeck PPLQ 0 2011-04-26
7778797 Lübeck St. Gertrud PPLX 0 2012-03-09
In this case, the third entry with feature_code "PPLA3" is correct, but the first and second would do as well, though the names are the administrative versions which are not used normally.
Now let's look for Berlin in Germany. I know that there are two locations called "Berlin", one is the capital, the other a very small village. The villagers enjoyed themselves by giving their streets the same names as very popular streets and places in the capital, like "Kurfürstendamm" and "Potsdamer Platz"
I'm using "like" here instead of a regular expression, so I have deleted some result rows manually.
Code:
select geonameid, name, feature_code, population, substr(alternatenames, 1, 20) as alternate,
modification_date as mdate, latitude, longitude
from geo_all_cities
where name like '%Berlin%'
and country_code = 'DE';
GEONAMEID NAME FEATUR POPULATION ALTERNATE MDATE LATITUDE LONGITUDE
--------- ------------------------------ ------ ------------ -------------------- ---------- -------------------------------- --------------------------------
2950157 Land Berlin ADM1 3,442,675 Baarliin,Baelliin,Ba 2012-01-16 52.5 13.41667
6547383 Berlin, Stadt ADM3 3,442,675 2010-11-24 52.5233 13.41377
6551168 Bernau bei Berlin, Stadt ADM4 36,154 2010-11-24 52.6821 13.5965
6547539 Berlin, Stadt ADM4 3,442,675 2010-11-24 52.5233 13.41377
6547610 Neuenhagen bei Berlin ADM4 16,755 2010-11-24 52.5333 13.6833
6547614 Rüdersdorf bei Berlin ADM4 15,344 2010-11-24 52.4707 13.7736
6547647 Schöneiche bei Berlin ADM4 12,111 2010-11-24 52.4667 13.6833
6944048 Berlin-Neuköln ADMD 0 2009-07-30 52.4808 13.425
2950096 Bernau bei Berlin PPL 34,866 Bernau,Бернау 2010-01-29 52.67982 13.58708
2950142 Berliner Vorstadt PPL 0 2011-04-12 52.4119 13.08317
2950158 Berlin PPL 0 Berlin 2012-01-18 54.03333 10.45
2950159 Berlin PPLC 3,426,354 BER,Beirlin,Beirlín, 2011-12-03 52.52437 13.41053
2885657 Berlin Köpenick PPLX 59,561 Berlin-Kopenick,Berl 2010-11-22 52.44254 13.58228
2878102 Berlin Lichtenberg PPLX 0 Berlin-Lichtenberg,L 2010-11-22 52.51395 13.49975
2855598 Berlin Pankow PPLX 356,019 Pankow 2010-11-22 52.56926 13.40186
2848756 Berlin Reinickendorf PPLX 241,756 Reinickendorf 2010-05-14 52.56667 13.33333
7290245 Berlin Steglitz Zehlendorf PPLX 290,506 Berlin Steglitz,Berl 2010-05-14 52.43485 13.24183
7290251 Berlin Wilmersdorf PPLX 317,190 Wilmersdorf 2010-05-14 52.50097 13.29097
7290252 Berlin Spandau PPLX 223,862 Spandau 2010-05-14 52.5511 13.19921
7290253 Berlin Tempelhof PPLX 54,629 Tempelhof 2010-05-14 52.47693 13.41027
7290254 Berlin Schöneberg PPLX 116,286 Schoeneberg,Schönebe 2010-05-14 52.47457 13.34839
7290255 Berlin Treptow PPLX 50,000 Treptow 2010-05-14 52.49376 13.44469
6545310 Berlin Mitte PPLX 329,078 Mitte 2010-11-22 52.52003 13.40489
Here the correct entry is the one with geonameid = 2950159 and feature_code = 'PPLC'.
Last example: our very small village in Sweden, Häggnäset near Gäddede in Jämtland. There live about 10 people and 12 dogs. I'm including the neighbor village to make sure I hit the correct place.
Code:
select geonameid, name, feature_code, population, substr(alternatenames, 1, 20) as alternate,
modification_date as mdate, latitude, longitude
from geo_all_cities
where (name like '%Häggnäset%' or name like '%Gäddede%')
and country_code = 'SE';
GEONAMEID NAME FEATUR POPULATION ALTERNATE MDATE LATITUDE LONGITUDE
--------- ------------------------------ ------ ------------ -------------------- ---------- -------------------------------- --------------------------------
2709155 Häggnäset PPL 0 2011-10-30 64.4 14.31667
2713294 Gäddede PPL 456 2011-10-30 64.5 14.15
It follows:
- The correct place must have feature_class = 'P'.
- If there is only one hit (like Richmond Hill and Hägnäset), it's the right one.
- If there are more hits, it should be chosen based on feature_code priority chain PPLC, PPLA, PPLA2, PPLA3, PPLA4, PPL and the rest. The one with the highest priority is the correct one.
This cannot be done in a simple SQL statement. For Oracle, you should create a procedure returning a pipelined table type, optionally wrapped in a view.
This solution is an educated guess only. I hope there is someone who can shed more light on the matter.
Cheers,
Thomas Gutzmann
|
|
 |
|
|
|