GeoNames Home | Postal Codes | Download / Webservice | About 

GeoNames Forum
  [Search] Search   [Recent Topics] Recent Topics   [Groups] Back to home page 
[Register] Register / 
[Login] Login 
How to Identify cities in geonames?  XML
Forum Index -> General
Author Message
sabv



Joined: 10/11/2011 05:35:19
Messages: 4
Offline

Hi,

I am trying to extract a list of cities from geonames main table. I can not find a way to have consistent results which will give me real cities or villages as opposed to "populated places".

I tried query with feature_class='P' and feature_code = 'PPL' but this list has more than only cities and villages. It also contains small localities within the city.

I also tried cities1000.txt download but that file does not contain all cities with 1000 or more people. For example, it does not contain Richmond Hill in Ontario which is a city of more than 18,000 people.

Is there a reliable way to get list of cities from geonames? We are simply trying to create a drop down of cities based on province or state selected.

Thanks for your help.
Thomas Gutzmann



Joined: 19/03/2012 15:23:39
Messages: 3
Offline

Have a look at

http://www.geonames.org/export/codes.html

and scroll down to "P city, village,...".

You can identify cities by feature class "P", feature code "PPLS". You can use population to exclude small villages.

Cheers,

Thomas
[WWW]
sabv



Joined: 10/11/2011 05:35:19
Messages: 4
Offline

Thanks,

I had examined the codes before but it seems that the population data is not populated for all places. I get only 3 records from geonames when I use the condition

... where feature_class = 'P' and feature_code='PPLS'

when I use feature_class='P' and feature_code='PPL', it includes small localities within the city as well.

In many cases, population column is blank.
Thomas Gutzmann



Joined: 19/03/2012 15:23:39
Messages: 3
Offline

Hi,

It looks as if the answer is not a simple SQL statement.

Let's have a look:

I created a table "geo_all_cities" from all entries of "all countries" with feature_class = 'A' or 'P'.

Looking for Richmond Hill in Canada, I get:

Code:
 select geonameid, name, feature_code, population 
   from geo_all_cities 
   where upper(name) like '%RICHMOND%' and 
     upper(name) like '%HILL%' 
     and country_code = 'CA';
 
 GEONAMEID NAME                 FEATUR   POPULATION
 --------- -------------------- ------ ------------
   6122091 Richmond Hill        PPL         185,541 
 


For my birth place in Germany it's different:

Code:
 select geonameid, name, feature_code, population, substr(alternatenames, 1, 20) as alternate, modification_date as mdate 
   from geo_all_cities 
   where upper(name) like '%LÜBECK%'
     and country_code = 'DE';
 
 GEONAMEID NAME                           FEATUR   POPULATION ALTERNATE            MDATE    
 --------- ------------------------------ ------ ------------ -------------------- ----------
   3249071 Kreisfreie Stadt Lübeck        ADM3        209,818                      2010-11-24 
   7602559 Lübeck, Hansestadt             ADM4        209,818                      2010-11-24 
   2875601 Lübeck                         PPLA3       212,207 Limpek,Ljubek,Lubeca 2011-04-27 
   2957026 Alt Lübeck                     PPLQ              0                      2011-04-26 
   7778797 Lübeck St. Gertrud             PPLX              0                      2012-03-09 
 


In this case, the third entry with feature_code "PPLA3" is correct, but the first and second would do as well, though the names are the administrative versions which are not used normally.

Now let's look for Berlin in Germany. I know that there are two locations called "Berlin", one is the capital, the other a very small village. The villagers enjoyed themselves by giving their streets the same names as very popular streets and places in the capital, like "Kurfürstendamm" and "Potsdamer Platz"

I'm using "like" here instead of a regular expression, so I have deleted some result rows manually.

Code:
select geonameid, name, feature_code, population, substr(alternatenames, 1, 20) as alternate, 
     modification_date as mdate, latitude, longitude 
   from geo_all_cities 
   where name like '%Berlin%'
     and country_code = 'DE';
 
 GEONAMEID NAME                           FEATUR   POPULATION ALTERNATE            MDATE      LATITUDE                         LONGITUDE                      
 --------- ------------------------------ ------ ------------ -------------------- ---------- -------------------------------- --------------------------------
   2950157 Land Berlin                    ADM1      3,442,675 Baarliin,Baelliin,Ba 2012-01-16 52.5                             13.41667                         
   6547383 Berlin, Stadt                  ADM3      3,442,675                      2010-11-24 52.5233                          13.41377                         
   6551168 Bernau bei Berlin, Stadt       ADM4         36,154                      2010-11-24 52.6821                          13.5965                          
   6547539 Berlin, Stadt                  ADM4      3,442,675                      2010-11-24 52.5233                          13.41377                         
   6547610 Neuenhagen bei Berlin          ADM4         16,755                      2010-11-24 52.5333                          13.6833                          
   6547614 Rüdersdorf bei Berlin          ADM4         15,344                      2010-11-24 52.4707                          13.7736                          
   6547647 Schöneiche bei Berlin          ADM4         12,111                      2010-11-24 52.4667                          13.6833                          
   6944048 Berlin-Neuköln                 ADMD              0                      2009-07-30 52.4808                          13.425                           
   2950096 Bernau bei Berlin              PPL          34,866 Bernau,Бернау        2010-01-29 52.67982                         13.58708                         
   2950142 Berliner Vorstadt              PPL               0                      2011-04-12 52.4119                          13.08317                         
   2950158 Berlin                         PPL               0 Berlin               2012-01-18 54.03333                         10.45                            
   2950159 Berlin                         PPLC      3,426,354 BER,Beirlin,Beirlín, 2011-12-03 52.52437                         13.41053                         
   2885657 Berlin Köpenick                PPLX         59,561 Berlin-Kopenick,Berl 2010-11-22 52.44254                         13.58228                         
   2878102 Berlin Lichtenberg             PPLX              0 Berlin-Lichtenberg,L 2010-11-22 52.51395                         13.49975                         
   2855598 Berlin Pankow                  PPLX        356,019 Pankow               2010-11-22 52.56926                         13.40186                         
   2848756 Berlin Reinickendorf           PPLX        241,756 Reinickendorf        2010-05-14 52.56667                         13.33333                         
   7290245 Berlin Steglitz Zehlendorf     PPLX        290,506 Berlin Steglitz,Berl 2010-05-14 52.43485                         13.24183                         
   7290251 Berlin Wilmersdorf             PPLX        317,190 Wilmersdorf          2010-05-14 52.50097                         13.29097                         
   7290252 Berlin Spandau                 PPLX        223,862 Spandau              2010-05-14 52.5511                          13.19921                         
   7290253 Berlin Tempelhof               PPLX         54,629 Tempelhof            2010-05-14 52.47693                         13.41027                         
   7290254 Berlin Schöneberg              PPLX        116,286 Schoeneberg,Schönebe 2010-05-14 52.47457                         13.34839                         
   7290255 Berlin Treptow                 PPLX         50,000 Treptow              2010-05-14 52.49376                         13.44469                         
   6545310 Berlin Mitte                   PPLX        329,078 Mitte                2010-11-22 52.52003                         13.40489                         
 

Here the correct entry is the one with geonameid = 2950159 and feature_code = 'PPLC'.

Last example: our very small village in Sweden, Häggnäset near Gäddede in Jämtland. There live about 10 people and 12 dogs. I'm including the neighbor village to make sure I hit the correct place.

Code:
select geonameid, name, feature_code, population, substr(alternatenames, 1, 20) as alternate, 
     modification_date as mdate, latitude, longitude 
   from geo_all_cities 
   where (name like '%Häggnäset%' or name like '%Gäddede%')
     and country_code = 'SE';
 
 GEONAMEID NAME                           FEATUR   POPULATION ALTERNATE            MDATE      LATITUDE                         LONGITUDE                      
 --------- ------------------------------ ------ ------------ -------------------- ---------- -------------------------------- --------------------------------
   2709155 Häggnäset                      PPL               0                      2011-10-30 64.4                             14.31667                         
   2713294 Gäddede                        PPL             456                      2011-10-30 64.5                             14.15                            
 


It follows:

- The correct place must have feature_class = 'P'.
- If there is only one hit (like Richmond Hill and Hägnäset), it's the right one.
- If there are more hits, it should be chosen based on feature_code priority chain PPLC, PPLA, PPLA2, PPLA3, PPLA4, PPL and the rest. The one with the highest priority is the correct one.

This cannot be done in a simple SQL statement. For Oracle, you should create a procedure returning a pipelined table type, optionally wrapped in a view.

This solution is an educated guess only. I hope there is someone who can shed more light on the matter.

Cheers,

Thomas Gutzmann
[WWW]
 
Forum Index -> General
Go to:   
Powered by JForum 2.1.5 © JForum Team