GeoNames Home | Postal Codes | Download / Webservice | About 

GeoNames Forum
  [Search] Search   [Recent Topics] Recent Topics   [Groups] Back to home page 
[Register] Register / 
[Login] Login 
Importing the GeoNames Dump into mysql  XML
Forum Index -> FAQ - frequently asked questions Go to Page: 1, 2 Next 
Author Message
marc



Joined: 08/12/2005 07:39:47
Messages: 4439
Offline

CREATE TABLE `geonames` (
`geonameid` int(10) unsigned NOT NULL default '0',
`name` varchar(200) NOT NULL default '',
`ansiname` varchar(200) NOT NULL default '',
`alternatenames` varchar(2000) NOT NULL default '',
`latitude` double NOT NULL default '0',
`longitude` double NOT NULL default '0',
`feature_class` char(1) ,
`feature_code` varchar(10) ,
`country_code` char(2),
`cc2` varchar(60),
`admin1_code` varchar(20) default '',
`admin2_code` varchar(80) default '',
`admin3_code` varchar(20) default '',
`admin4_code` varchar(20) default '',
`population` bigint(11) default '0',
`elevation` int(11) default '0',
`gtopo30` int(11) default '0',
`timezone` varchar(40),
`modification_date` date default '0000-00-00',
PRIMARY KEY (`geonameid`)
) CHARACTER SET utf8 ;


You will need to index the columns needed by your application in queries.

Edit 2006.07.19 : added cc2
Edit 2006.08.08 : added timezone
Edit 2007.03.24 : added admin2
Edit 2007.07.25 : added admin3 and admin4
Edit 2009.04.13 : alter column population from int to bigint

How to load all tables is described here :
http://forum.geonames.org/gforum/posts/list/732.page

[WWW]
akreider



Joined: 18/07/2006 22:39:21
Messages: 1
Offline

If you want to save some disk space:

geonameid - can be mediumint (at least I think so, this allows for 8 million ids, or 16 million if you make it positive, whereas geonames is more around 4.5 million)

population - mediumint
altitude - smallint
gtopo30 - smallint

latitude/longitude - probably can be float for most uses (the geonames database isn't that precise)
Eddie



Joined: 26/04/2006 13:04:15
Messages: 7
Offline

Hi everyone!

And which table format would you use for best search performance? MyISAM or InnoDB?

Thank you,
Eddie
marc



Joined: 08/12/2005 07:39:47
Messages: 4439
Offline

Hi Eddie

We are using postgres at geonames. This said I guess MyISAM could be the better choice as it is supposed to be faster than InnoDB and I don't think you need transactions or foreign keys for your geonames table.

Cheers

Marc

[WWW]
matthieu



Joined: 12/01/2007 09:29:12
Messages: 1
Offline

after creating the table its quite simple
Code:
 matt@matt-laptop:~/Desktop/IN$ mysql -uroot
 Welcome to the MySQL monitor.  Commands end with ; or \g.
 Your MySQL connection id is 81 to server version: 5.0.21-Debian_3ubuntu1-log
 
 Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
 
 mysql> use geonames;
 Reading table information for completion of table and column names
 You can turn off this feature to get a quicker startup with -A
 
 Database changed
 mysql> load data infile '/home/matt/Desktop/allCountries/allCountries.txt' INTO TABLE geonames;
 Query OK, 6313518 rows affected, 65535 warnings (5 min 21.73 sec)
 Records: 6313518  Deleted: 0  Skipped: 0  Warnings: 5734251
 
karbo



Joined: 07/03/2007 11:21:52
Messages: 1
Location: Norway
Offline

When I try the approach above with the table structure from the first post of this thread I get the following error:

ERROR 1264 (22003): Out of range value adjusted for column 'elevation' at row 1 


Any ideas what I have to do to fix this?

For the first row the elevation value is empty, but since the column don't have the NOT NULL flag set, I don't see how that should be a problem?
[WWW]
zeman



Joined: 30/04/2007 10:46:21
Messages: 2
Offline

I had much better luck using the mysqlimport tool.

mysqlimport -h host -u username -ppassword --fields-terminated-by='\t' --lines-terminated-by='\n' --columns='geonameid, name, ansiname, alternames, latitude, longitude, feature_class, feature_code, country_code, cc2, admin1_code, admin2_code, population, elevation, gtopo30, timezone, modification_date' --local database /path/geonames.txt

Use ftp or wget on the server to grab the dump file, upload it to your server, uncompress and rename to match your table.

Replace the following with your info:
host
username
password (no space between the -p option and password)
database
/path/geonames.txt (path to your file on server)

Important: double check via phpmyadmin that both the collation of the whole database and the table is set to "utf8_general_ci" via the "operations" tab before you import.
luuk34



Joined: 26/05/2007 17:43:43
Messages: 1
Offline

matthieu wrote:
after creating the table its quite simple
Code:
 matt@matt-laptop:~/Desktop/IN$ mysql -uroot
 Welcome to the MySQL monitor.  Commands end with ; or \g.
 Your MySQL connection id is 81 to server version: 5.0.21-Debian_3ubuntu1-log
 
 Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
 
 mysql> use geonames;
 Reading table information for completion of table and column names
 You can turn off this feature to get a quicker startup with -A
 
 Database changed
 mysql> load data infile '/home/matt/Desktop/allCountries/allCountries.txt' INTO TABLE geonames;
 Query OK, 6313518 rows affected, 65535 warnings (5 min 21.73 sec)
 Records: 6313518  Deleted: 0  Skipped: 0  Warnings: 5734251
 
 


mysql> load data infile '/home/luuk/Downloads/geonames/allCountries.txt' into table geonames;
Query OK, 6530255 rows affected, 65535 warnings (1 min 39.69 sec)
Records: 6530255 Deleted: 0 Skipped: 0 Warnings: 4628033

more records, less warnings.... and quicker
but still the same way loaded....

are there also scripts for other tables like
- admin1Codes
- admin2Codes
- countryInfo
- featureCodes
- iso-languagecodes
- timeZones
RaggaDee



Joined: 19/06/2007 13:00:22
Messages: 2
Offline

Hi there,

these posts did help a lot in order to mySQLize thegeonames. BUT i still have a problem:

When i imported the file into the geonames table which i created with the CREATE from the start of this post, there are broken string entries like in "Zwölfing" or other german "umlaute".

So i did save the txt file as UTF8 encoded file on my limux development machine again but it didn't help.

any hints what i forgot??
thanks

dag
marc



Joined: 08/12/2005 07:39:47
Messages: 4439
Offline

Hi RaggaDee

Our files are in UTF8. If you are using the create statement above you should be ok. It could be a problem with the way you look at the data. You will have to set some display parameter to tell your display tool that you want to look at UTF8.

Marc

[WWW]
RaggaDee



Joined: 19/06/2007 13:00:22
Messages: 2
Offline

Hi Marc, thnaks a lot for your answer. I still watched them in phpMyAdmin... maybe i find there a display param, too.

RgD
goose



Joined: 09/10/2007 16:46:08
Messages: 3
Offline

hello there,
i lost couple of hours before i found out why i got this
Code:
  error ERROR 1366 (HY000): Incorrect integer value: '' for column 'elevation' at row 1.
 


Well it means what it says!!! my integer cant have this value '' in fact when you are under mysql5. So under mysql5 read this link it might help.
http://bugs.mysql.com/bug.php?id=18551

Cheers.
rfay



Joined: 14/10/2007 01:00:50
Messages: 8
Offline

The fields have changed since the last (helpful) post on how to do a mysqlimport, so now this will work:

Code:
mysqlimport -h <yourhost> -u <yourusername>--default-character-set=utf8 
 --fields-terminated-by='\t' --lines-terminated-by='\n' 
 --columns='geonameid, name, ansiname, alternames, latitude, longitude, 
 feature_class, feature_code, country_code, cc2, admin1_code, 
 admin2_code, admin3_code,admin4_code,population, elevation, gtopo30, 
 timezone, modification_date' --local <your_geonames_db> allcountries.txt
 


 Description MaIn geonames file import [Disk] Download
 Filesize 390 bytes
 Downloaded:  63395 time(s)

bytte



Joined: 07/01/2008 17:56:11
Messages: 15
Offline

I found that varchar(200) is not sufficient for the "alternames" field in some cases, such as Brussels, Belgium. So you might want to change that to varchar(500) or something like that.
Q



Joined: 23/01/2008 14:26:58
Messages: 3
Offline

Hi,

i've tried to load dump but it throw me this error and i don't know what is it.



Code:
Error
 
 SQL query:
 
 LOAD DATA INFILE '/data/allCountries.txt' INTO TABLE geoname(
 geonameid, name, asciiname, alternatenames, latitude, longitude, fclass, fcode, country, cc2, admin1, admin2, admin3, admin4, population, elevation, gtopo30, timezone, moddate
 );
 
 MySQL said: Documentation
 #13 - Can't get stat of '/data/allCountries.txt' (Errcode: 2) 
 


can you help me?
marc



Joined: 08/12/2005 07:39:47
Messages: 4439
Offline

Is the file path correct? Is the file readable? Is it really in /data/ and not in data/?

Cheers

Marc

[WWW]
Q



Joined: 23/01/2008 14:26:58
Messages: 3
Offline

...maybe you right.
jbarbede



Joined: 30/08/2008 18:15:04
Messages: 1
Offline

Like it seems the method to insert the geonames database in MySQL was not updated, I propose an actualized version to insert the geonames database in UTF8. I made it with the geonames files downloaded the 21 November 2008 and the MySQL version used is the 5.0.67. All comments are welcome.

1. Create a temporary folder on your hard disk

2. Download the following elements and save them in your temporary folder:
- http://download.geonames.org/export/dump/allCountries.zip
- http://download.geonames.org/export/dump/alternateNames.zip
- http://download.geonames.org/export/dump/admin1Codes.txt
- http://download.geonames.org/export/dump/admin1CodesASCII.txt
- http://download.geonames.org/export/dump/admin2Codes.txt
- http://download.geonames.org/export/dump/featureCodes_en.txt
- http://download.geonames.org/export/dump/timeZones.txt
- http://download.geonames.org/export/dump/countryInfo.txt

3. Unzip allCountries.zip in your temporary folder

4. Unzip alternateNames.zip in your temporary folder. There are two files: iso-languagecodes.txt and alternateNames.txt

5. Check all files to remove commented and/or useless lines. There are commented and/or useless lines at least in countryInfo.txt, iso-languagecodes.txt and timeZones.txt to the beginning of each file.

6. Create the continentCodes.txt file with the following content in your temporary folder:

AF,Africa,6255146
AS,Asia,6255147
EU,Europe,6255148
NA,North America,6255149
OC,Oceania,6255151
SA,South America,6255150
AN,Antarctica,6255152

7. Download the import.sql file attached to this post and save it in your temporary folder. Create the database, tables and import data due to the import.sql file. The database and inserted data will be in UTF8.

mysql -u user -p < import.sql (of course replace "user" by an existing MySQL user in your system and you need to have your temporary folder like current folder to execute this command).

10 tables will be created.

10 text files will be used to load 9 049 392 lines:
- allCountries.txt: 6,674,517 lines
- alternateNames.txt: 2,351,166 lines
- admin1Codes.txt: 4,578 lines
- admin1CodesASCII.txt: 3,540 lines
- admin2Codes.txt: 11,396 lines
- continentCodes.txt: 7 lines
- countryInfo.txt: 247 lines
- featureCodes.txt: 664 lines
- iso-languagecodes.txt: 7600 lines
- timeZones.txt: 399 lines

It represents approximatively 800 Mo.

8. Remember to add the indexes you need.

9. You have your database created and ready to use.

TODO: geoname.cc2 : alternate country codes, comma separated, ISO-3166 2-letter country code, 60 characters --> Move this information to a separate table. It's a todo in the initial version.

Edit 18. May 2009, Marc Wick: changed population to bigint.
 Description The import.sql file to create the database and load data. [Disk] Download
 Filesize 4 Kbytes
 Downloaded:  65502 time(s)

garaki



Joined: 19/12/2007 23:57:44
Messages: 5
Offline

Alright all the tables are created.. now how do i query it to get a zip code for city called 'Dari' in India

thanks
SEO



Joined: 14/10/2008 18:58:20
Messages: 18
Offline

marc wrote:
Hi Eddie

We are using postgres at geonames. This said I guess MyISAM could be the better choice as it is supposed to be faster than InnoDB and I don't think you need transactions or foreign keys for your geonames table.

Cheers

Marc 


Hi there!

Your message is from 2006, is this still the case for the city database? Will MyIsam work better then InnoDB?

As it seems from test results on for example the MySQL performance blog the innodb has become much (30% in general) faster then MyIsam, especially when selecting on geonameid (primary id) or sorting by latitude longitude (<> ) .

http://www.mysqlperformanceblog.com/2007/01/08/innodb-vs-myisam-vs-falcon-benchmarks-part-1/

What should we choose? Does anyone have experience with innodb as storage engine for the city database?
 
Forum Index -> FAQ - frequently asked questions Go to Page: 1, 2 Next 
Go to:   
Powered by JForum 2.1.5 © JForum Team