Author |
Message |
04/10/2010 09:33:34
|
supercain
Joined: 01/10/2010 18:43:55
Messages: 4
Offline
|
And what about the cities5000.txt database and so on? what tables do i need to create?
|
|
|
25/10/2010 19:33:19
|
Tab
Joined: 13/08/2010 19:44:27
Messages: 23
Offline
|
Hi,
I'm try for months but without success.
I utilise Mysql with Xampp on Windows 7 and I seen many commands run under Linux system.
It's possibile to have a little how-to which helps to import alla datas?
Another question: I seen, for example, in italy nation more cities and regions are in english language. How is possible to have every city in the correct language spoken by the associated nation?
Thanks.
|
|
|
25/10/2010 20:56:15
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4416
Offline
|
the other languages are in the file 'alternateNames.zip'.
Marc
|
|
|
|
25/10/2010 23:44:19
|
Tab
Joined: 13/08/2010 19:44:27
Messages: 23
Offline
|
And How I must to do do assign every language to its nation?
|
|
|
26/10/2010 06:43:57
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4416
Offline
|
This is impossible. There is no 1:1 relation between languages and countries.
The file countryInfo.txt contains a list of languages spoken in each country.
Marc
|
|
|
|
24/11/2010 23:02:44
|
Blitz0r
Joined: 24/11/2010 22:58:32
Messages: 1
Offline
|
First up, thanks to everyone in this thread... what i post is what took me a while to figure out (character set in load data and some other errors in the original script (like capital I in countryInfo etc).
save as import.sql (update)
Code:
CREATE DATABASE geonames;
USE geonames;
CREATE TABLE geoname (
geonameid int PRIMARY KEY,
name varchar(200),
asciiname varchar(200),
alternatenames varchar(4000),
latitude decimal(10,7),
longitude decimal(10,7),
fclass char(1),
fcode varchar(10),
country varchar(2),
cc2 varchar(60),
admin1 varchar(20),
admin2 varchar(80),
admin3 varchar(20),
admin4 varchar(20),
population int,
elevation int,
gtopo30 int,
timezone varchar(40),
moddate date
) CHARACTER SET utf8;
CREATE TABLE alternatename (
alternatenameId int PRIMARY KEY,
geonameid int,
isoLanguage varchar(7),
alternateName varchar(200),
isPreferredName boolean,
isShortName boolean,
isColloquial boolean,
isHistoric boolean
) CHARACTER SET utf8;
CREATE TABLE countryinfo (
iso_alpha2 char(2),
iso_alpha3 char(3),
iso_numeric integer,
fips_code varchar(3),
name varchar(200),
capital varchar(200),
areainsqkm double,
population integer,
continent char(2),
tld char(3),
currency char(3),
currencyName char(20),
Phone char(10),
postalCodeFormat char(20),
postalCodeRegex char(20),
geonameId int,
languages varchar(200),
neighbours char(20),
equivalentFipsCode char(10)
) CHARACTER SET utf8;
CREATE TABLE iso_languagecodes(
iso_639_3 CHAR(4),
iso_639_2 VARCHAR(50),
iso_639_1 VARCHAR(50),
language_name VARCHAR(200)
) CHARACTER SET utf8;
CREATE TABLE admin1Codes (
code CHAR(6),
name TEXT
) CHARACTER SET utf8;
CREATE TABLE admin1CodesAscii (
code CHAR(6),
name TEXT,
nameAscii TEXT,
geonameid int
) CHARACTER SET utf8;
CREATE TABLE featureCodes (
code CHAR(7),
name VARCHAR(200),
description TEXT
) CHARACTER SET utf8;
CREATE TABLE timeZones (
timeZoneId VARCHAR(200),
GMT_offset DECIMAL(3,1),
DST_offset DECIMAL(3,1)
) CHARACTER SET utf8;
CREATE TABLE continentCodes (
code CHAR(2),
name VARCHAR(20),
geonameid INT
) CHARACTER SET utf8;
LOAD DATA LOCAL INFILE 'allCountries.txt' INTO TABLE geoname CHARACTER SET 'UTF8' (geonameid,name,asciiname,alternatenames,latitude,longitude,fclass,fcode,country,cc2, admin1,admin2,admin3,admin4,population,elevation,gtopo30,timezone,moddate);
LOAD DATA LOCAL INFILE 'alternateNames.txt' INTO TABLE alternatename CHARACTER SET 'UTF8' (alternatenameid,geonameid,isoLanguage,alternateName,isPreferredName,isShortName,isColloquial,isHistoric);
LOAD DATA LOCAL INFILE 'iso-languagecodes.txt' INTO TABLE iso_languagecodes CHARACTER SET 'UTF8' (iso_639_3, iso_639_2, iso_639_1, language_name);
LOAD DATA LOCAL INFILE 'admin1Codes.txt' INTO TABLE admin1Codes CHARACTER SET 'UTF8' (code, name);
LOAD DATA LOCAL INFILE 'admin1CodesAscii.txt' INTO TABLE admin1CodesAscii CHARACTER SET 'UTF8' (code, name, nameAscii, geonameid);
LOAD DATA LOCAL INFILE 'featureCodes.txt' INTO TABLE featureCodes CHARACTER SET 'UTF8' (code, name, description);
LOAD DATA LOCAL INFILE 'timeZones.txt' INTO TABLE timeZones CHARACTER SET 'UTF8' IGNORE 1 LINES (timeZoneId, GMT_offset, DST_offset);
LOAD DATA LOCAL INFILE 'countryInfo-n.txt' INTO TABLE countryInfo CHARACTER SET 'UTF8' IGNORE 1 LINES (iso_alpha2,iso_alpha3,iso_numeric,fips_code,name,capital,areaInSqKm,population,continent,languages,currency,geonameId);
LOAD DATA LOCAL INFILE 'continentCodes.txt' INTO TABLE continentCodes CHARACTER SET 'UTF8' FIELDS TERMINATED BY ',' (code, name, geonameId);
run locally via ssh or console from the folder with the .txt files
> mysql -u root < import.sql -p
(change root accordingly) (-p is only if you have a password set)
Good luck!
|
|
|
26/05/2011 12:08:23
|
muser83
Joined: 26/05/2011 11:54:09
Messages: 10
Offline
|
giorgio79 wrote:
Here is the structure for admin2codes
Code:
CREATE TABLE `geonames_admin2codes` (
`code` char(10) default NULL,
`name_local` text,
`name` text NOT NULL,
`geonameid` int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Some codes become truncated with char(10). We suggest char(40) or more
|
|
|
06/07/2011 01:22:25
|
scd8418
Joined: 06/07/2011 01:20:50
Messages: 1
Offline
|
To correctly load the data as utf8 data, I had to add the character set specification to the load data file command as indicated below.
LOAD DATA INFILE 'data/geonames.org/allCountries.txt' INTO TABLE geoname CHARACTER SET UTF8 (geoname_id,name,ascii_name,alternate_names,lat,lon,fclass,fcode,country,cc2,admin1,admin2,admin3,admin4,population,alt,gtopo30,timezone,mod_date);
|
|
|
15/10/2011 12:21:59
|
HermesTrismegistus
Joined: 13/10/2011 14:06:50
Messages: 3
Offline
|
When i tried tried to dump the countryInfo.txt file, something went wrong. The table creation as described above here, is not the right table. It seems the countryInfo.txt file has been updated, so this won't work anymore.
Is anyone able to recreate this table? I am absolutely not an sql expert....
|
|
|
26/05/2012 17:22:54
|
AndyBG
Joined: 26/05/2012 15:28:09
Messages: 3
Offline
|
The countryInfo is structured differently from the contents of countryInfo.txt. The column positions for 'languages' and 'geonameId' need to be swapped around.
Also, the file 'countryInfo.txt' has a whole lot of comments that MySql is happy to turn into corrupted rows. My advice is to get rid of all but the last of the leading commented lines, so that the file starts thus:
Code:
#ISO ISO3 ISO-Numeric fips Country Capital Area(in sq km) Population Continent tld CurrencyCode CurrencyName Phone Postal Code Format Postal Code Regex Languages geonameid neighbours EquivalentFipsCode
AD AND 020 AN Andorra Andorra la Vella 468 84000 EU .ad EUR Euro 376 AD### ^(?:AD)*(\d{3})$ ca 3041565 ES,FR
Following, are the corrected TABLE & LOAD DATA commands (as of the date of this posting):
Code:
CREATE TABLE countryinfo (
iso_alpha2 char(2),
iso_alpha3 char(3),
iso_numeric integer,
fips_code varchar(3),
name varchar(200),
capital varchar(200),
areainsqkm double,
population integer,
continent char(2),
tld char(3),
currency char(3),
currencyName char(20),
Phone char(10),
postalCodeFormat char(20),
postalCodeRegex char(20),
languages varchar(200),,
geonameId int
neighbours char(20),
equivalentFipsCode char(10)
) CHARACTER SET utf8;
LOAD DATA LOCAL INFILE 'tableData/countryInfo.txt' INTO TABLE countryInfo IGNORE 1 LINES (iso_alpha2,iso_alpha3,iso_numeric,fips_code,name,capital,areainsqkm,population,continent,tld,currency,currencyName,Phone,postalCodeFormat,postalCodeRegex,languages,geonameId,neighbours,equivalentFipsCode)
|
|
|
15/10/2012 06:38:39
|
gorth
Joined: 13/10/2012 05:19:52
Messages: 2
Offline
|
giorgio79 wrote:
Here is the structure for admin2codes
Code:
CREATE TABLE `geonames_admin2codes` (
`code` char(10) default NULL,
`name_local` text,
`name` text NOT NULL,
`geonameid` int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
this is clearly WRONG, if you open admin2Codes.txt in vi or any editor
it is immediately evident that char(10) is too short for the 1st column.
Have you even looked at the data? and what is "int(11)"? From empirically
looking for longest lines in the file, it looks like 48 chars is the longest
field - row 16,561, "MF.00.3578419", 2nd column:
Arrondissement de Saint-Martin-Saint-Barthélemy
OTHER ISSUES:
- Madagaacar seems to have the longest codes, ie, MG.7670848.8299287
- the file seems to have corrupted rows for North Korea.
row 15,502: KP.15.YangdÅk-kun YangdÅk-kun Yangdok-kun 1866790
row 15,520: KP.15.Sukch’Ån-gun ..
row 15,540: KP.15.P’yÅngwÅn-gun ..
row 15,576: KP.09.Kimhwa-gun Kimhwa-gun Kimhwa-gun 1875980
there are a few (3-5) more rows like these clearly corrupted lines of text,
you can spot them immediately in a text editor.
Using type TEXT in the table schema [above] is overkill, adding unnecessary overhead (efficiency drain).
BETTER:
CREATE TABLE admin2codes
(
code char(24) default NULL,
name_local VARCHAR(64),
name VARCHAR(64),
geonameid INT NOT NULL
);
This geo dataset is new to me, please correct me if I am missing something,
but making a definitive claim "this is the schems" when it is so obviously
not is irresponsible.
Gorth (www.Vettrasoft.com)
|
|
|
26/10/2012 06:31:29
|
Backslider
Joined: 26/10/2012 02:22:56
Messages: 5
Offline
|
I want to create separate tables for: Country, State, City (worldwide)
Which data files should I use?
|
|
|
27/10/2012 15:59:45
|
Backslider
Joined: 26/10/2012 02:22:56
Messages: 5
Offline
|
- the file seems to have corrupted rows for North Korea.
row 15,502: KP.15.YangdÅk-kun YangdÅk-kun Yangdok-kun 1866790
row 15,520: KP.15.Sukch’Ån-gun ..
row 15,540: KP.15.P’yÅngwÅn-gun ..
row 15,576: KP.09.Kimhwa-gun Kimhwa-gun Kimhwa-gun 1875980
They are not corrupted... you must use UTF8
|
|
|
27/10/2012 15:59:46
|
Backslider
Joined: 26/10/2012 02:22:56
Messages: 5
Offline
|
Where can I find a list of data columns and an explanation of them?
|
|
|
01/02/2013 07:36:11
|
toniweb
Joined: 31/01/2013 13:20:05
Messages: 1
Offline
|
Links for admin1Codes.txt AND featureCodes.txt are broken,
Is there any newer guide?
|
|
|
13/04/2013 06:49:22
|
mcupryk
Joined: 22/02/2011 05:04:53
Messages: 4
Offline
|
Hello does anyone have a ms sql version of the tables and
if there is a way to import this into the tables. I would appreciate it.
|
|
|
25/01/2014 08:11:24
|
ajay138
Joined: 24/01/2014 18:54:12
Messages: 3
Offline
|
I have created a php script that creates the database tables and insert all the values ofl into the tables.
If You want the script then contact me..
My email ajay138@gmail.com
|
|
|
27/01/2014 23:03:48
|
ajay138
Joined: 24/01/2014 18:54:12
Messages: 3
Offline
|
I have created a fully automated script for this purpose. Instructions to download and use the scripts are here: http://blog.froiden.com/scripts/geonames-data-to-mysql-importer/
|
|
|
27/11/2015 15:52:57
|
scottseeker
Joined: 23/11/2015 17:54:57
Messages: 3
Offline
|
Using recently downloaded allCountries.txt file from GeoNames, I get the following error on import: ERROR 1300 (HY000): Invalid utf8 character string: ''Afikanisitani,'Apekanikana,A Phu Han (Afghanistan),A Phú Hãn '
I also get a UTF-8 error on alternatenames table.
Has anyone run across this and found a way to fix the 'bad characters'?
|
|
|
27/11/2015 15:52:57
|
scottseeker
Joined: 23/11/2015 17:54:57
Messages: 3
Offline
|
I kept getting invalid UTF-8 errors on inloading the .txt files while working under Windows 7.
I set up an Ubuntu VM and did it there. Everything worked fine.
Has anyone had any luck loading the .txt files in Windows?
|
|
|
|
|
|