GeoNames Home | Postal Codes | Download / Webservice | About 

GeoNames Forum
  [Search] Search   [Recent Topics] Recent Topics   [Groups] Back to home page 
[Register] Register / 
[Login] Login 
Importing dumps into MySQL  XML
Forum Index -> General
Author Message
Anonymous



hello, i have been trying to import your database dumps into MySQL.
i succeeded in importing NL.txt using phpmyadmin and Navicat.
but when i try to import allCountries.txt it generates 1406 errors:

Code:
[Err] 1406 - Data too long for column 'name' at row 1


i find names like: Ragwān, Kharīmat Qaşīmah and Wādī Misaith
could this be the problem?
marc



Joined: 08/12/2005 07:39:47
Messages: 4486
Offline

How long is your column 'name' ?

The dump files are in utf8 format, this may explain why you see strange names like the ones you mention.
You have to configure your database to use utf8. In the NL file there are only dutch names and therefore no exotic characters.


Marc

[WWW]
Anonymous



i can choose between:
utf8_bin
utf8_czech_ci
utf8_danish_ci
utf8_esperanto_ci
(...)
utf8_turkish_ci
utf8_unicode_ci

what is best to use? unicode or binary?
marc



Joined: 08/12/2005 07:39:47
Messages: 4486
Offline

Hi Anonymous

What you are referring to are "collations" used for sorting.

http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html

According to this page :
http://dev.mysql.com/doc/refman/5.0/en/charset-charsets.html

you could use the collation : "utf8_general_ci" for the character set "utf8".

Hope this helps.

Marc

[WWW]
Anonymous



thanks i got it working now
now its just waiting waiting and waiting before it completes
luftikus143



Joined: 10/05/2006 09:46:59
Messages: 9
Offline

Gush, difficult thing with these special characters, like é, à, ü...

At least in my case MySQL does not sort the names correctly. So Zürich is not listed below Zug, but above Zeglingen.

I use UTF-8. Any explanatations?
marc



Joined: 08/12/2005 07:39:47
Messages: 4486
Offline

Hi luftikus143

Yes it is an effect of UTF8. It is the DIN1 sorting and you want the DIN2 sorting.
The mysql doc says :
For example, the following equalities hold in both utf8_general_ci and utf8_unicode_ci:

Ä = A
Ö = O
Ü = U 

from http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html

I am not using mysql but it seems you have to use the latin charset to get the sorting you want.
The latin1_german1_ci and latin1_german2_ci collations are based on the DIN-1 and DIN-2 standards, where DIN stands for Deutsches Institut für Normung (the German equivalent of ANSI). DIN-1 is called the “dictionary collation” and DIN-2 is called the “phone book collation.”

latin1_german1_ci (dictionary) rules:

Ä = A
Ö = O
Ü = U
ß = s

latin1_german2_ci (phone-book) rules:

Ä = AE
Ö = OE
Ü = UE
ß = ss 



Links :
http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html
http://dev.mysql.com/doc/refman/5.0/en/charset-we-sets.html
http://dev.mysql.com/doc/refman/5.0/en/charset-collation-effect.html

Hope this helps

Marc

[WWW]
luftikus143



Joined: 10/05/2006 09:46:59
Messages: 9
Offline

Thanks for these info.

It's actually interesting to see, that with the same software (Navicat), the import into MySQL is running smoothly without any errors being reported; but the import into Postgres results in a couple of errors. I took the CH.txt file.

Import log (2006-05-10 10:57:46)
Row14 - ERROR: could not convert UTF8 character 0x00e0 to ISO8859-1
Row552 - ERROR: could not convert UTF8 character 0x00e1 to ISO8859-1
Row2112 - ERROR: could not convert UTF8 character 0x00e0 to ISO8859-1
Row2764 - ERROR: could not convert UTF8 character 0x00e0 to ISO8859-1
Row3670 - ERROR: could not convert UTF8 character 0x00e0 to ISO8859-1

So, this is true for an LATIN1 database.

But a UTF-8 database (Postgres) works fine. But it does not sort the name (with the special characters) correctly.
lonelycastle



Joined: 05/07/2007 20:10:33
Messages: 1
Offline

I'm having a problem a lot of people have had with the unusual characters, and so far I haven't been able to find a solution.

I'll first say I use phpMyAdmin, but I've also been using the Putty database monitor thing for larger operations which time out in phpMyAdmin (I'm new to all of this so if my terminology is wrong, I apologize).

The first time I tried to import the database, I simply created the table with phpMyAdmin setting utf8 as the character set for the table, but assigning the following column character sets: name: latin1_swedish_ci, ansiname: ascii_general_ci, alternatenames: utf8_unicode_ci. Then I used Putty and the following expession thing to import the allCountries.txt file:

load data infile '/home/mymapps/domains/mymapps.com/public_html/admin/allCountries.txt' INTO TABLE maps;

As may be predicted, my character set thingys for the columns weren't right and I had some unusual/incorrect characters such as Kita Seamount,北海山 in all the name columns (though the errors were least frequent in the ansiname column).

I then changed the character sets to name: utf8_general_ci, ansiname: ascii_general_ci, alternatenames: utf8_general_ci, but got similarly bad results.

I then did as is stated above:

CREATE TABLE `maps` (
`geonameid` int(10) unsigned NOT NULL default '0',
`name` varchar(200) NOT NULL default '',
`ansiname` varchar(200) NOT NULL default '',
`alternames` varchar(200) NOT NULL default '',
`latitude` double NOT NULL default '0',
`longitude` double NOT NULL default '0',
`feature_class` char(1) ,
`feature_code` varchar(10) ,
`country_code` char(2),
`cc2` varchar(60),
`admin1_code` varchar(20) default '',
`admin2_code` varchar(80) default '',
`population` int(11) default '0',
`elevation` int(11) default '0',
`gtopo30` int(11) default '0',
`timezone` varchar(40),
`modification_date` date default '0000-00-00',
PRIMARY KEY (`geonameid`)
) CHARACTER SET utf8 ;

And imported the allCountries.tx file in the same manner as before, but with the same bad results (name:Zélée Bank, alternatenames:Banc de la Zelee,Banc de la Zélée).

I should say I was looking at the rows in phpMyAdmin, which might not display them right or something, but I tried to SELECT the rows in Putty but had the same result.

Any advice as to how I can get the file imported with proper characters would be very much appreciated.

*********************

Never mind, I figured it out. I had the table using utf8_general_ci character sets, but I forgot to make the database utf8.
bytte



Joined: 07/01/2008 17:56:11
Messages: 15
Offline

I'm trying to import the admin1Codes.txt file into MySql. My table is set to a default character set of utf-8, with a default collation of utf8_general_ci. So is the database itself.

However, when I try to import the file using both Navicat and CocoaMySql, I have problem with the special characters. AE.01 is Ab? Z?aby and so on.

What else do I have to look out for when importing?
 
Forum Index -> General
Go to:   
Powered by JForum 2.1.5 © JForum Team