GeoNames Forum
Possible incorrect entry in alternateNamesV2  XML
Forum Index -> General
pablolbap



Joined: 02/07/2019 16:48:53
Messages: 1
Offline

Hi,

My file reader detects a wrong number of tokens when scanning the entry for Istanbul International Airport (from alternateNamesV2).
I believe this is because the entry has one separator too many.

The entry in question:
"13635172;11838481;iata;ISL;;;;1;2018-10-31;2019-04-06 ;"

Best regards
SvenAtWork



Joined: 26/07/2019 12:03:22
Messages: 2
Location: Germany
Offline

Same here.
There is an additional Tab at the end of the line.

There may be other affected lines as well, but I have not been able to analyse that yet.

It would be much appreciated if this could be fixed soon.
Otherwise we will have to build a custom solution.

However, because this is my first post in this forum...
GREAT WORK in general!
I have used geonames.org data (Gazetteer + PostalCodes) for years.
marc



Joined: 08/12/2005 07:39:47
Messages: 4412
Offline

You are right, there are some control chars in the from and to fields.
This will be fixed with the next extract.
The frontend will also be improved to eliminate these chars when saving.

Marc
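
For readers who need a workaround before the next extract, stripping control characters from a field could look roughly like the sketch below. This is not the GeoNames frontend code; the function name `strip_control_chars` is hypothetical, and treating Unicode category `Cc` as "control chars" is an assumption.

```python
import unicodedata


def strip_control_chars(value):
    """Remove Unicode control characters (category Cc), then trim whitespace."""
    # Tab, CR, LF and similar characters all fall into category "Cc".
    cleaned = "".join(ch for ch in value if unicodedata.category(ch) != "Cc")
    return cleaned.strip()


# A "to" field with a stray space and tab, similar to the entry reported above.
print(repr(strip_control_chars("2019-04-06 \t")))
```

Note that this also removes tabs, so it should be applied to individual fields after splitting a line, not to the whole line.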

SvenAtWork



Joined: 26/07/2019 12:03:22
Messages: 2
Location: Germany
Offline

Thanks a lot!

Just for info:
I parsed the whole file today, and it seems that the line with alternateNameId "13635172" is the only problematic row in the file.
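
A check of this kind can be sketched as follows. It is only an illustration, not the parser actually used in this thread: the function name `find_malformed_rows` is hypothetical, and the expected count of 10 fields is an assumption based on the alternateNamesV2 column list, so adjust it if your copy differs.

```python
def find_malformed_rows(lines, expected_fields=10, sep="\t"):
    """Return (line_number, field_count) for rows with an unexpected field count."""
    bad = []
    for line_number, line in enumerate(lines, start=1):
        # Strip only the line terminator so trailing separators stay visible.
        fields = line.rstrip("\r\n").split(sep)
        if len(fields) != expected_fields:
            bad.append((line_number, len(fields)))
    return bad


# The second sample row is the entry quoted at the top of this thread
# (shown there with ";" as the separator); the first is a made-up valid row.
sample = [
    "1;2;en;Example;;;;;;",
    "13635172;11838481;iata;ISL;;;;1;2018-10-31;2019-04-06 ;",
]
print(find_malformed_rows(sample, sep=";"))
```

For the real file you would pass an open file object instead of the sample list, for example `open("alternateNamesV2.txt", encoding="utf-8", newline="\n")`, and keep the default tab separator.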
willi99



Joined: 05/08/2019 12:00:14
Messages: 4
Offline

I downloaded allCountries.zip and parsed it with Python's csv.DictReader. In names2 (the big field with all the different UTF-8 encoded names) I encountered control chars too, as it always stopped at one of the Afghanistan entries.
[Attachment: WhatsApp Image 2019-08-03 at 15.11.17.jpeg]

marc



Joined: 08/12/2005 07:39:47
Messages: 4412
Offline

the alternateNameId "13635172" should be fixed. Is there still an issue?

What is the problem with Afghanistan? Which feature and which control char?

Marc

willi99



Joined: 05/08/2019 12:00:14
Messages: 4
Offline

Hi, the error message (Python) on ID 1149361 is:

_mysql_exceptions.OperationalError: (1366, "Incorrect string value: '\\xF0\\x90\\x8C\\xB0\\xF0\\x90...' for column 'alternatenames' at row 1")

Is it possible that there is an escape code in there that changes the writing direction to right-to-left? nano also behaves strangely with this field: when the cursor passes this character, the nano display gets garbled.


willi99



Joined: 05/08/2019 12:00:14
Messages: 4
Offline

I suspect it is 4-byte UTF-8 strings that trigger the error. It is strange that nano had display problems too, though.

https://stackoverflow.com/questions/10957238/incorrect-string-value-when-trying-to-insert-utf-8-into-mysql-via-jdbc#
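
One way to test this suspicion is to look for characters above U+FFFF, which are exactly the ones that need 4 bytes in UTF-8. The sketch below is only an illustration; the function name `find_supplementary_chars` is hypothetical, and the sample string reuses the first four bytes from the error message above.

```python
def find_supplementary_chars(text):
    """Return (index, 'U+XXXX') for every character that needs 4 bytes in UTF-8."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if ord(ch) > 0xFFFF  # outside the Basic Multilingual Plane
    ]


# \xF0\x90\x8C\xB0 is taken from the MySQL error message; "abc " is filler.
sample = b"abc \xf0\x90\x8c\xb0".decode("utf-8")
print(find_supplementary_chars(sample))
```

Running this over the alternatenames field of the failing row would show whether such characters are present and where.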
willi99



Joined: 05/08/2019 12:00:14
Messages: 4
Offline

It was my error: I had to use the utf8mb4 encoding, also for the Python MySQLdb connector. It works now. The cause is that 4-byte UTF-8 characters are used, and MySQL does not handle them in utf8 but only in utf8mb4.
mariakatosvich



Joined: 24/09/2016 13:38:52
Messages: 1
Offline

If you have MySQL 5.5 or later you can change the column encoding from utf8 to utf8mb4. This encoding allows storage of characters that occupy 4 bytes in UTF-8.

You may also have to set the server property character_set_server to utf8mb4 in the MySQL configuration file.
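
As a rough sketch of what this can look like from Python: the table name `geoname`, the collation, and the helper `utf8mb4_statements` below are assumptions for illustration, not taken from this thread. Converting the whole table is one option; changing only the affected column is another.

```python
def utf8mb4_statements(table):
    """Build SQL that switches the session and one table to utf8mb4."""
    return [
        # Make the current connection send and receive utf8mb4.
        "SET NAMES utf8mb4",
        # Convert every text column of the table (hypothetical table name).
        f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET utf8mb4 "
        "COLLATE utf8mb4_unicode_ci",
    ]


# Keyword arguments for the connector so that it also uses utf8mb4, e.g.
# MySQLdb.connect(host=..., user=..., passwd=..., db=..., **CONNECT_KWARGS)
CONNECT_KWARGS = {"charset": "utf8mb4", "use_unicode": True}

for statement in utf8mb4_statements("geoname"):
    print(statement + ";")
```

Back up the table before converting it, since the conversion rewrites the stored data.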