GeoNames Home | Postal Codes | Download / Webservice | About 

GeoNames Forum
  [Search] Search   [Recent Topics] Recent Topics   [Groups] Back to home page 
[Register] Register / 
[Login] Login 
Import GeoNames dump into SQL Server  XML
Forum Index -> FAQ - frequently asked questions
Author Message
marc



Joined: 08/12/2005 07:39:47
Messages: 4439
Offline

Johannes Beck has written a how to :
http://johanneskebeck.spaces.live.com/Blog/cns!42E1F70205EC8A96!3782.entry

New 15. Jan 2009: load the GeoNames locations into SQL Server Spatial
http://blogs.msdn.com/edkatibah/archive/2009/01/13/loading-geonames-data-into-sql-server-2008-yet-another-way.aspx

There are also some threads in the GeoNames forum :
http://forum.geonames.org/gforum/posts/list/817.page
http://forum.geonames.org/gforum/posts/list/88.page
http://forum.geonames.org/gforum/posts/list/673.page

Marc

[WWW]
sateesh



Joined: 02/09/2008 11:31:11
Messages: 5
Offline

Hi, i am new to geonames dump.

But i would like to import all the city names in the world which has population greater than 15000.


Can you please help me, how can i use that dump, and how can i import that into our local database.

Waiting for response....

Thanks,
Sateesh.
bemall



Joined: 12/11/2008 20:34:31
Messages: 11
Offline

I am a little confused if you could please clarify..
Per John Becks instructions
"Johannes Beck has written a how to :
http://johanneskebeck.spaces.live.com/Blog/cns!42E1F70205EC8A96!3782.entry"

The above uses the allcountry.txt input file which does not match the geonomes table that he is using. There is a clear mismatch in the number of fields for example the modified date is lacking. The geonomes table the he is using should be for a differnet input file. Let me know if you see otherwise ..
marc



Joined: 08/12/2005 07:39:47
Messages: 4439
Offline

The modified date is there called 'moddate'.

Marc

[WWW]
bemall



Joined: 12/11/2008 20:34:31
Messages: 11
Offline

I was just using the modified date as an example. There is a general mismatch. The allcountries.txt input file contains 11 columns while the create genome table contains 19 columns...
bemall



Joined: 12/11/2008 20:34:31
Messages: 11
Offline

Respectfullhy, has anyone verified these links or is the input file newer and simply does not match the old postings ?
marc



Joined: 08/12/2005 07:39:47
Messages: 4439
Offline

Oh, I see you are trying to load the postal code dump into the toponym table. This does not work, but it is a piece of cake to modify and adapt the scripts, isn't it?
There are two dumps, you have to decide which one you want.

Best

Marc

[WWW]
bemall



Joined: 12/11/2008 20:34:31
Messages: 11
Offline

No I am not doing anything with postal codes. I am simply trying to load allCountries.txt.

None of the links or explanations seem to work due to the difference in number of columns
bemall



Joined: 12/11/2008 20:34:31
Messages: 11
Offline

"I see you are trying to load the postal code dump into the toponym table"

I am not sure what gives you this impression becasue I do mentin that I am trying to load AllCountries.txt and the columns do not match those shown in the create table script
camerony



Joined: 03/12/2008 19:50:05
Messages: 1
Offline

Here are the steps I used to import GeoNames, PostalCodes, and Alternate Names into SQL 2005:

Used the utility on http://forum.geonames.org/gforum/posts/list/817.page to convert the files to UTF-16.

Utf8ToUtf16.exe allCountries.txt allCountriesUTF16.txt
Utf8ToUtf16.exe alternateNames.txt alternateNamesUTF16.txt
Utf8ToUtf16.exe allPostal.txt allPostalUTF16.txt (*I renamed the file to allPostal from allCountries.txt)

Next I created the tables. I took the liberty of specifying nvchar for Unicoded fields and vchar for Latin based fields. Also I increased the [Place Name] field in the Postal Codes to 200 length to match the [Name] field in GeoNames, not sure why there is a mismatch.

Here is the code I used to create the tables.
Code:
 
 
 /****** GeoNames ******/
 
 CREATE TABLE [dbo].[GeoNames_tbl](
 	[lGeoNameID] [int] PRIMARY KEY,
 	[sName] [nvarchar](200),
 	[sAsciiName] [nvarchar](200),
 	[sAlternateNames] [nvarChar](4000),
 	[fLatitude] [float],
 	[fLongitude] [float],
 	[sFeatureClass] [varchar](1),
 	[sFeatureCode] [varchar](10),
 	[sCountryCode] [varchar](2),
 	[sAlternateCountryCodes] [varchar](60),
 	[sAdmin1Code] [varchar](20),
 	[sAdmin2Code] [varchar](80),
 	[sAdmin3Code] [varchar](20),
 	[sAdmin4Code] [varchar](20),
 	[lPopulation] [bigint],
 	[sElevation] [bigint],
 	[sGtopo30] [bigint],
 	[sTimezone] [varchar](50),
 	[dtModificationDate] [datetime] NULL
 )
 
 
 /****** Alternate Names ******/
 
 CREATE TABLE [dbo].[GeoAlternateNames_tbl](
 	[lAlternateNameID] [int] PRIMARY KEY,
 	[lGeoNameID] [int],
 	[sISOLanguage] [nvarchar](7),
 	[sAlternateName] [nvarchar](200),
 	[sIsPreferredName] [char](1),
 	[sIsShortName] [char](1) 
 )
 
 
 /****** Postal Codes ******/
 
 CREATE TABLE [dbo].[GeoPostal_tbl](
 	[sCountryCode] [varchar](2),
 	[sPostalCode] [varchar](10),
 	[sPlaceName] [nvarchar](200),
 	[sAdminName1] [nvarchar](100),
 	[sAdminCode1] [varchar](20),
 	[sAdminName2] [nvarchar](100),
 	[sAdminCode2] [varchar](20),
 	[sAdminName3] [nvarchar](100),
 	[fLatitude] [float],
 	[fLongitude] [float],
 	[iAccuracy] [smallint] NULL
 )
 
 


Using "SQL Server Import and Export Wizard" do the following.
Specify Flat File Source
Choose the source file. i.e. allCountriesUTF16.txt (You should see that Unicode is automatically checked, if the UTF8 to UTF 16 conversion was successful)
In the Columns section can the Column Delimiter to Tab and click Refresh (You should see that all the columns look correct without any strange 'tab' characters)
In the Advanced section (the most important) you need to specify the specifications for each field.
Use Suggest Types... option and change the number of rows to 1000 and click OK. (this is an optional step but it'll save some time)
Next use the table specifications to define each column. For any integer field, I specified "eight-byte signed integer" . Also "Column 10" and the 11th field in GeoNames you need to specify Unicode String and length 20 as per the specifications, since "Suggest Types" defines it as a single-byte integer.
Once you have done all this, you can specify the destination, make sure you specify the destination table as the one you created since it will default to a new table using the file name.
Hopefully, the data import will finish successfully. Otherwise you may have to adjust you source column specifications (I missed some fields the first time I did this).

Good luck, I hope this helps


P.S. I'd have these pages open while you do the column specifications.
Link to GeoNames/Alternate Names specifications
http://download.geonames.org/export/dump/
Link to PostCodes specifications
http://download.geonames.org/export/zip/
rainmanjam



Joined: 15/01/2008 01:52:36
Messages: 2
Offline

I am trying to do this in SQL 2008. Is there anything differently I should be doing. I have been having issues.
shelto



Joined: 19/01/2009 03:54:24
Messages: 1
Offline

The program converted fine but I still had issues with the Import wizard. I finally go this to work using this table script and the BULK INSERT command::

CREATE TABLE GeoNames(
geonameid int NOT NULL,
name nvarchar(200) NULL,
asciiname nvarchar(200) NULL,
alternatenames nvarchar(max) NULL,
latitude float NULL,
longitude float NULL,
feature_class char(2) NULL,
feature_code nvarchar(10) NULL,
country_code char(3) NULL,
cc2 char(60) NULL,
admin1_code nvarchar(20) NULL,
admin2_code nvarchar(80) NULL,
admin3_code nvarchar(20) NULL,
admin4_code nvarchar(20) NULL,
population bigint NULL,
elevation int NULL,
gtopo30 int NULL,
timezone char(31) NULL,
modification_date date NULL
)
GO



BULK
INSERT GeoNames
FROM 'C:\temp\outputFile.txt'
WITH(
DATAFILETYPE = 'widechar',
FIELDTERMINATOR = '\t',
ROWTERMINATOR = '\n'
)
GO


Hope this helps!!
rickrat



Joined: 23/10/2010 19:55:54
Messages: 2
Offline

I'd like to post an update to this issue. You can use the
Code:
 BULK 
 INSERT GeoNames 
 FROM 'C:\temp\outputFile.txt' 
 WITH( 
 DATAFILETYPE = 'widechar', 
 FIELDTERMINATOR = '\t', 
 ROWTERMINATOR = '\n' 
 ) 
 GO 
 

to import the GeoNames and the GeoAlternateNames tables, but... to import the GeoPostal you have to use the Import Data...

Also, I modified the table structure as the geopostal table was missing a field, and the lat/lon need to be decimal(9, 6).

Code:
 /****** GeoNames ******/
  
  CREATE TABLE [dbo].[GeoNames](
  	[GeoNameID] [int] PRIMARY KEY,
  	[Name] [nvarchar](200),
  	[AsciiName] [nvarchar](200),
  	[AlternateNames] [nvarChar](4000),
 	[Latitude] [decimal](9, 6) NOT NULL
 	[Longitude] [decimal](9, 6) NOT NULL,
  	[FeatureClass] [varchar](1),
  	[FeatureCode] [varchar](10),
  	[CountryCode] [varchar](2),
  	[AlternateCountryCodes] [varchar](60),
  	[Admin1Code] [varchar](20),
  	[Admin2Code] [varchar](80),
  	[Admin3Code] [varchar](20),
  	[Admin4Code] [varchar](20),
  	[Population] [bigint],
  	[Elevation] [bigint],
  	[Gtopo30] [bigint],
  	[Timezone] [varchar](50),
  	[ModificationDate] [datetime] NULL
  )
  
  
  /****** Alternate Names ******/
  
  CREATE TABLE [dbo].[GeoAlternateNames](
  	[AlternateNameID] [int] PRIMARY KEY,
  	[GeoNameID] [int],
  	[ISOLanguage] [nvarchar](7),
  	[AlternateName] [nvarchar](200),
  	[IsPreferredName] [char](1),
  	[IsShortName] [char](1) 
  )
  
  
  /****** Postal Codes ******/
  
  CREATE TABLE [dbo].[GeoPostal](
  	[CountryCode] [varchar](2),
  	[PostalCode] [varchar](10),
  	[PlaceName] [nvarchar](200),
  	[AdminName1] [nvarchar](100),
  	[AdminCode1] [varchar](20),
  	[AdminName2] [nvarchar](100),
  	[AdminCode2] [varchar](20),
  	[AdminName3] [nvarchar](100),
  	[AdminCode3] [varchar](20),
 	[Latitude] [decimal](9, 6) NOT NULL
 	[Longitude] [decimal](9, 6) NOT NULL,
  	[Accuracy] [smallint] NULL
  )
 
rickrat



Joined: 23/10/2010 19:55:54
Messages: 2
Offline

I've updated the accuracy of the US zip codes, to be the exact north/south point (to 6 decimal places) on the globe of the centroid or "middle" of the US Postal Code.

This is a sql script for sql server 2005-2008.

http://www.compdj.com/dl.aspx?f=Temp/UpdateGeoPostalUSAccuracy.zip

Comes with no warranties, etc.

Enjoy!

Rick
shigarr



Joined: 05/06/2012 13:07:27
Messages: 1
Offline

If anyone of you guys ever come across the following SQL Server error while using BULK INSERT to load data into tables:

"Bulk load: DataFileType was incorrectly specified as widechar. DataFileType will be assumed to be char because the data file does not have a Unicode signature."

and the steps available to covert a UTF-8 file type to UTF-16 file type as listed in one of the blogs posted by mark right in the begining weren't helpful enough to resolve the issue THEN following blog might be useful to you.

http://aliiraza.wordpress.com/?p=103&preview=true

few simple steps listed to convert UTF-8 to UTF-16 using SQL Server Management Studio.

hope it helps!
antiochIst



Joined: 02/12/2012 04:13:01
Messages: 1
Offline

getting Geo locations into db is a little tricky here is a great tutorial:

http://taylor.woodstitch.com/php/php-mysql-best-solutions-for-finding-points-in-a-polygon-from-a-database/
ethan1701



Joined: 02/02/2017 10:18:50
Messages: 1
Offline

As of SQL 2014, UTF-8 is supported, and it's no longer needed to convert to UTF-16.
I successfully imported the file via a bulk insert task in SSIS

-Ethan
jonilviv



Joined: 20/11/2017 14:00:58
Messages: 1
Offline

Hi, everyone!
I just created project, that helps automaticaly import TXT files to MS SQL Server. https://github.com/jonilviv/GeoNames.org
 
Forum Index -> FAQ - frequently asked questions
Go to:   
Powered by JForum 2.1.5 © JForum Team