GeoNames Home | Postal Codes | Download / Webservice | About 

GeoNames Forum
  [Search] Search   [Recent Topics] Recent Topics   [Groups] Back to home page 
[Register] Register / 
[Login] Login 
Proximity Searches (MySQL)  XML
Forum Index -> FAQ - frequently asked questions
Author Message
rcombis



Joined: 08/12/2007 19:50:53
Messages: 2
Offline

I have been working on get a proximity search working on the geonames data. Following a bunch of posts, I got it working. The query time ranges from very fast (10ms) to 7-10 seconds.

I am using MySQL. I have indexes on GeoNameID, name, Latitude and Longitude.

I am using a box approach.

My query looks like this:

Select geonameid, admin1_code, admin2_code, name, latitude, longitude ,3963.191 * ACOS((SIN(radians(?latA))*SIN(radians(latitude))) +(COS(radians(?latA))*cos(radians(latitude))*COS(radians(longitude)-radians(?longA)))) AS distance

FROM geonames where (Latitude >= ?latMinRange AND Latitude <= ?latMaxRange AND Longitude >= ?longMinRange AND Longitude <= ?longMaxRange)

AND 3963.191 * ACOS((SIN(radians(?latA))*SIN(radians(latitude))) + (COS(radians(?latA))*cos(radians(latitude))*COS(radians(longitude)-radians(?longA)))) <= 1.5 order by distance;

I am using the box approach with the latMinRange, latMaxRange, longMinRange, longMaxRange:

latRange = 1.5 (distance I am searching) / 69.0499
longRange = 1.5 / (69.0499 * COS(radians(?latA))

latMinRange = latA - latRange
latMaxRange = latA + latRange
longMinRange = longA - longRange
longMaxRange = longA + longRange

My question is how can I optimize this search query?

Some of my thoughts:

1. Investigate MySQL spatial data types and indexes (I have found limited docs on how to make this work with lat/long)

2. I have began trying subqueries, but they don't seem to make much difference. (I have tried just returning the box data itself without narrowing it down)

3. Looking at allowing MySQL to use more memory and caching more. (obviously after the query caches it returns much faster)

What else can I try? (There are approx 2 million rows in geonames table)

My dev server is 2.0GHZ P4 with 512MB running windows 2000 server.

Thanks,
-Rob
rcombis



Joined: 08/12/2007 19:50:53
Messages: 2
Offline

Here is what I did:

Create a column in the main geonames table called gPoint with POINT as the data Type. Allow nulls at first.

Populate the field like so:
UPDATE geonames3
SET gPoint= PointFromText(CONCAT('POINT(',longitude,' ',latitude,')'));

Set the column to not allow nulls. (Index won't be created if you don't do this)

Create a spatial index on the gPoint column. (This takes a long time, I left the office after it ran for 30 minutes and it was still going)

The where statement becomes:
MBRContains(GeomFromText('POLYGON((" + dMinLong.ToString + " " + dMinLat.ToString + ", " + dMaxLong.ToString + " " + dMinLat.ToString + ", " + dMaxLong.ToString + " " + dMaxLat.ToString + "," + dMinLong.ToString + " " + dMaxLat.ToString + "," + dMinLong.ToString + " " + dMinLat.ToString + "))'),gPoint)

The searches are now lightening fast with caching turned off.

My next step will be to add in the precision search to narrow down the results.

I may put the distance calculation in a MySQL function.

Here are some good links:
http://groups.google.com/group/kml-support/web/getting-started-with-spatial-mysql

http://www.webdev.cygad.net/docs/556/docs/mysql/manual/en/manual_Spatial_extensions_in_MySQL.html#Relations_on_geometry_MBR

I am using MySQL version 5.0.45 community edition.

-Rob

marc



Joined: 08/12/2005 07:39:47
Messages: 4439
Offline

Thanks Rob, for sharing your experience.

Cheers

Marc

[WWW]
bytte



Joined: 07/01/2008 17:56:11
Messages: 15
Offline

Thanks a lot rcombis. If you have any more information/experience on this matter, would be great if you could keep us posted.
GISWannabe



Joined: 02/02/2008 16:25:04
Messages: 1
Offline

this is brilliant folks, exactly what I was looking for. I had come to the same conclusion using a method similar to yours Rob and have been experimenting with GIS.

I'm getting ahead of myself a bit but could we specify an area made up of lots of points to define regions such as states or counties and then we can say something along the lines of WHERE 'point' IN 'region'?

actualrandy



Joined: 17/05/2007 01:48:24
Messages: 7
Offline

I have been using the Web service and now I'm switching to doing my own reverse geocoding, mostly for speed. I have an issue to discuss that might have a natural explanation but also might represent some bad data.

Specifically, I downloaded the file AllCountries.txt and noticed some duplicate coordinates

There are a lot of entries with duplicate lat/long pairs. Some of these end up in different political entries, such as (to pick an arbitrary example) "Thacker Slough", which is listed twice, once in Nevada and once in Idaho.

Other entries have duplicate lat/long but differing names, such as Muret and Cirstanesti, each sharing the coordinates 45.05, 24.05

I realize there could be very good reasons why the same lat/long appears twice, but I wanted to check to ensure that there aren't any errors in the data before I start my own reverse geocoding.
marc



Joined: 08/12/2005 07:39:47
Messages: 4439
Offline

The same coordinate for two different places is normally a rounding problem. Some data sources round their coordinate to 30''

Marc

[WWW]
chris_LE



Joined: 26/03/2008 14:16:16
Messages: 1
Location: Leipzig/Germany
Offline

rcombis wrote:
Here is what I did:

Create a column in the main geonames table called gPoint with POINT as the data Type. Allow nulls at first.

Populate the field like so:
UPDATE geonames3
SET gPoint= PointFromText(CONCAT('POINT(',longitude,' ',latitude,')'));

Set the column to not allow nulls. (Index won't be created if you don't do this)

Create a spatial index on the gPoint column. (This takes a long time, I left the office after it ran for 30 minutes and it was still going)

The where statement becomes:
MBRContains(GeomFromText('POLYGON((" + dMinLong.ToString + " " + dMinLat.ToString + ", " + dMaxLong.ToString + " " + dMinLat.ToString + ", " + dMaxLong.ToString + " " + dMaxLat.ToString + "," + dMinLong.ToString + " " + dMaxLat.ToString + "," + dMinLong.ToString + " " + dMinLat.ToString + ")'),gPoint)

The searches are now lightening fast with caching turned off.

My next step will be to add in the precision search to narrow down the results.

I may put the distance calculation in a MySQL function.

Here are some good links:
http://groups.google.com/group/kml-support/web/getting-started-with-spatial-mysql

http://www.webdev.cygad.net/docs/556/docs/mysql/manual/en/manual_Spatial_extensions_in_MySQL.html#Relations_on_geometry_MBR

I am using MySQL version 5.0.45 community edition.

-Rob

 



Correct me if I'm wrong, but I must *STRONGLY* recommend not to do this as the MYSQL spatial implementation is not compatibable with a geo-centric dataset:

In MySQL, the SRID value is just an integer associated with the geometry value. All calculations are done assuming Euclidean (planar) geometry. ...

...A Polygon is a planar Surface representing a multisided geometry. It is defined by a single exterior boundary and zero or more interior boundaries, where each interior boundary defines a hole in the Polygon.
 


see:

http://forums.mysql.com/read.php?23,40909,43079

AND

http://dev.mysql.com/doc/mysql/search.php?version=5.0&q=planar&from=%2Fdoc%2Frefman%2F5.0%2Fen%2Fgeneral-geometry-property-functions.html&lang=en

I have tested the spatial extension myself with the data provided by geonames-db and the result was the following:

While calculations were indeed "lightning fast" using spatial indexes the calculation was off by around 4km in short range(<100km) distances and off by around 300-400km in long-range distances (>5000km).

I therfore can only recommend to read the MySQL-documentation again, and to use the spheric cos/sin calculations (great circle distance formula) or to use another RDBMS

But that's just my two cents

CVH

giorgio79



Joined: 21/04/2008 17:05:33
Messages: 28
Offline

If you wondered how to add a POINT column, here it is

http://dev.mysql.com/doc/refman/5.0/en/creating-spatial-columns.html
giorgio79



Joined: 21/04/2008 17:05:33
Messages: 28
Offline

When I try and copy the original example query into phpmyadmin sql, I am getting errors

Code:
 Select geonameid, admin1_code, admin2_code, name, latitude, longitude ,3963.191 * ACOS((SIN(radians(?latA))*SIN(radians(latitude))) +(COS(radians(?latA))*cos(radians(latitude))*COS(radians(longitude)-radians(?longA)))) AS distance
 
 FROM geoname where (Latitude >= ?latMinRange AND Latitude <= ?latMaxRange AND Longitude >= ?longMinRange AND Longitude <= ?longMaxRange)
 
 AND 3963.191 * ACOS((SIN(radians(?latA))*SIN(radians(latitude))) + (COS(radians(?latA))*cos(radians(latitude))*COS(radians(longitude)-radians(?longA)))) <= 1.5 order by distance


Anyone could tell me what is wrong here?

PS: I realized the ? values are variables which I have to substitute
vadimberman



Joined: 01/07/2008 02:51:43
Messages: 3
Location: Melbourne, Australia
Offline

Works like charm, but a warning already voiced: use it only if you do searches within a neighbourhood, or a suburb, or a city.

Anything bigger than that already doesn't suit the assumption that the earth is flat .
[WWW]
giorgio79



Joined: 21/04/2008 17:05:33
Messages: 28
Offline

oh jeah baby
http://www.mysqlfulltextsearch.com/geo_search.pdf
chipster



Joined: 28/11/2008 15:31:56
Messages: 2
Offline

Hello everyone,

i try de resolve this problem : get the nearest loc ( or with scope ) in meter from a point with LAT / LONG.

I tried to follow your example ( Gpoint, spatial index,...) but it dones nothing ( seems distance procedure doesnt work...). Here is what i have :


SQL QUERY :

SET @g = PointFromText(CONCAT('POINT(',47.8500000,' ',1.7500000,')'));

select distance(gPoint,@g) as scope, asciiname
from geonames
HAVING scope < 5000
LIMIT 0 , 1

FUNCTION :

CREATE FUNCTION distance(a POINT, b POINT) RETURNS double
DETERMINISTIC
RETURN ( ROUND(GLength(LineStringFromWKB(LineString(AsBinary(a),
AsBinary(b))))) )
chipster



Joined: 28/11/2008 15:31:56
Messages: 2
Offline

here is what it works for me ( distance in KM ) :

SELECT (
(
ACOS( SIN( 47.85 * PI( ) /180 ) * SIN( latitude * PI( ) /180 ) + COS( 47.85 * PI( ) /180 ) * COS( latitude * PI( ) /180 ) * COS( (
1.75 - longitude
) * PI( ) /180 ) ) *180 / PI( )
) *60 *1.75
) AS distance, asciiname
FROM geonames
HAVING distance <= '2'
ORDER BY distance ASC
LIMIT 0 , 1

-----------------------------------
Problem is that it takes...23.6905 sec...i can do way better with parsing files by prog
scaredOfSql



Joined: 20/07/2016 06:33:05
Messages: 1
Offline

I know this is an old topic, but can someone tell me how the value 1.75 is derived in the last part of the equation (*60*1.75). Originally I thought it was the longitude value, but after reading http://stackoverflow.com/a/389251/691053 I think the value should probably be 1.851999999962112 (1.1507794480*1.609344=1.851999999962112).
thuha123



Joined: 12/08/2016 11:29:58
Messages: 1
Offline

I know this is an old topic, but can someone tell me how the value 1.75 is derived in the last part of the equation (*60*1.75). Originally I thought it was the longitude value, but after reading http://stackoverflow.com/a/389251/691053 I think the value should probably be 1.851999999962112 (1.1507794480*1.609344=1.851999999962112).
[WWW]
 
Forum Index -> FAQ - frequently asked questions
Go to:   
Powered by JForum 2.1.5 © JForum Team