Author |
Message |
08/12/2007 20:14:04
|
rcombis
Joined: 08/12/2007 19:50:53
Messages: 2
Offline
|
I have been working on get a proximity search working on the geonames data. Following a bunch of posts, I got it working. The query time ranges from very fast (10ms) to 7-10 seconds.
I am using MySQL. I have indexes on GeoNameID, name, Latitude and Longitude.
I am using a box approach.
My query looks like this:
Select geonameid, admin1_code, admin2_code, name, latitude, longitude ,3963.191 * ACOS((SIN(radians(?latA))*SIN(radians(latitude))) +(COS(radians(?latA))*cos(radians(latitude))*COS(radians(longitude)-radians(?longA)))) AS distance
FROM geonames where (Latitude >= ?latMinRange AND Latitude <= ?latMaxRange AND Longitude >= ?longMinRange AND Longitude <= ?longMaxRange)
AND 3963.191 * ACOS((SIN(radians(?latA))*SIN(radians(latitude))) + (COS(radians(?latA))*cos(radians(latitude))*COS(radians(longitude)-radians(?longA)))) <= 1.5 order by distance;
I am using the box approach with the latMinRange, latMaxRange, longMinRange, longMaxRange:
latRange = 1.5 (distance I am searching) / 69.0499
longRange = 1.5 / (69.0499 * COS(radians(?latA))
latMinRange = latA - latRange
latMaxRange = latA + latRange
longMinRange = longA - longRange
longMaxRange = longA + longRange
My question is how can I optimize this search query?
Some of my thoughts:
1. Investigate MySQL spatial data types and indexes (I have found limited docs on how to make this work with lat/long)
2. I have began trying subqueries, but they don't seem to make much difference. (I have tried just returning the box data itself without narrowing it down)
3. Looking at allowing MySQL to use more memory and caching more. (obviously after the query caches it returns much faster)
What else can I try? (There are approx 2 million rows in geonames table)
My dev server is 2.0GHZ P4 with 512MB running windows 2000 server.
Thanks,
-Rob
|
|
|
09/12/2007 20:30:44
|
rcombis
Joined: 08/12/2007 19:50:53
Messages: 2
Offline
|
Here is what I did:
Create a column in the main geonames table called gPoint with POINT as the data Type. Allow nulls at first.
Populate the field like so:
UPDATE geonames3
SET gPoint= PointFromText(CONCAT('POINT(',longitude,' ',latitude,')'));
Set the column to not allow nulls. (Index won't be created if you don't do this)
Create a spatial index on the gPoint column. (This takes a long time, I left the office after it ran for 30 minutes and it was still going)
The where statement becomes:
MBRContains(GeomFromText('POLYGON((" + dMinLong.ToString + " " + dMinLat.ToString + ", " + dMaxLong.ToString + " " + dMinLat.ToString + ", " + dMaxLong.ToString + " " + dMaxLat.ToString + "," + dMinLong.ToString + " " + dMaxLat.ToString + "," + dMinLong.ToString + " " + dMinLat.ToString + "))'),gPoint)
The searches are now lightening fast with caching turned off.
My next step will be to add in the precision search to narrow down the results.
I may put the distance calculation in a MySQL function.
Here are some good links:
http://groups.google.com/group/kml-support/web/getting-started-with-spatial-mysql
http://www.webdev.cygad.net/docs/556/docs/mysql/manual/en/manual_Spatial_extensions_in_MySQL.html#Relations_on_geometry_MBR
I am using MySQL version 5.0.45 community edition.
-Rob
|
|
|
11/12/2007 13:18:20
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4439
Offline
|
Thanks Rob, for sharing your experience.
Cheers
Marc
|
|
|
|
07/01/2008 17:59:35
|
bytte
Joined: 07/01/2008 17:56:11
Messages: 15
Offline
|
Thanks a lot rcombis. If you have any more information/experience on this matter, would be great if you could keep us posted.
|
|
|
02/02/2008 16:29:16
|
GISWannabe
Joined: 02/02/2008 16:25:04
Messages: 1
Offline
|
this is brilliant folks, exactly what I was looking for. I had come to the same conclusion using a method similar to yours Rob and have been experimenting with GIS.
I'm getting ahead of myself a bit but could we specify an area made up of lots of points to define regions such as states or counties and then we can say something along the lines of WHERE 'point' IN 'region'?
|
|
|
28/02/2008 23:43:40
|
actualrandy
Joined: 17/05/2007 01:48:24
Messages: 7
Offline
|
I have been using the Web service and now I'm switching to doing my own reverse geocoding, mostly for speed. I have an issue to discuss that might have a natural explanation but also might represent some bad data.
Specifically, I downloaded the file AllCountries.txt and noticed some duplicate coordinates
There are a lot of entries with duplicate lat/long pairs. Some of these end up in different political entries, such as (to pick an arbitrary example) "Thacker Slough", which is listed twice, once in Nevada and once in Idaho.
Other entries have duplicate lat/long but differing names, such as Muret and Cirstanesti, each sharing the coordinates 45.05, 24.05
I realize there could be very good reasons why the same lat/long appears twice, but I wanted to check to ensure that there aren't any errors in the data before I start my own reverse geocoding.
|
|
|
02/03/2008 17:06:19
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4439
Offline
|
The same coordinate for two different places is normally a rounding problem. Some data sources round their coordinate to 30''
Marc
|
|
|
|
26/03/2008 14:34:26
|
chris_LE
Joined: 26/03/2008 14:16:16
Messages: 1
Location: Leipzig/Germany
Offline
|
rcombis wrote:
Here is what I did:
Create a column in the main geonames table called gPoint with POINT as the data Type. Allow nulls at first.
Populate the field like so:
UPDATE geonames3
SET gPoint= PointFromText(CONCAT('POINT(',longitude,' ',latitude,')'));
Set the column to not allow nulls. (Index won't be created if you don't do this)
Create a spatial index on the gPoint column. (This takes a long time, I left the office after it ran for 30 minutes and it was still going)
The where statement becomes:
MBRContains(GeomFromText('POLYGON((" + dMinLong.ToString + " " + dMinLat.ToString + ", " + dMaxLong.ToString + " " + dMinLat.ToString + ", " + dMaxLong.ToString + " " + dMaxLat.ToString + "," + dMinLong.ToString + " " + dMaxLat.ToString + "," + dMinLong.ToString + " " + dMinLat.ToString + ")'),gPoint)
The searches are now lightening fast with caching turned off.
My next step will be to add in the precision search to narrow down the results.
I may put the distance calculation in a MySQL function.
Here are some good links:
http://groups.google.com/group/kml-support/web/getting-started-with-spatial-mysql
http://www.webdev.cygad.net/docs/556/docs/mysql/manual/en/manual_Spatial_extensions_in_MySQL.html#Relations_on_geometry_MBR
I am using MySQL version 5.0.45 community edition.
-Rob
Correct me if I'm wrong, but I must *STRONGLY* recommend not to do this as the MYSQL spatial implementation is not compatibable with a geo-centric dataset:
In MySQL, the SRID value is just an integer associated with the geometry value. All calculations are done assuming Euclidean (planar) geometry. ...
...A Polygon is a planar Surface representing a multisided geometry. It is defined by a single exterior boundary and zero or more interior boundaries, where each interior boundary defines a hole in the Polygon.
see:
http://forums.mysql.com/read.php?23,40909,43079
AND
http://dev.mysql.com/doc/mysql/search.php?version=5.0&q=planar&from=%2Fdoc%2Frefman%2F5.0%2Fen%2Fgeneral-geometry-property-functions.html&lang=en
I have tested the spatial extension myself with the data provided by geonames-db and the result was the following:
While calculations were indeed "lightning fast" using spatial indexes the calculation was off by around 4km in short range(<100km) distances and off by around 300-400km in long-range distances (>5000km).
I therfore can only recommend to read the MySQL-documentation again, and to use the spheric cos/sin calculations (great circle distance formula) or to use another RDBMS
But that's just my two cents
CVH
|
|
|
16/05/2008 15:46:53
|
giorgio79
Joined: 21/04/2008 17:05:33
Messages: 28
Offline
|
If you wondered how to add a POINT column, here it is
http://dev.mysql.com/doc/refman/5.0/en/creating-spatial-columns.html
|
|
|
16/05/2008 17:26:42
|
giorgio79
Joined: 21/04/2008 17:05:33
Messages: 28
Offline
|
When I try and copy the original example query into phpmyadmin sql, I am getting errors
Code:
Select geonameid, admin1_code, admin2_code, name, latitude, longitude ,3963.191 * ACOS((SIN(radians(?latA))*SIN(radians(latitude))) +(COS(radians(?latA))*cos(radians(latitude))*COS(radians(longitude)-radians(?longA)))) AS distance
FROM geoname where (Latitude >= ?latMinRange AND Latitude <= ?latMaxRange AND Longitude >= ?longMinRange AND Longitude <= ?longMaxRange)
AND 3963.191 * ACOS((SIN(radians(?latA))*SIN(radians(latitude))) + (COS(radians(?latA))*cos(radians(latitude))*COS(radians(longitude)-radians(?longA)))) <= 1.5 order by distance
Anyone could tell me what is wrong here?
PS: I realized the ? values are variables which I have to substitute
|
|
|
06/07/2008 05:38:51
|
vadimberman
Joined: 01/07/2008 02:51:43
Messages: 3
Location: Melbourne, Australia
Offline
|
Works like charm, but a warning already voiced: use it only if you do searches within a neighbourhood, or a suburb, or a city.
Anything bigger than that already doesn't suit the assumption that the earth is flat .
|
|
|
27/08/2008 22:07:42
|
giorgio79
Joined: 21/04/2008 17:05:33
Messages: 28
Offline
|
oh jeah baby
http://www.mysqlfulltextsearch.com/geo_search.pdf
|
|
|
30/11/2008 23:13:13
|
chipster
Joined: 28/11/2008 15:31:56
Messages: 2
Offline
|
Hello everyone,
i try de resolve this problem : get the nearest loc ( or with scope ) in meter from a point with LAT / LONG.
I tried to follow your example ( Gpoint, spatial index,...) but it dones nothing ( seems distance procedure doesnt work...). Here is what i have :
SQL QUERY :
SET @g = PointFromText(CONCAT('POINT(',47.8500000,' ',1.7500000,')'));
select distance(gPoint,@g) as scope, asciiname
from geonames
HAVING scope < 5000
LIMIT 0 , 1
FUNCTION :
CREATE FUNCTION distance(a POINT, b POINT) RETURNS double
DETERMINISTIC
RETURN ( ROUND(GLength(LineStringFromWKB(LineString(AsBinary(a),
AsBinary(b))))) )
|
|
|
30/11/2008 23:17:35
|
chipster
Joined: 28/11/2008 15:31:56
Messages: 2
Offline
|
here is what it works for me ( distance in KM ) :
SELECT (
(
ACOS( SIN( 47.85 * PI( ) /180 ) * SIN( latitude * PI( ) /180 ) + COS( 47.85 * PI( ) /180 ) * COS( latitude * PI( ) /180 ) * COS( (
1.75 - longitude
) * PI( ) /180 ) ) *180 / PI( )
) *60 *1.75
) AS distance, asciiname
FROM geonames
HAVING distance <= '2'
ORDER BY distance ASC
LIMIT 0 , 1
-----------------------------------
Problem is that it takes...23.6905 sec...i can do way better with parsing files by prog
|
|
|
29/07/2016 12:49:07
|
scaredOfSql
Joined: 20/07/2016 06:33:05
Messages: 1
Offline
|
I know this is an old topic, but can someone tell me how the value 1.75 is derived in the last part of the equation (*60*1.75). Originally I thought it was the longitude value, but after reading http://stackoverflow.com/a/389251/691053 I think the value should probably be 1.851999999962112 (1.1507794480*1.609344=1.851999999962112).
|
|
|
27/08/2016 10:55:04
|
thuha123
Joined: 12/08/2016 11:29:58
Messages: 1
Offline
|
I know this is an old topic, but can someone tell me how the value 1.75 is derived in the last part of the equation (*60*1.75). Originally I thought it was the longitude value, but after reading http://stackoverflow.com/a/389251/691053 I think the value should probably be 1.851999999962112 (1.1507794480*1.609344=1.851999999962112).
|
|
|
|