| Author |
Message |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 20/07/2007 00:41:25
|
yougodo
Joined: 20/07/2007 00:25:28
Messages: 11
Offline
|
Hi, firstly - cheers for such an excellent source of geo data!
I'm wondering if any of you geo data gurus can help me out?
I've downloaded the data and am doing proximity searches on it, finding all places within 100 kms for example of a lon lat. The only trouble is that this data takes about 2 mins to return results against a single lon lat. I'm wondering if anyone has any techniques to speed up the search results? Maybe some SQL WHERE clause I can apply to the lon lats in order to arrive at a subset of data to perform calculations on?
My simplified SQL (MS SQL Server 2005) is below:
Code:
DECLARE @GeoNameID int
SET @GeoNameID = 2656031
DECLARE @EarthRadius decimal(10,3)
SET @EarthRadius = 6378.137 -- km
DECLARE @MaxDistance int
SET @MaxDistance= 100
DECLARE @Latitude decimal(23,20), @Longitude Decimal(23,20)
SELECT @Latitude = Latitude, @Longitude = Longitude FROM GeoName WHERE GeoNameID = @GeoNameID
SELECT geo.GeoNameID,
ROUND(@EarthRadius * ACOS((SIN(RADIANS(@Latitude)) * SIN(RADIANS(geo.Latitude))) +
(COS(RADIANS(@Latitude)) * COS(RADIANS(geo.Latitude)) *
COS(RADIANS(geo.Longitude) - RADIANS(@Longitude)))),0) AS Distance
FROM GeoName AS geo
WHERE ROUND(@EarthRadius * ACOS((SIN(RADIANS(@Latitude)) * SIN(RADIANS(geo.Latitude))) +
(COS(RADIANS(@Latitude)) * COS(RADIANS(geo.Latitude)) *
COS(RADIANS(geo.Longitude) - RADIANS(@Longitude)))),0) <= @MaxDistance
I'm thinking the best way to increase query speed would be to reduce the number of calculations I need to perform by first defining a search box from which I can do a WHERE clause using lon & lat. If I'm looking for all matches within 100 kms I'd first calculate the latitude 100 kms north and south, and the longitude 100 kms east and west. Then the more complex calculation shown above would then be applied to the subset of geo records found within these boundaries. With an index applied to the longitude and latitude columns I'm guessing significant speed increases would be experienced as this should prevent me needing to perform calculations on every single location listed.
However I'm not sure how to calculate the differences in longitude and latitude from a given point for a certain distance. Can anyone help me out here and perhaps confirm my theory is correct?
Any pointers / suggestions would be greatly appreciated.
Cheers,
Gavin.
|
Gavin Harriss
Portfolio: www.gavinharriss.com |
|
|
 |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 20/07/2007 19:17:10
|
marc
Joined: 08/12/2005 07:39:47
Messages: 2183
Offline
|
Hi Gavin
I fear SQL Server does not have inbuilt spatial support.
There is a spatial extension for mssql 2005 :
http://www.codeplex.com/MsSqlSpatial
And a microsoft paper :
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/TblValFuncSQL.asp
The next version of mssql server will have geospatial support :
http://industry.slashgeo.org/article.pl?sid=07/05/11/1329253
In any case you probably can do without geospatial functions and use
simple indices on the lat and lng columns. Use a rectangular bounding
box for your search large enough to include all your records and filter
out the rest based on exact distance. As a rule of thumb you can assume
one degree is around 111km (=40.000/360).
Hope this helps
Marc
|
 |
|
|
 |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 22/07/2007 22:37:40
|
yougodo
Joined: 20/07/2007 00:25:28
Messages: 11
Offline
|
Hi Marc, thanks for your advice - very helpful and very much appreciated.
|
Gavin Harriss
Portfolio: www.gavinharriss.com |
|
|
 |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 23/07/2007 00:56:10
|
yougodo
Joined: 20/07/2007 00:25:28
Messages: 11
Offline
|
Hi Marc, my geo data queries are now super fast! However I do notice that significantly less records are found when I apply my rough lan long box criteria to the search. Just wondering if you have any idea why this might be? Am I making some obvious mistake?
My SQL with the lan long box applied:
Code:
DECLARE @GeoNameID int
SET @GeoNameID = 2656031
DECLARE @MaxDistance int
SET @MaxDistance = 50 -- km
DECLARE @EarthRadius decimal(10,3)
SET @EarthRadius = 6378.137 -- km
DECLARE @EarthCircumference int
SET @EarthCircumference = 40076 -- km
-- Degrees difference for distance is only an approximisation to speed up search via WHERE clause!
DECLARE @DegDiff float
SET @DegDiff = CAST(@MaxDistance AS float) / CAST((@EarthCircumference / 360) AS float)
-- Get latitude and longitude
DECLARE @Latitude decimal(23,20), @Longitude decimal(23,20)
SELECT @Latitude = Latitude, @Longitude = Longitude FROM yougodo_GeoName WHERE GeoNameID = @GeoNameID
DECLARE @MinLatitude decimal(23,20), @MaxLatitude decimal(23,20), @MinLongitude decimal(23,20), @MaxLongitude decimal(23,20)
SET @MinLatitude = @Latitude - @DegDiff
SET @MaxLatitude = @Latitude + @DegDiff
SET @MinLongitude = @Longitude - @DegDiff
SET @MaxLongitude = @Longitude + @DegDiff
-- Search
SELECT geo.GeoNameID,
ROUND(@EarthRadius * ACOS((SIN(RADIANS(@Latitude)) * SIN(RADIANS(geo.Latitude))) +
(COS(RADIANS(@Latitude)) * COS(RADIANS(geo.Latitude)) *
COS(RADIANS(geo.Longitude) - RADIANS(@Longitude)))),0) AS Distance
FROM yougodo_GeoName AS geo
INNER JOIN yougodo_GeoFeature AS ftr ON ISNULL(geo.FeatureClass, '') = ftr.FeatureClass
AND ISNULL(geo.FeatureCode, '') = ftr.FeatureCode
WHERE geo.GeoNameID <> @GeoNameID
AND (geo.Latitude >= @MinLatitude AND geo.Latitude <= @MaxLatitude AND geo.Longitude >= @MinLongitude AND geo.Longitude <= @MaxLongitude)
AND ROUND(@EarthRadius * ACOS((SIN(RADIANS(@Latitude)) * SIN(RADIANS(geo.Latitude))) +
(COS(RADIANS(@Latitude)) * COS(RADIANS(geo.Latitude)) *
COS(RADIANS(geo.Longitude) - RADIANS(@Longitude)))),0) <= @MaxDistance
my SQL without the lan lon box:
Code:
DECLARE @GeoNameID int
SET @GeoNameID = 2656031
DECLARE @MaxDistance int
SET @MaxDistance = 50 -- km
DECLARE @EarthRadius decimal(10,3)
SET @EarthRadius = 6378.137 -- km
DECLARE @EarthCircumference int
SET @EarthCircumference = 40076 -- km
-- Degrees difference for distance is only an approximisation to speed up search via WHERE clause!
DECLARE @DegDiff float
SET @DegDiff = CAST(@MaxDistance AS float) / CAST((@EarthCircumference / 360) AS float)
-- Get latitude and longitude
DECLARE @Latitude decimal(23,20), @Longitude decimal(23,20)
SELECT @Latitude = Latitude, @Longitude = Longitude FROM yougodo_GeoName WHERE GeoNameID = @GeoNameID
DECLARE @MinLatitude decimal(23,20), @MaxLatitude decimal(23,20), @MinLongitude decimal(23,20), @MaxLongitude decimal(23,20)
SET @MinLatitude = @Latitude - @DegDiff
SET @MaxLatitude = @Latitude + @DegDiff
SET @MinLongitude = @Longitude - @DegDiff
SET @MaxLongitude = @Longitude + @DegDiff
-- Search
SELECT geo.GeoNameID,
ROUND(@EarthRadius * ACOS((SIN(RADIANS(@Latitude)) * SIN(RADIANS(geo.Latitude))) +
(COS(RADIANS(@Latitude)) * COS(RADIANS(geo.Latitude)) *
COS(RADIANS(geo.Longitude) - RADIANS(@Longitude)))),0) AS Distance
FROM yougodo_GeoName AS geo
INNER JOIN yougodo_GeoFeature AS ftr ON ISNULL(geo.FeatureClass, '') = ftr.FeatureClass
AND ISNULL(geo.FeatureCode, '') = ftr.FeatureCode
WHERE geo.GeoNameID <> @GeoNameID
AND ROUND(@EarthRadius * ACOS((SIN(RADIANS(@Latitude)) * SIN(RADIANS(geo.Latitude))) +
(COS(RADIANS(@Latitude)) * COS(RADIANS(geo.Latitude)) *
COS(RADIANS(geo.Longitude) - RADIANS(@Longitude)))),0) <= @MaxDistance
If anything, due to the way lan lons work (from my limited understanding) I should always be defining a box that's probably a little too big which is fine as the second more thorough search criteria comes in to play afterwards. Maybe I've applied the lon lat box calculations slightly wrong?
Cheers, Gavin.
|
Gavin Harriss
Portfolio: www.gavinharriss.com |
|
|
 |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 25/07/2007 22:16:22
|
yougodo
Joined: 20/07/2007 00:25:28
Messages: 11
Offline
|
Just adding a few observations and tips for anyone who stumbles upon this thread with similar requirements.
1) I got around the problem of my rough search box supplying less results by simply slightly increasing the dimensions. Not the most elegant solution, but seems to do the job.
2) For MS SQL Server users out there. SQL intermittently encounters a "A domain error occurred." error when performing the distance calculations. After some investigation I found that it seems to have trouble handling calculations that result in more than 15 digits after the decimal place. So you'll see my calculation rounds part of the calculation before applying the rest of the calculation to it in order to get around this problem.
My code for anyone who might find it useful for adaptation is below. For re-usability I've implemented it as a function that returns a table of resulting IDs and their distances.
Code:
CREATE FUNCTION [Geo_GetNearbyLocations]
(
@GeoNameID int,
@Latitude decimal(23,20),
@Longitude decimal(23,20),
@MaxDistance int
)
RETURNS @GeoNameIDs TABLE
(
GeoNameID int NOT NULL,
Distance int NOT NULL
)
AS
BEGIN
/*
Inspiration taken from:
<a href="http://forum.geonames.org/gforum/posts/list/0/522.page" target="_blank" rel="nofollow">http://forum.geonames.org/gforum/posts/list/0/522.page</a>
*/
IF @GeoNameID IS NOT NULL OR (@Latitude IS NOT NULL AND @Longitude IS NOT NULL)
BEGIN
/*
Earth's radius:
6378137 meters
6378.137 km
3963.191 miles
3441.596 nautical miles
*/
DECLARE @EarthRadius decimal(10,3)
SET @EarthRadius = 6378.137 -- km
DECLARE @EarthCircumference int
SET @EarthCircumference = 40076 -- km
-- Degrees difference for distance is only an approximisation to speed up search via WHERE clause!
DECLARE @DegDiff float
SET @DegDiff = CAST(@MaxDistance AS float) / CAST((@EarthCircumference / 360) AS float)
-- NOTE: Add some error tolerance as too few results are returned else for some reason?!?
SET @DegDiff = @DegDiff + CAST(@DegDiff * 0.75 AS float) -- Error tolerance
-- Get latitude and longitude
IF NOT @GeoNameID IS NULL
SELECT @Latitude = Latitude, @Longitude = Longitude FROM GeoName WHERE GeoNameID = @GeoNameID
DECLARE @MinLatitude decimal(23,20), @MaxLatitude decimal(23,20), @MinLongitude decimal(23,20), @MaxLongitude decimal(23,20)
SET @MinLatitude = @Latitude - @DegDiff
SET @MaxLatitude = @Latitude + @DegDiff
SET @MinLongitude = @Longitude - @DegDiff
SET @MaxLongitude = @Longitude + @DegDiff
-- NOTE: There is a Geo_CalculateDistance function but it's far faster to perform calculations inline!
-- Search
INSERT INTO @GeoNameIDs (GeoNameID, Distance)
SELECT geo.GeoNameID,
ROUND(@EarthRadius * ACOS(ROUND(
(SIN(RADIANS(@Latitude)) * SIN(RADIANS(geo.Latitude))) +
(COS(RADIANS(@Latitude)) * COS(RADIANS(geo.Latitude)) *
COS(RADIANS(geo.Longitude) - RADIANS(@Longitude))),15)),0) AS Distance
FROM GeoName AS geo
WHERE
-- Below is rough proximity search to super speed the query up
(geo.Latitude >= @MinLatitude AND geo.Latitude <= @MaxLatitude AND geo.Longitude >= @MinLongitude AND geo.Longitude <= @MaxLongitude)
-- Below should result in accurate proximity search
AND ROUND(@EarthRadius * ACOS(ROUND(
(SIN(RADIANS(@Latitude)) * SIN(RADIANS(geo.Latitude))) +
(COS(RADIANS(@Latitude)) * COS(RADIANS(geo.Latitude)) *
COS(RADIANS(geo.Longitude) - RADIANS(@Longitude))),15)),0) <= @MaxDistance
ORDER BY @EarthRadius * ACOS(ROUND(
(SIN(RADIANS(@Latitude)) * SIN(RADIANS(geo.Latitude))) +
(COS(RADIANS(@Latitude)) * COS(RADIANS(geo.Latitude)) *
COS(RADIANS(geo.Longitude) - RADIANS(@Longitude))),15)) ASC
END
RETURN
END
|
Gavin Harriss
Portfolio: www.gavinharriss.com |
|
|
 |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 03/11/2007 23:32:59
|
snives
Joined: 03/11/2007 22:46:15
Messages: 1
Offline
|
Your idea to use a bounding box to filter records is valid. It is used in all commercial spacial search functions. In fact they all rely heavily on indexes to perform covermap filtering based on various techniques, but the bounding box is a good start.
You are missing zip codes because your bounding box calculation is incorrect. The number of miles in a degree of longitude varies by the cosine of the latitude. Think of the equator vs the 89th parallel, there is alot more earth between the lines at the equator, right?
Use this instead...
Code:
DECLARE @LatRange float
DECLARE @LongRange float
SET @LatRange = @Miles / 69.0499
SET @LongRange = @Miles / (69.0499 * cos(Radians(@Latitude)) )
You are also implementing the ACOS trig calculation which is susceptable to error over small distances. Use the Haversine method for more accurate small distance calculations. If most of your distance calculations are small distances, you can optimize your distance calculation even further by using simple euclidean geometry (insanely fast). I've found it to be accurate to within 10 ft per Mile, this is plenty accurate if your search is just to filter records.
I bet your code is paralizingly slow too. Here is a tip, create a table valued function to return a subset of GeoNameId's that match your boundary condition. Then create a function to calculate the distance between lat/longs. Then index on your GeoNameId (which I assume you already do).
Here is your final product FTW. It searches 40,000 records and returns 56 zips (on my 3yr old laptop) in 10ms! I suspect further speed improvements may be possible using spacial search in Sql Server 2008.
Code:
select City, ZipCode, dbo.[GetApproxDistanceBetweenLatLong](41.882582, -87.637601, Latitude, Longitude) as Actual
from Zip
join dbo.fZipsNearCoverMap(41.882582, -87.637601, 10) covermap on Zip.ZipId = covermap.ZipId
and dbo.[GetApproxDistanceBetweenLatLong](41.882582, -87.637601, Latitude, Longitude) <= 10
order by Actual desc
More info at: http://technet.microsoft.com/en-us/library/aa964138.aspx
|
|
|
 |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 04/11/2007 00:14:40
|
yougodo
Joined: 20/07/2007 00:25:28
Messages: 11
Offline
|
Hi snives,
Thanks for sharing your optimisation tips with me - very much appreciated! I'll be trying them out in the next day or two to see how much an improvement I can get out of them.
The location search code is very much the bottleneck being experienced in my code at the moment. It's being used by http://www.yougodo.com - still a work in progress at present, but nearly there.
Cheers again,
Gavin.
|
Gavin Harriss
Portfolio: www.gavinharriss.com |
|
|
 |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 13/11/2007 02:13:05
|
yougodo
Joined: 20/07/2007 00:25:28
Messages: 11
Offline
|
Hi snives,
I've only just started to get around to revisiting my geo proximity search code and was wondering if you would mind assisting me a little further?
In your post you reference 2 functions - GetApproxDistanceBetweenLatLong and fZipsNearCoverMap. Would you mind posting the complete SQL for these so I can t a clearer grasp what you are doing and how to apply the Haversine method myself?
I believe I understand basically what you're suggesting by separating out the "box" search part of the SQL in to a separate function and am just about to experiment with this to determine differences in query execution performance gains.
Thanks again for you assistance,
Gavin.
|
Gavin Harriss
Portfolio: www.gavinharriss.com |
|
|
 |
![[Post New]](/gforum/templates/default/images/icon_minipost_new.gif) 08/04/2008 05:50:07
|
yougodo
Joined: 20/07/2007 00:25:28
Messages: 11
Offline
|
A friendly chappie named Colin Jackman picked up on snives suggestions and had a bit of a email dialogue going with me as he applied snives suggestions to my original solution. So for anyone out there interested in a better / faster proximity search solution here you go...
Code:
CREATE FUNCTION [dbo].[GetNearbyLocations]
(
@GeoNameID int,
@Latitude decimal(23,20),
@Longitude decimal(23,20),
@MaxDistance int
)
RETURNS @GeoNameIDs TABLE
(
GeoNameID int NOT NULL,
Distance int NOT NULL
)
AS
BEGIN
/*
Inspiration taken from:
<a href="http://forum.geonames.org/gforum/posts/list/0/522.page" target="_blank" rel="nofollow">http://forum.geonames.org/gforum/posts/list/0/522.page</a>
*/
IF @GeoNameID IS NOT NULL OR (@Latitude IS NOT NULL AND @Longitude IS NOT NULL)
BEGIN
-- Convert to decimal to avoid rounding errors
DECLARE @MaxDistanceDec decimal
SET @MaxDistanceDec = @MaxDistance
/*
Earth's radius:
6378137 meters
6378.137 km
3963.191 miles
3441.596 nautical miles
*/
DECLARE @EarthRadius decimal(10,3)
SET @EarthRadius = 6378.137 -- km
-- Get latitude and longitude
IF NOT @GeoNameID IS NULL
SELECT @Latitude = Latitude, @Longitude = Longitude FROM yougodo_GeoName WHERE GeoNameID = @GeoNameID
DECLARE @LatRange decimal(23,20)
DECLARE @LongRange decimal(23,20)
SET @LatRange = (@MaxDistanceDec / 111) -- 111 is number of km in 1 degree lat
SET @LongRange = ((@MaxDistanceDec / 111) / cos(Radians(@Latitude)))
DECLARE @MinLatitude decimal(23,20), @MaxLatitude decimal(23,20), @MinLongitude decimal(23,20), @MaxLongitude decimal(23,20)
SET @MinLatitude = @Latitude - @LatRange
SET @MaxLatitude = @Latitude + @LatRange
SET @MinLongitude = @Longitude - @LongRange
SET @MaxLongitude = @Longitude + @LongRange
-- Search
-- NOTE: I did create a CalculateDistance function but it's far faster to perform calculations inline!
INSERT INTO @GeoNameIDs (GeoNameID, Distance)
SELECT geo.GeoNameID,
ROUND(@EarthRadius * ACOS(ROUND(
(SIN(RADIANS(@Latitude)) * SIN(RADIANS(geo.Latitude))) +
(COS(RADIANS(@Latitude)) * COS(RADIANS(geo.Latitude)) *
COS(RADIANS(geo.Longitude) - RADIANS(@Longitude))),15)),0) AS Distance
FROM GeoName AS geo
WHERE -- Below is rough proximity search to super speed the query up
(geo.Latitude >= @MinLatitude AND geo.Latitude <= @MaxLatitude AND geo.Longitude >= @MinLongitude AND geo.Longitude <= @MaxLongitude)
-- Below should result in accurate proximity search
AND ROUND(@EarthRadius * ACOS(ROUND(
(SIN(RADIANS(@Latitude)) * SIN(RADIANS(geo.Latitude))) +
(COS(RADIANS(@Latitude)) * COS(RADIANS(geo.Latitude)) *
COS(RADIANS(geo.Longitude) - RADIANS(@Longitude))),15)),0) <= @MaxDistanceDec
ORDER BY 2 ASC
END
RETURN
END
|
Gavin Harriss
Portfolio: www.gavinharriss.com |
|
|
 |
|
|
|
|