GeoNames Forum
Accuracy of Australian postal codes
thsutton



Joined: 23/04/2009 06:40:49
Messages: 4
Offline

I'm working on an application that needs to do geographical searches based on (Australian) postcodes. I've been using the geonames.org data, but it seems to be fairly inaccurate.

Taking 6003 in Western Australia as an example, geonames returns two suburbs, both at -31.967/115.806. These co-ordinates are, in fact, in a third suburb (Daglish by my reckoning), which is not even adjacent to the two in question.

Geocoding "6003, Western Australia" using Google returns -31.9456066/115.8632664, but the Google data can only be used to display data on a Google Maps interface.

A similar figure of -31.943361/115.862361 results when calculating the centroid of the "6003" Postal Area defined by the Australian Bureau of Statistics in the 2006 census (Postal Areas are not identical to postcodes, but they're the nearest I could find and available under the cc-by 2.5 au licence). This isn't perfect, nor complete (the ABS used 2507 PAs in 2006, Australia Post defines 2636 geographical postcodes as of March 2009, and there are 2861 in the geonames database), but seems better than the current data, at least in this case.
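For anyone wanting to reproduce this kind of figure without a GIS library, here is a minimal sketch of an area-weighted polygon centroid in plain Python. It is not the method used for the numbers above; `polygon_centroid` is a hypothetical helper, it treats longitude/latitude as planar x/y (an approximation that is usually acceptable for small areas), and it handles a single ring only, whereas real Postal Areas may be multipolygons.

```python
def polygon_centroid(ring):
    """Return the (x, y) area-weighted centroid of a simple polygon ring.

    `ring` is a sequence of (x, y) vertices, e.g. (lon, lat) pairs.
    The ring may be open or closed. Coordinates are treated as planar,
    which is an assumption, not a geodesic calculation.
    """
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # close the ring

    area2 = 0.0  # twice the signed area (shoelace formula)
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if area2 == 0:
        raise ValueError("degenerate polygon (zero area)")
    # Centroid = sum / (6 * area) = sum / (3 * area2)
    return cx / (3.0 * area2), cy / (3.0 * area2)


# Toy square, not real postal area data.
print(polygon_centroid([(0, 0), (2, 0), (2, 2), (0, 2)]))
```

A proper GIS library would also handle holes, multipolygons and reprojection, so treat this only as an illustration of the idea.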

Would it be useful to submit the centroids derived from the ABS data for the geonames.org database?
marc



Joined: 08/12/2005 07:39:47
Messages: 4501
Offline

this would be a great contribution. Thanks a lot.

Marc

thsutton



Joined: 23/04/2009 06:40:49
Messages: 4
Offline

What format is best? I've got a tab-separated file containing postcode and WKT in WGS84. Is this OK?
marc



Joined: 08/12/2005 07:39:47
Messages: 4501
Offline

sounds perfect.

Marc

thsutton



Joined: 23/04/2009 06:40:49
Messages: 4
Offline

There are two tab-separated files attached: one with postcodes and well-known text and the other with postcodes, latitude and longitude. The locations *should* be in WGS 84 (I asked the Django GIS libs to transform them).

They are derived from 2923.0.30.001 - Census of Population and Housing: Census Geographic Areas Digital Boundaries, Australia, 2006, which is, as far as I can determine, available under the Creative Commons Attribution 2.5 Australia licence. They have citation guidelines linked from that page.

Edit: I ought to note that I'm relatively new to this, so it'd be good if someone could check that these data aren't completely nonsensical. I've checked a few, but I'm not sure how close they ought to be to, e.g., Google's geocoding.
 Description Centroids of the ABS postal areas with separate lat/long fields.
 Filesize 83 Kbytes
 Downloaded:  1471 time(s)

 Description Centroids of the ABS postal areas in well-known text format.
 Filesize 135 Kbytes
 Downloaded:  1175 time(s)
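
As a rough sanity check on files like these, the sketch below parses one line of a postcode/WKT file and rejects coordinates outside the valid WGS 84 range. The exact layout of the WKT file is an assumption here (postcode, a tab, then `POINT (lon lat)`), and `parse_wkt_line` is a hypothetical helper; the sample values are the 5700 row quoted in this thread.

```python
import re

# Matches e.g. "POINT (137.689413382 -32.6049113419)"; WKT order is x y,
# i.e. longitude first, then latitude.
_POINT = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def parse_wkt_line(line):
    """Parse 'postcode<TAB>POINT (lon lat)' into (postcode, lon, lat).

    The file layout is assumed, not confirmed by the attachment itself.
    """
    postcode, wkt = line.rstrip("\n").split("\t", 1)
    match = _POINT.match(wkt)
    if match is None:
        raise ValueError("not a WKT point: %r" % wkt)
    lon, lat = float(match.group(1)), float(match.group(2))
    # A swapped lat/lon pair or an untransformed projection often fails here.
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("coordinates out of range for %s" % postcode)
    return postcode, lon, lat


print(parse_wkt_line("5700\tPOINT (137.689413382 -32.6049113419)"))
```

Keeping the postcode as a string avoids losing leading zeros, which some Australian postcodes have.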

marc



Joined: 08/12/2005 07:39:47
Messages: 4501
Offline

I am a little bit confused.
Look for instance at these two lines:
5700 137.689413382 -32.6049113419
5710 134.785990532 -29.3441490006

Shouldn't they both be nearby? 5700 looks OK, but I doubt that 5710 is really on the other side of the continent. I fear something went wrong when these guys extracted the data from the original files. Do you see which of the original files contains postal codes? Then we could use the original file instead.

Best

Marc

thsutton



Joined: 23/04/2009 06:40:49
Messages: 4
Offline

marc wrote:
I am a little bit confused.
Look for instance at these two lines:
5700 137.689413382 -32.6049113419
5710 134.785990532 -29.3441490006
 


If you type "5700, South Australia" or "5710, South Australia" into Google Maps or GetLatLon.com, then you'll get the following details:

(-32.6759825, 137.788737)

(-29.286398892934763, 134.89013671875)

These seem reasonably accurate to me. On the other hand, this does put the centroid of 5710 near Coober Pedy, which is in 5723. This looks to me like an artefact of the data (it's population data, and they may have aggregated parts of 5723 into 5710, etc.). It's not *that* far away: Cook is nearby (on the map) and is in 5710.

marc wrote:
Shouldn't they both be nearby? 5700 looks OK, but I doubt that 5710 is really on the other side of the continent. I fear something went wrong when these guys extracted the data from the original files. Do you see which of the original files contains postal codes? Then we could use the original file instead.


Not necessarily. Australia Post numbers postcodes in whatever order is best for them - they are a purely administrative coding and they don't necessarily mean anything. They are not necessarily contiguous, adjacent areas need not have adjacent postcodes, they can have very strange geometry, can cross state borders, etc.

There is no freely available geographical data for postcodes per se (at least, not that I've been able to find). These data were extracted from the Postal Area shapefiles at the link I posted. As noted above, these are population areas that have been aggregated until they roughly match postcodes. In areas with high population densities, they should correlate reasonably well but in areas with low density (the middle of South Australia, in this case), I'd expect the correlation to be a little less accurate.

I'll need to wait until next week to investigate any further as all my data and software are on my work computers.

Cheers,

Thomas
marc



Joined: 08/12/2005 07:39:47
Messages: 4501
Offline

Thomas

I have updated the lat/lng with your list.

Best

Marc

rlevering



Joined: 20/07/2009 20:00:19
Messages: 26
Offline

I was pretty excited when I found this thread, since I had noticed some strangeness with Australian postal codes. However, when I downloaded the latest data file, I noticed that the postal codes via the web or the data dump haven't been updated as per the data file in this thread. Was this actually completed?
marc



Joined: 08/12/2005 07:39:47
Messages: 4501
Offline

When I look at the web service I see exactly the coordinates mentioned in this thread:
http://ws.geonames.org/postalCodeSearch?postalcode=6003&maxRows=10&style=full&country=au

Marc

rlevering



Joined: 20/07/2009 20:00:19
Messages: 26
Offline

http://ws.geonames.org/postalCodeSearch?postalcode=2034&maxRows=10&style=full&country=au

<lat>-33.9166667</lat>
<lng>151.2666667</lng>

centroids-xy.tsv:

2034 151.25464994 -33.9243611512
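
To put a number on the gap between those two positions, here is a minimal haversine sketch. `haversine_m` is a hypothetical helper, and the spherical Earth radius is an assumption that is accurate enough for a check like this. Note that centroids-xy.tsv lists longitude before latitude, so the columns are swapped when passed in.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


web_service = (-33.9166667, 151.2666667)   # lat, lng from the XML above
centroid = (-33.9243611512, 151.25464994)  # lat, lon from centroids-xy.tsv

print(round(haversine_m(*web_service, *centroid)), "m")
```

For these two points the result comes out at roughly 1.4 km.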
rlevering



Joined: 20/07/2009 20:00:19
Messages: 26
Offline

Any word on this? Could you finish updating the postal code lat/lons? They are embarrassing in several cases when I view them on maps (putting them two cities over/in the middle of water).
rlevering



Joined: 20/07/2009 20:00:19
Messages: 26
Offline

I don't understand your English. However, if I interpret this defensively, I would be more than happy to update the lat/lons myself. However, there is no way to update things in bulk nor do I believe there is a way to change the zip code lat/lons at all. So I have to rely on marc to be able to change them with a script.
marc



Joined: 08/12/2005 07:39:47
Messages: 4501
Offline

Where possible we relate the postal code entries to the places in the main geoname table.
For instance Coogee: http://www.geonames.org/2170697/coogee.html

It is very simple to move a place to its proper location. It is also possible to add postal codes as alternate names using the pseudo language code 'post': http://www.geonames.org/manual.html

From time to time we will update the coordinates in the postal code table with the corrected/improved coordinates from the place table.

The centroids have been used to update those postal code lat/lng that do not have a match in the place name table or where the distance is more than just some 100m. For Coogee and many others the distance is less than some 100m and the coordinates from the geoname table are used.
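
The rule described above can be sketched roughly as follows. This is only an illustration of the description, not the actual GeoNames update script: `choose_coordinates` and its parameters are hypothetical, the distance is assumed to be computed elsewhere, and the exact threshold is an assumption since the post only says "some 100m".

```python
def choose_coordinates(centroid, place=None, distance_m=None, threshold_m=100.0):
    """Pick the coordinates to store for a postal code entry.

    centroid    -- (lat, lng) derived from the ABS postal area
    place       -- (lat, lng) of the matching entry in the place table,
                   or None if there is no match
    distance_m  -- distance between the two in metres (computed elsewhere)
    threshold_m -- assumed cut-off; the real value is not stated
    """
    if place is None or distance_m is None:
        return centroid  # no matching place: fall back to the centroid
    if distance_m > threshold_m:
        return centroid  # place is too far from the centroid
    return place         # close enough: keep the place coordinates


# Illustrative values only: the 2034 centroid from this thread and a
# made-up place position and distance.
centroid = (-33.9243611512, 151.25464994)
place = (-33.92, 151.25)
print(choose_coordinates(centroid, place, distance_m=60.0))
print(choose_coordinates(centroid, place, distance_m=433.0))
print(choose_coordinates(centroid, None))
```

With a strict 100 m threshold a place 433 m from the centroid would be replaced by the centroid, so the outcome depends heavily on what threshold is really in use.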

Marc

rlevering



Joined: 20/07/2009 20:00:19
Messages: 26
Offline

1) As I've said on the mailing list, this is an incorrect model. Postal codes have a loose connection to political incorporations and often have one-to-many or many-to-one relationships. Therefore, I consider the "post" language code to be a hack that is unusable for serious applications where it's not okay being a town or two off. You really need a separate place entity that is a postal code and then perhaps you can use the alternate names to provide a useful mapping between the two. What is the problem with giving them identity so they can be pushed around like everything else in geonames?

2) I have no problem with the 100m thing since it might look better on maps if it's actually centered in a small town. However, it is 433m from the centroid to the Coogee placemark at its current placement and 1401m to the previous, more incorrect placemark in the water, so perhaps the 100m calculation is not working correctly.
marc



Joined: 08/12/2005 07:39:47
Messages: 4501
Offline

As long as this is a purely theoretical discussion I don't see a need to change anything. We can discuss other data models when we really have a real-life problem and have solved all other problems. At the moment it would be no improvement and would only cause confusion.

Marc

 