| Author |
Message |
|
|
|
I'm usually pretty lenient and flexible when it comes to alternate names. But listing "City" and "The City" as alternate names for "London" is incorrect, ridiculous, self-centered on the part of whoever put it in, and not even right by most measures of "city"ness. I'd delete it, but I don't have permission.
|
 |
|
|
Yeah, I'd go so far as to say they are flat out wrong.
Though of the ones I've checked, they are all at least within Turkey. I guess that's something.
So you're sure there isn't some import problem, and the underlying data really is like this?
|
 |
|
|
|
These appear highly inaccurate or something weird is going on. Search for Istanbul and see all the postal codes in Istanbul province (Istanbul = Adm1). I haven't found a single one yet that's actually in the Istanbul province. This leads me to believe there's some serious data problem here, either with the adm columns or the lat/lons.
|
 |
|
|
|
More specifically, the allCountries.zip file doesn't have them. The full file appears to have them.
|
 |
|
|
|
What happened to the British Adm1 code column in the postal file? It's missing those values and I'm sure they were there before.
|
 |
|
|
Brooklyn (5110302) is not a county (adm2) in New York, USA. Kings County is its real name, and it just happens to be coextensive with the Borough of Brooklyn. Brooklyn should be an alternate name and Kings County the real one, as Manhattan is with New York County, for example.
http://en.wikipedia.org/wiki/Kings_County,_New_York
|
 |
|
|
|
GB postal codes AB10-13, AB15-16, and AB21 currently have a city listed of "country". I don't know what the most specific city to list for these is, but Aberdeen would be a big improvement.
|
 |
|
|
I sort of disagree with both the lack of data and the need for data. Isn't the point of geonames to be a repository for sharing and updating location data? It seems silly to hold back what data can be put in because you personally don't have it. Maybe I want to sit down for a night and link points with relationships. I've seen people laboriously putting their hotels in, and I don't think you have a hotel data feed in geonames.
The most obvious short-term advantage is the merge of the postal code DB into the location DB to make them editable points. It's not a good long term solution to tell you every time I find an individual postal code centroid is off (though you do respond very quickly).
Plus this is not exactly a complex model. I'm talking about one extra self join relationship table for a lot of extra expression. If you control the relationship types that exist much like feature codes are controlled, it won't get much more complex than it currently is.
|
 |
|
|
I'm sorry I never got back to you on this. I don't have a perfect solution for postal codes in geonames. In my mind, geonames is missing two fundamental things for an ideal semantic solution: a concept of boundaries and a concept of relationships. Boundaries are rough because they get very complex. That's not to say you shouldn't do it yourself if your system requires it, but it might be too much for geonames. Geonames has a niche as a logical point-in-space datasource. It would have to change its focus fairly dramatically to handle geometric space.
What I think geonames needs is a more generic relationship table. The alternate names table is currently being overloaded for this purpose for postal codes. Right now there are only a couple of relationships that are really available in the system: admin1, admin2, and country. If those columns weren't part of the main table and instead were in a linked table, you could do more interesting things. This could solve a lot of issues that are being brought up in other places on the forum too. Concepts of neighborhoods, postal codes, nesting of levels beyond admin2 in depth: all of these things can be solved or improved with another table. In my mind, this is going to be necessary going forward to be a useful service. So a postal code would be another feature code, placed at its centroid and then linked via a postal code-city relationship to the PPLs it covers.
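To make the idea concrete, here is a minimal sketch of such a relationship table using Python's built-in sqlite3 module. Everything in it is an assumption for illustration, not the actual geonames schema: the table and column names, the ids, the "POST" feature code (hypothetical, not an existing geonames code), the "postal_code_of" relationship type, and the approximate Honolulu coordinates. The two zip codes and their centroids are the ones proposed elsewhere in this thread.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE feature (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    feature_code TEXT NOT NULL,
    lat          REAL,
    lon          REAL
);
-- One self-join table: both ends point back at feature.
CREATE TABLE relationship (
    from_id  INTEGER NOT NULL REFERENCES feature(id),
    to_id    INTEGER NOT NULL REFERENCES feature(id),
    rel_type TEXT NOT NULL,  -- controlled vocabulary, like feature codes
    PRIMARY KEY (from_id, to_id, rel_type)
);
""")

# Illustrative rows; ids are made up, "POST" is a hypothetical feature code.
conn.executemany(
    "INSERT INTO feature VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Honolulu", "PPL", 21.3069, -157.8583),  # approximate city point
        (2, "96812", "POST", 21.3061, -157.8585),
        (3, "96830", "POST", 21.2841, -157.8340),
    ],
)
conn.executemany(
    "INSERT INTO relationship VALUES (?, ?, ?)",
    [(2, 1, "postal_code_of"), (3, 1, "postal_code_of")],
)


def postal_codes_for(conn, city_id):
    """Return the names of postal-code features linked to a city."""
    rows = conn.execute(
        """
        SELECT f.name
        FROM relationship AS r
        JOIN feature AS f ON f.id = r.from_id
        WHERE r.to_id = ? AND r.rel_type = 'postal_code_of'
        ORDER BY f.name
        """,
        (city_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(postal_codes_for(conn, 1))
```

Because the composite key allows the same postal code to link to several cities and a city to several postal codes, the many-to-many case falls out of the same table. Neighborhoods or deeper admin levels would only need new rel_type values.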
|
 |
|
|
While I understand this postal code linking is useful for a lot of people, let me just express my arguments for the record. Postal codes are not alternate names in any way, and I now need to strip them out of the DB before I use it to match on a city name. They are separate geographical entities with an M:M mapping to cities. Using them as alternate names gives a false illusion of simplicity. A different solution should have been devised.
Plus on a minor note, look up New York City, the alternate names list looks crappy.
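For anyone else who has to do the same stripping, this is roughly what it looks like. It is a sketch that assumes the alternate names rows carry a language field and that postal codes are tagged there with the pseudo-language "post"; check that against the dump you actually have. The sample rows and ids are made up for illustration.

```python
# Assumption: postal codes are stored as alternate names tagged "post".
POSTAL_TAG = "post"


def strip_postal_codes(rows):
    """Drop postal-code rows from (alt_id, geonameid, lang, name) tuples."""
    return [row for row in rows if row[2] != POSTAL_TAG]


def names_by_place(rows):
    """Group the remaining alternate names by geonameid for name matching."""
    names = {}
    for _alt_id, geonameid, _lang, name in strip_postal_codes(rows):
        names.setdefault(geonameid, []).append(name)
    return names


# Made-up sample rows: (alt_id, geonameid, lang, name).
SAMPLE = [
    (1, 100, "en", "New York City"),
    (2, 100, "post", "10001"),
    (3, 100, "es", "Nueva York"),
    (4, 100, "post", "10002"),
]

print(names_by_place(SAMPLE))
```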
|
 |
|
|
|
I don't believe it is in Spokane County, since there isn't a Spokane County in North Carolina. I believe it should have an ADM2 of "Orange", since I believe all of Chapel Hill is in Orange County.
|
 |
|
|
|
On this note, I just ran a duplicate detector on the DB...where I defined a duplicate to be anything with the same name, feature code, and all the same hierarchical breakdown (country,adm1,adm2). There are a very large number of duplicates in the database. I'm sure some of these may actually be different places, but the large majority that I saw were definitely import errors. Is there a strategy to handle these?
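For reference, the detector amounts to grouping on that five-part key. A minimal sketch, assuming each record is a dict with these (hypothetical) field names; the sample records and ids are invented:

```python
from collections import defaultdict


def find_duplicates(records):
    """Group record ids by (name, feature code, country, adm1, adm2).

    Returns only the keys shared by more than one record.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (
            rec["name"],
            rec["feature_code"],
            rec["country"],
            rec["adm1"],
            rec["adm2"],
        )
        groups[key].append(rec["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Invented sample records; field names and values are illustrative only.
SAMPLE = [
    {"id": 1, "name": "Springfield", "feature_code": "PPL",
     "country": "US", "adm1": "IL", "adm2": "167"},
    {"id": 2, "name": "Springfield", "feature_code": "PPL",
     "country": "US", "adm1": "IL", "adm2": "167"},
    {"id": 3, "name": "Springfield", "feature_code": "PPL",
     "country": "US", "adm1": "MA", "adm2": "013"},
]

print(find_duplicates(SAMPLE))
```

This only flags candidates. Two records with the same key can still be genuinely different places, so the output is a review list, not a delete list.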
|
 |
|
|
Try searching for San Francisco, Cuba and tell me what in the world went on there. Did some input script mess up or was the original data that dirty?
|
 |
|
|
Again, I'm not sure these are correct, but they are definitely a lot better. I wish I could go through the whole list and manually check them all, but I don't have time.
96812: 21.3061, -157.8585
96830: 21.2841, -157.8340
96858: 21.3416, -157.8918
|
 |
|
|
I don't have very accurate info, but I'll take a stab. I guarantee that they at least make sense.
99605: 60.872585,-149.468293
99693: 60.7734, -148.6839
99741: 65.8263, -154.4373
99689: 59.8120, -139.5505
|
 |
|
|
I haven't checked them all by any means, but at least three Hawaii zip codes geocoded poorly.
96812 and 96830 are Honolulu zip codes that are given lat/lons between the Midway Islands and Hawaii. 96858 is Fort Shafter, and this is also in the middle of the Pacific.
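A check like this can be automated by measuring how far each zip code centroid sits from the city it belongs to. The sketch below uses the haversine great-circle distance. The Honolulu reference point and the 50 km threshold are my own assumptions, the "hypothetical-bad" entry is an invented mid-Pacific point (not the actual bad value in the file), and the other three centroids are the replacement coordinates proposed elsewhere in this thread.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def suspicious(centroids, reference, max_km):
    """Return codes whose centroid is farther than max_km from reference."""
    ref_lat, ref_lon = reference
    return sorted(
        code
        for code, (lat, lon) in centroids.items()
        if haversine_km(lat, lon, ref_lat, ref_lon) > max_km
    )


HONOLULU = (21.3069, -157.8583)  # approximate city point (assumption)

CENTROIDS = {
    "96812": (21.3061, -157.8585),
    "96830": (21.2841, -157.8340),
    "96858": (21.3416, -157.8918),
    "hypothetical-bad": (25.0, -168.0),  # invented mid-Pacific point
}

print(suspicious(CENTROIDS, HONOLULU, max_km=50))
```

A fixed threshold is crude. It would wrongly flag large rural zip codes like some of the Alaska ones, so the cutoff needs to scale with the area a code is expected to cover.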
|
 |
|
|
|
The New York City zip codes are listed as "New York" zip codes. Since the official name in geonames is New York City, can that be changed?
|
 |
|
|
|
I began adding US metro areas last night as "economic region" (RGNE). The table I was drawing from is here: http://en.wikipedia.org/wiki/Table_of_United_States_Core_Based_Statistical_Areas
These are agglomerations of populated areas that have come to be treated as a single cohesive entity for local description purposes and population size (e.g. Tampa Bay for the cities close to Tampa, FL). Generally this has nothing to do with administrative division. I was wondering whether economic region (RGNE) was the correct feature code, or whether PPLS would be a better code? I wasn't quite sure of the semantics of those.
|
 |
|
|
There are many wrong postal code lat/lons in Alaska. I originally thought they only looked funny because they cover such a large area, but some of these are actually being put in cities with other zip codes, so I don't think that's it. Maybe old zip code info or something? I haven't checked them all, but here are a couple that came up recently:
99605 -> For Hope, but in Ninilchik, which has its own zip code
99693 -> Nowhere near Whittier and on the other side of Valdez, which has its own zip code
99741 -> Several cities (with their own zip codes) down from Galena
99689 -> For Yakutat, but placed in between Juneau and Gustavus, which both have their own zip codes
|
 |
|
|
|
This zip code is in the middle of the Boston Harbor when it should be close to downtown Boston. Where is this data coming from?
|
 |
|
|