ysmith
Joined: 03/07/2013 19:35:39
Messages: 1
While working with GeoNames data in our project, we noticed some issues in the toponym hierarchy.
Take the Czech Republic, for example. It is generally well structured (thousands of its PPL* are distributed among ADM*), but there are still hundreds of PPL* with population >100 that do not belong to any ADM* (e.g. 3075921, 3068873, 3077882). Checking Wikipedia for one of them, 3068873 (Orlová), shows that it belongs to the Moravian-Silesian Region, which exists in GeoNames with id=3339573. The situation is similar for many other such orphans.
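For anyone who wants to reproduce this, here is a minimal sketch of one way such orphans could be listed from a country dump file (e.g. CZ.txt). It assumes the standard tab-separated dump layout (geonameid in column 0, name in 1, feature code in 7, country code in 8, admin1 code in 10, population in 14) and treats a PPL* as an orphan when its admin1 code matches no ADM1 record in the same file. That orphan definition and the function names are our assumptions for illustration, not something GeoNames defines.

```python
# Column positions in the tab-separated GeoNames dump (no header row).
ID, NAME, FCODE, COUNTRY, ADMIN1, POPULATION = 0, 1, 7, 8, 10, 14


def find_orphan_places(rows, min_population=100):
    """Return (geonameid, name) of PPL* rows whose admin1 code matches no ADM1 row."""
    # Skip malformed lines that are too short to hold a population column.
    rows = [r for r in rows if len(r) > POPULATION]
    # (country, admin1 code) pairs for which an ADM1 record actually exists.
    known_adm1 = {(r[COUNTRY], r[ADMIN1]) for r in rows if r[FCODE] == "ADM1"}
    orphans = []
    for r in rows:
        if not r[FCODE].startswith("PPL"):
            continue
        if int(r[POPULATION] or 0) <= min_population:
            continue
        if (r[COUNTRY], r[ADMIN1]) not in known_adm1:
            orphans.append((int(r[ID]), r[NAME]))
    return orphans


def read_dump(path):
    """Yield the rows of a dump file such as CZ.txt as lists of strings."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n").split("\t")
```

Usage would be something like `find_orphan_places(read_dump("CZ.txt"))`, where the file path is whatever you downloaded.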
Another case is Indonesia. There are:
PPL* - over 8000 (including PPLA-PPLA4)
ADM1 ~30
ADM2 ~280
ADM3 ~2400
ADM4 ~24000
The presence of ADM1-4 and PPLA-PPLA4 suggests that a fairly detailed hierarchy was planned for Indonesia. In practice, however, the structure is very flat: every PPL*, ADM2, ADM3 and ADM4 sits directly under an ADM1, so several levels of the hierarchy are missing. As a side effect, many PPL* with the same name end up under one parent, which is confusing for the system we're developing (related issue - http://forum.geonames.org/gforum/posts/list/3159.page).
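To make the namesake problem concrete, here is a small sketch that groups PPL* records by country, admin1 code and name and reports every group with more than one member. It assumes the standard tab-separated dump layout and uses the admin1 code as a stand-in for "parent", which matches the flat structure described above. The function name is hypothetical.

```python
from collections import defaultdict

# Column positions in the tab-separated GeoNames dump (no header row).
ID, NAME, FCODE, COUNTRY, ADMIN1 = 0, 1, 7, 8, 10


def find_namesakes(rows):
    """Map (country, admin1 code, name) to the geonameids of PPL* sharing that key."""
    groups = defaultdict(list)
    for r in rows:
        # Ignore malformed lines and anything that is not a populated place.
        if len(r) > ADMIN1 and r[FCODE].startswith("PPL"):
            groups[(r[COUNTRY], r[ADMIN1], r[NAME])].append(int(r[ID]))
    # Keep only names that occur more than once under the same admin1 code.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Each returned entry is a set of places that cannot be told apart by name and parent alone.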
Similar issues exist in China and some other Asian countries.
The questions:
1. Is it just incomplete data?
2. Is there any chance that it will be fixed in the near future?
3. Can we still retrieve the missing hierarchies from Geonames somehow?
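Regarding question 3, one possible workaround can be sketched as follows. It assumes the admin1 to admin4 code columns of the dump (columns 10 to 13) are filled in for the affected records, which we have not verified for Indonesia or China. The idea is to index every ADM1-4 record by its chain of admin codes and then look up each place's ancestors by the codes it carries. All function names here are hypothetical.

```python
# Column positions in the tab-separated GeoNames dump (no header row).
ID, FCODE, COUNTRY, ADMIN1, ADMIN4 = 0, 7, 8, 10, 13
ADM_LEVELS = {"ADM1": 1, "ADM2": 2, "ADM3": 3, "ADM4": 4}


def admin_key(row, level):
    """Country code plus the first `level` admin codes of a row."""
    return (row[COUNTRY],) + tuple(row[ADMIN1:ADMIN1 + level])


def build_adm_index(rows):
    """Map each ADM1-4 record's admin-code chain to its geonameid."""
    index = {}
    for r in rows:
        if len(r) <= ADMIN4:
            continue  # malformed line
        level = ADM_LEVELS.get(r[FCODE])
        # Only index divisions whose codes are filled in down to their own level.
        if level and all(r[ADMIN1:ADMIN1 + level]):
            index[admin_key(r, level)] = int(r[ID])
    return index


def ancestor_chain(row, index):
    """Return ADM geonameids for `row`, from ADM1 down to the deepest match."""
    chain = []
    if len(row) <= ADMIN4:
        return chain
    for level in range(1, 5):
        if not row[ADMIN1 + level - 1]:
            break  # no code at this level, so no deeper ancestor is known
        adm_id = index.get(admin_key(row, level))
        if adm_id is None:
            break
        if adm_id != int(row[ID]):  # a division is not its own ancestor
            chain.append(adm_id)
    return chain
```

If the codes are empty for the orphaned records, this returns an empty or short chain and the hierarchy cannot be recovered this way.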
PS. There are still countries that are perfectly organized in GeoNames. For example, in Italy all PPL* are assigned to ADM* at different levels.