
GeoNames Forum
Messages posted by: bernard
Profile for bernard -> Messages posted by bernard [30] Go to Page: 1, 2 Next 
http://www.geonames.org/8282146 seems to be a PPL in Yemen, according to its parent ADM1 and country, but it's currently lost in the Gulf of Guinea.
The only name is in Arabic, I am not able to relocate it properly. Any idea?
Hi Lidia

I understand you want to express things like

<http://sws.geonames.org/3017382/> :isoALPHA3 "FRA"

Of course you can do that, but no such attribute is defined so far in the Geonames ontology. We could add it in a later version, but actually the whole Semantic Web community is waiting for ISO or some other authoritative body to publish an ontology of such properties ... Meanwhile you can forge your own vocabulary, and map it or change it to authoritative URIs ... whenever those are published.
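
A minimal sketch of the "forge your own vocabulary" approach, in plain Python with no RDF library: it writes the statement above as one N-Triples line. The namespace `http://example.org/myvocab#` and the helper name `iso_alpha3_triple` are assumptions made up for illustration; only the feature URI and the `isoALPHA3` local name come from the example above.

```python
# Hypothetical private namespace (an assumption, not part of GeoNames);
# it can be mapped to an authoritative URI once one is published.
MY_VOCAB = "http://example.org/myvocab#"


def iso_alpha3_triple(feature_uri: str, code: str) -> str:
    """Return one N-Triples statement attaching an ISO alpha-3 code to a feature."""
    if not (len(code) == 3 and code.isascii() and code.isalpha() and code.isupper()):
        raise ValueError(f"not an ISO alpha-3 code: {code!r}")
    return f'<{feature_uri}> <{MY_VOCAB}isoALPHA3> "{code}" .'


print(iso_alpha3_triple("http://sws.geonames.org/3017382/", "FRA"))
```

Keeping the namespace in one constant means a later switch to an authoritative property only touches that line.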

Does that help?

Bernard
Marc
I am reluctant to include streets in the main GeoNames database since it will blow up the database and I fear we would lose focus.  

Indeed we have to think about granularity and scalability. Don't forget GeoNames scope is worldwide. It currently stores 6,500,000+ objects.
Compare that with some rough figures. The Earth's surface is about 510,000,000 km². Leaving aside the oceans, which cover about 70% of that surface, the continents amount to roughly 150,000,000 km². So with a mean distance of 1 km between named features, that is, about one feature per km², you would get some 150,000,000 features, roughly 23 times the current number. In populated areas, the density of names, if you want to drill down to the street-block-building level, can easily rise to hundreds of features per km². So going down to this level means you eventually play in the billion(s) of names. That's *A LOT*.
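
The back-of-envelope estimate can be redone in a few lines of Python. This is only a sketch using the figures quoted in this message (510,000,000 km², 70% oceans, 6,500,000 objects); the function name `feature_count` and the choice of one feature per km² for a 1 km spacing are assumptions of the sketch.

```python
EARTH_SURFACE_KM2 = 510_000_000  # figure quoted in the message
LAND_PERCENT = 30                # oceans taken as 70% of the surface
CURRENT_FEATURES = 6_500_000     # GeoNames object count quoted in the message

# Integer arithmetic keeps the result exact.
land_km2 = EARTH_SURFACE_KM2 * LAND_PERCENT // 100


def feature_count(area_km2: int, features_per_km2: int) -> int:
    """Number of named features for a given area and density."""
    return area_km2 * features_per_km2


# A mean spacing of 1 km is taken as one feature per km2 (assumption).
at_one_km = feature_count(land_km2, 1)
ratio = at_one_km / CURRENT_FEATURES

print(f"land area: {land_km2:,} km2")
print(f"features at 1 km spacing: {at_one_km:,} ({ratio:.1f} times the current count)")
```

The same `feature_count` call with a density in the hundreds, applied to whatever area one considers densely populated, gives the street-level order of magnitude.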
So yes, let's focus for now on quality rather than quantity. GeoNames should provide a general framework down to the kilometric scale as it is now : cities, populated places, and certainly important landmarks in big cities. Having at this level Tour Eiffel and Sydney's Opera makes sense.
But the new police office at Le Bois-d'Oingt, http://www.geonames.org/6545092/ ... well I'm not sure.
Of course what is a landmark or not is a big question with no definitive answer, but reasonable guidelines can be found.

The Semantic Web framework in which Geonames has started to play provides the tools for specific extensions of this general database, with a finer granularity (regional or thematic).
My view on this in a nutshell

As soon as different applications/users want to use geonames data, they will ask for extension or modification of classification/taxonomy/types (whatever you want to call them). This can be endless, and end up with thousands of categories and counting. See dmoz, and what I wrote about it a few years ago at http://perso.orange.fr/universimmedia/nohi/enohip3.htm

So what we need in the long run is the ability for users to index/classify geonames features against a variety of concepts and concept schemes (this is what they are called in the SKOS standard, which is used to express the feature types in the geonames ontology), and to extend those schemes in a collaborative way. It seems to me that this could be achieved through some kind of wiki-like interface, the tricky part being to synchronize the geonames database with it. This is an architecture to invent, but certainly the technical components are out there.
Gabi
I think it is better to use an 'official language' related to an area below the country level (region, province... ). It can be more accurate than using one language per country.  

Agreed, and when I propose that an official language is a language "endorsed by local authorities", local means local, so it can be at a country level, or regional level. In France, Corsican names are now used 'officially' in Corsica, and so are Breton names in Brittany, along with the French ones. So e.g., http://www.geonames.org/3038334/ should have both Ajaccio (FR) and Aiacciu (CO) as official names. This is quite recent, though: in the 70's, local-language activists were tagging the official French signposts with the Corsican or Breton names ...

marc wrote:

Maybe we should add a field 'languages spoken at this place' in the order of importance? The first language will be the 'official' one. What do you think?
 

I think the list of local languages is a good idea, but the notions of "order of importance" and "first language" are definitely bad ideas. Such notions might be very contentious in many places, where "native" language(s) have been fighting for ages with "official" languages, often imposed from above or abroad by history, colonization ... These are very touchy subjects and, being aware that such conflicts exist, we are in no position to declare "the winner is ...". So my 2 cents proposal is as follows:

1. A name is an "official name" when it is endorsed by local authorities. There should be one and only one such name in each local official language, and they should not be sorted by rank.
2. For other languages (non-official) we should have the notion of "preferred name" - at most one in each language, the rule in multilingual thesauri. For example "Rome" would be the preferred name of "Roma" in French. In RDF "official name" would be a subproperty of "preferred name".
3. The "name" (current value of geonames:name in RDF) is actually a "display name". It is the result of a consensus, whose rules may vary and depend on a lot of parameters. It may or may not be one of the official names. The fact that the language of this "display name" is not specified is not a bug, it's a feature, and should be kept. In the best of worlds, actually, we would not need to define that "display name" at all: it would be the preferred name in the preferred language of the user interface, based on content negotiation.
4. A true "alternate name" is any name which is not preferred. Its language should generally be specified, but with exceptions. Think of airport codes for airports, which bear no language.
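
The four rules above can be sketched as a small data model. This is only an illustration, not the GeoNames schema: the class `Name`, its fields, and the helpers `check_names` and `display_name` are all hypothetical names. The sketch treats "official" as implying "preferred" (mirroring the subproperty idea in rule 2), flags a second preferred name in the same language, and picks the display name from the user's language with a fallback.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Name:
    text: str
    lang: Optional[str] = None  # None for language-less names, e.g. airport codes
    official: bool = False      # endorsed by local authorities (rule 1)
    preferred: bool = False     # at most one per language (rule 2)

    @property
    def is_preferred(self) -> bool:
        # An official name is also a preferred name.
        return self.official or self.preferred


def check_names(names: List[Name]) -> List[str]:
    """Return rule violations; an empty list means the set is consistent."""
    problems = []
    seen_langs = set()
    for n in names:
        if n.is_preferred and n.lang is not None:
            if n.lang in seen_langs:
                problems.append(f"second preferred name for {n.lang!r}: {n.text!r}")
            seen_langs.add(n.lang)
    return problems


def display_name(names: List[Name], user_lang: str, fallback: str) -> str:
    """Preferred name in the user's language, else the consensus fallback (rule 3)."""
    for n in names:
        if n.is_preferred and n.lang == user_lang:
            return n.text
    return fallback


rome = [Name("Roma", "it", official=True), Name("Rome", "fr", preferred=True)]
print(check_names(rome))
print(display_name(rome, "fr", fallback="Roma"))
```

Anything in the list that is not preferred is, by rule 4, an alternate name, so no separate flag is needed for it. Official names carry no rank here, in line with rule 1.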

OK - autosave indeed. Cool.
I'm OK with helping to update this file. But to save, you need to log in. Is it the same login as the general geonames login?
Marc

Well done, once again. BTW the "Guillestre" search you give as an example made me discover "Molandier" ... which has nothing to do with "Canton de Guillestre", to which it was attached by mistake in Wikipedia. I just corrected that in Wikipedia, so I can see when the geonames full text search gets back in sync. I guess the query is on your local dump of Wikipedia, not a runtime query on Wikipedia. Right?
A complement on that story. The Romanian government asked for the change from "ROM" to "ROU" to avoid any confusion (intentional or not) between Romanians and Gypsies, usually called "Rom" throughout Europe - or any of the more or less derogatory variants (in France you will hear e.g. "Romas", "Romanichels", "Manouches" etc ...)
This is a very sensitive cultural/linguistic issue, actually.
See http://en.wikipedia.org/wiki/Roma_people
Other suggestions :

Coordinates (0,0)

Mount Everest, North (or South) Pole, or any other extreme point. See http://en.wikipedia.org/wiki/Extreme_points_of_the_world

Also : United Nations Building, either in NY or Geneva
Thanks a lot Harry and Richard for review and remarks. See my answers on your respective blogs.
I'm happy to announce that the geonames ontology is now available where it belongs at http://www.geonames.org/ontology/
And congratulations to Marc who set up the matching RDF Web Service in no time.

NB : The test files at universimmedia are not available any more.
Unfortunately I don't speak those languages

And were it only for IE, I would gladly forget it

But, as written above, I also find the issue when downloading admin1Codes.txt and opening it with any text or XML editor at hand (UltraEdit, XML Spy ...), even when choosing the UTF-8 options. So... I don't know. We would need native speakers around for a variety of languages, I'm afraid. What is the source of all those names, BTW?
Uniqueness of the feature id is indeed important, and having the same feature displayed more than once on a map could be confusing. Of course, from a display viewpoint, having an extended feature displayed as a point is suboptimal, for streams as well as for countries or regions, but this is inherent to cartography. As for streams in particular, moving the pin a little (half a mile or so) upstream of the confluence rather than placing it at the confluence itself certainly makes things less ambiguous.
Côte d'Azur is OK now but there are still a lot of other names to clean up, e.g. at http://www.geonames.org/maps/showOnMap?q=Abū. There again everything is OK in Firefox, and there are a lot of things like Abū Z̧aby in IE (well, if you look at this thread in Firefox, you don't see the problem at all ... )
Follow-up of a remark I made on my region of Provence Alpes Côte d'Azur, which shows in the map interface as "Provence-Alpes-Côte dʼAzur" with a weird ʼ instead of ' in Internet Explorer. In Firefox I have a correct display.
Same through the web service
http://ws.geonames.org/countrySubdivision?lat=44.5&lng=6.5

So I downloaded the admin1Codes.txt file, and found out that it was not only a browser issue, since I found indeed :
FR.B8 Provence-Alpes-Côte dʼAzur in this file.
among many other occurrences of badly encoded characters in various languages. I checked in various text editors and have the same issue, although the file seems to be recognized as UTF-8 encoded.

Do others have the same issue? Any clue on how to fix that? Is it something wrong in the files, or in my machine (could be, it's a new one and maybe some settings are to be fixed).
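
One way to narrow this down is to list which non-ASCII characters the string actually contains, rather than trusting how an editor or browser renders them. A minimal Python sketch follows; the helper name `describe_non_ascii` is made up, and the sample string assumes that the odd apostrophe is code point U+02BC, which would need to be confirmed against the real file. If that assumption holds, the file could be valid UTF-8 and the display problem might come from fonts lacking that glyph, but this is a guess.

```python
import unicodedata


def describe_non_ascii(text: str):
    """Yield (char, code point, Unicode name) for each distinct non-ASCII character."""
    seen = set()
    for ch in text:
        if ord(ch) > 127 and ch not in seen:
            seen.add(ch)
            yield ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")


# Assumed content of the FR.B8 entry; the apostrophe is written as an explicit
# escape so the sketch does not depend on how this page itself is encoded.
sample = "Provence-Alpes-C\u00f4te d\u02bcAzur"

for ch, code_point, name in describe_non_ascii(sample):
    print(code_point, name)
```

Running the same function over each line of admin1Codes.txt, opened with `encoding="utf-8"`, would give an inventory of the characters to check.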
Hi Dan

I suppose Marc will answer. I have had the same question in mind, but my best guess is that administrative divisions are computed from the coordinates, and hence non editable (as the subdivisions themselves are). And of course, since they are calculated, I also guess that the limits of subdivisions are defined as polylines at best, so a certain level of inaccuracy on the borders is inevitable.
So it would indeed be good, given this inaccuracy on the borders, if it could be corrected through the user interface.
See a consolidated proposal at http://perso.orange.fr/universimmedia/geo/geonames_ontology_v1.0.rdf
I have simplified the model : no more subclasses of features, the typing is made using properties "featureClass" and "featureCode", pointing to "Class" and "Code" respectively, which are themselves subclasses of "skos:ConceptScheme" and "skos:Concept" respectively. So all classes and codes are defined in a SKOS vocabulary. This seems closer to the actual data structure and "spirit" of geonames.
All feature codes are included in the ontology file.
I also updated examples accordingly at http://perso.orange.fr/universimmedia/geo/geonames_examples.rdf
Previous versions are not online anymore to avoid confusion.
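
A rough sketch of the simplified model described in this message, written as plain Python tuples rather than with an RDF library. The SKOS, RDF and RDFS namespaces are the standard W3C ones; the `GN` namespace, the feature URI, the code identifier `P.PPL` and the helper `type_feature` are placeholders assumed for illustration, and the `Feature` class name is borrowed from the v0.1 message further down this page. The actual URIs are in the ontology file itself.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
GN = "http://example.org/geonames-ontology#"  # placeholder, not the real namespace

# Schema level: "Class" and "Code" specialise the SKOS notions.
schema = [
    (GN + "Class", RDFS + "subClassOf", SKOS + "ConceptScheme"),
    (GN + "Code", RDFS + "subClassOf", SKOS + "Concept"),
]


def type_feature(feature_uri: str, feature_class: str, feature_code: str):
    """Type a feature through properties instead of through subclasses of Feature."""
    return [
        (feature_uri, RDF + "type", GN + "Feature"),
        (feature_uri, GN + "featureClass", GN + feature_class),
        (feature_uri, GN + "featureCode", GN + feature_code),
    ]


# Hypothetical populated place; the identifiers are illustrative only.
triples = type_feature("http://example.org/feature/1", "P", "P.PPL")
for s, p, o in schema + triples:
    print(f"<{s}> <{p}> <{o}> .")
```

The point of the design is that adding a feature code means adding a concept to the vocabulary, not a class to the ontology.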
Overnight got some new ideas about it, so it's version 0.1 already
http://perso.orange.fr/universimmedia/geo/geonames_v0.1.rdf
The vocabulary in v0 could be confusing vs geonames current terminology. So I replaced "Place" class, by "Feature", and "fCode" is an object property of "Feature" with range "FeatureCode".
Also replaced "isPartOf" by "parentFeature", but not sure it makes sense. What is the actual relationship now in geonames database between the two avatars of "PACA" region : the feature "Provence Alpes Côte d'Azur" identified by http://www.geonames.org/maps/geonameId=2985244
and the "Provence-Alpes-Côte dʼAzur" subdivision (with some encoding issue, BTW) retrieved by http://ws.geonames.org/countrySubdivision?lat=44&lng=6 ? Same issue with the country France and the feature French Republic.
I'm not sure there is any relationship at all between those. So the hierarchy of features France > PACA > Hautes-Alpes > Embrun that I have put in the instances is maybe only wishful thinking.
I've also got rid of wikipedia links for the moment, since there again the relationship with geonames features is unclear to me. Thought about including a "nearbyWikipedia" property with the default option ...
 