| Author |
Message |
14/09/2010 10:41:56
|
Tal Weiss
Joined: 14/09/2010 10:34:02
Messages: 1
Offline
|
Hey all,
I searched the daily dump for hotels.com data and found only 20K hotels.
(I filtered for hotels ('HTL') and used the RDF web service to check whether each one contains a hotels.com URL.)
I am an affiliate of hotels.com and I know they have almost 100K hotels.
Why are (most) hotels missing the URL?
Are there any plans to refresh the data?
How can we (the users) get notifications when the data is refreshed? What is the date of the last bulk upload?
Thanks for this great service!
Tal.
|
|
|
 |
15/09/2010 09:34:10
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4501
Offline
|
I count 63k IAN (travel.ian.com) hotel URLs in the GeoNames database.
There are no plans to refresh the hotel data. It is a lot of effort to eliminate duplicates and match hotels. If users of the premium dump show interest in this then it could be done, otherwise I don't think it merits top priority:
http://geonames.wordpress.com/2010/09/13/premium-dump/
Cheers
Marc
|
 |
|
|
 |
16/09/2010 21:17:34
|
talweiss
Joined: 16/09/2010 21:08:58
Messages: 1
Offline
|
[I had to register a new user because for some reason I can neither log in nor reset my password...]
Thanks for the fast reply Marc,
Can you please help me see these hotels?
I used the daily allCountries.zip dump.
I filtered only the locations of type "HTL".
Sadly, I then had to use a web service: for each hotel I looked at http://sws.geonames.org/{}/about.rdf
I scanned the reply for 'travel.ian.com'.
* How many HTLs are there in the daily dump? I see 90K.
* Are there more (hotels.com) hotels listed as a different type (e.g. a resort? A spa? Something else entirely?)
* Is it possible that the RDF result of a hotels.com hotel would not contain 'travel.ian.com'?
* Can I download the hotels.com URLs from somewhere in the daily dump and not burden your web service?
Thanks!
Tal.
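The HTL filtering step described above can be sketched roughly as follows. This is a minimal sketch, not the poster's actual script: it assumes the unzipped dump is the usual tab-separated `allCountries.txt` with the geonameid in the first column and the feature code in the eighth. The function name `hotel_ids` and the second sample row are made up for illustration.

```python
# Column layout assumed for allCountries.txt (tab-separated):
# geonameid, name, asciiname, alternatenames, latitude, longitude,
# feature class, feature code, country code, ...
GEONAME_ID_COL = 0
FEATURE_CODE_COL = 7


def hotel_ids(lines):
    """Return the geonameids of all rows whose feature code is 'HTL'."""
    ids = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip short or malformed rows instead of failing on them.
        if len(fields) > FEATURE_CODE_COL and fields[FEATURE_CODE_COL] == "HTL":
            ids.append(int(fields[GEONAME_ID_COL]))
    return ids


# Two illustrative rows: the first uses values quoted later in this thread,
# the second is a hypothetical non-hotel row.
SAMPLE_ROWS = [
    "6533005\tMarriott Exec Apts Dubai Creek\tMarriott Exec Apts Dubai Creek"
    "\t\t25.25\t55.2666\tS\tHTL\tAE\n",
    "1\tExample Town\tExample Town\t\t0.0\t0.0\tP\tPPL\tXX\n",
]

print(hotel_ids(SAMPLE_ROWS))

# Against the real dump, something like:
#   with open("allCountries.txt", encoding="utf-8") as f:
#       print(len(hotel_ids(f)))
```

Passing the open file object keeps memory use flat, since the rows are read one at a time rather than loaded all at once.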
|
|
|
 |
17/09/2010 08:42:14
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4501
Offline
|
There is a full RDF dump: http://www.geonames.org/ontology/
The booking URLs are rather off-topic and I don't want to spend time supporting them.
Marc
|
 |
|
|
 |
26/09/2010 17:29:04
|
major.tal
Joined: 26/09/2010 17:25:03
Messages: 18
Offline
|
The RDF dump does NOT contain any booking URLs!
I downloaded the (huge) file and unzipped all 6.5 GB, but the URLs are missing!
This is an entry from the file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?><rdf:RDF xmlns="http://www.geonames.org/ontology#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#"><Feature rdf:about="http://sws.geonames.org/6533005/"><name>Marriott Exec Apts Dubai Creek</name><featureClass rdf:resource="http://www.geonames.org/ontology#S"/><featureCode rdf:resource="http://www.geonames.org/ontology#S.HTL"/><inCountry rdf:resource="http://www.geonames.org/countries/#AE"/><wgs84_pos:lat>25.25</wgs84_pos:lat><wgs84_pos:long>55.2666</wgs84_pos:long><parentFeature rdf:resource="http://sws.geonames.org/292224/"/><nearbyFeatures rdf:resource="http://sws.geonames.org/6533005/nearby.rdf"/><locationMap rdf:resource="http://www.geonames.org/6533005/marriott-exec-apts-dubai-creek.html"/></Feature></rdf:RDF>
and this is the entry for the same entity from the web service:
<?xml version="1.0" encoding="UTF-8" standalone="no"?><rdf:RDF xmlns:cc="http://creativecommons.org/ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:gn="http://www.geonames.org/ontology#" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#"><gn:Feature rdf:about="http://sws.geonames.org/6533005/"><rdfs:isDefinedBy>http://sws.geonames.org/6533005/about.rdf</rdfs:isDefinedBy><rdfs:label>Marriott Exec Apts Dubai Creek</rdfs:label><gn:featureClass rdf:resource="http://www.geonames.org/ontology#S"/><gn:featureCode rdf:resource="http://www.geonames.org/ontology#S.HTL"/><gn:countryCode>AE</gn:countryCode><wgs84_pos:lat>25.25</wgs84_pos:lat><wgs84_pos:long>55.2666</wgs84_pos:long><gn:parentCountry rdf:resource="http://sws.geonames.org/290557/"/><gn:parentADM1 rdf:resource="http://sws.geonames.org/292224/"/><gn:nearbyFeatures rdf:resource="http://sws.geonames.org/6533005/nearby.rdf"/><locationMap rdf:resource="http://www.geonames.org/6533005/marriott-exec-apts-dubai-creek.html"/><foaf:page rdf:resource="http://travel.ian.com/index.jsp?pageName=hotInfo&cid=185530&locale=en_UK&hotel=1&hotelID=203576"/></gn:Feature><foaf:Document rdf:about="http://sws.geonames.org/6533005/about.rdf"><foaf:primaryTopic rdf:about="http://sws.geonames.org/6533005/"/><cc:license rdf:resource="http://creativecommons.org/licenses/by/3.0/"/><cc:attributionURL rdf:resource="http://sws.geonames.org/6533005/"/><cc:attributionName rdf:datatype="http://www.w3.org/2001/XMLSchema#string">GeoNames</cc:attributionName><dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2007-04-25</dcterms:created><dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2007-04-14</dcterms:modified></foaf:Document></rdf:RDF>
Please help!
Thanks,
Tal.
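The difference between the two entries can also be checked programmatically rather than by eye. The following is a rough sketch using only Python's standard library. It assumes the booking link always appears as a `foaf:page` element whose `rdf:resource` contains 'travel.ian.com', as in the web service reply above. The helper name `booking_urls` and the two abbreviated sample documents are illustrative, not taken from the dump itself.

```python
import xml.etree.ElementTree as ET

# Fully qualified names, in ElementTree's {namespace}local notation.
FOAF_PAGE = "{http://xmlns.com/foaf/0.1/}page"
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"


def booking_urls(rdf_xml, marker="travel.ian.com"):
    """Return every foaf:page URL in one RDF/XML entry that contains `marker`."""
    root = ET.fromstring(rdf_xml)
    urls = []
    for page in root.iter(FOAF_PAGE):
        url = page.get(RDF_RESOURCE, "")
        if marker in url:
            urls.append(url)
    return urls


# Abbreviated web-service-style entry (URL shortened for the example;
# '&' must be written as '&amp;' for the XML to be well-formed).
WEB_SERVICE_SAMPLE = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns:foaf="http://xmlns.com/foaf/0.1/" '
    'xmlns:gn="http://www.geonames.org/ontology#">'
    '<gn:Feature rdf:about="http://sws.geonames.org/6533005/">'
    '<foaf:page rdf:resource="http://travel.ian.com/index.jsp?'
    'pageName=hotInfo&amp;hotelID=203576"/>'
    "</gn:Feature></rdf:RDF>"
)

# Abbreviated dump-style entry: same feature, no foaf:page element.
DUMP_SAMPLE = (
    '<rdf:RDF xmlns="http://www.geonames.org/ontology#" '
    'xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    '<Feature rdf:about="http://sws.geonames.org/6533005/">'
    "<name>Marriott Exec Apts Dubai Creek</name>"
    "</Feature></rdf:RDF>"
)

print(booking_urls(WEB_SERVICE_SAMPLE))
print(booking_urls(DUMP_SAMPLE))
```

Matching on the namespaced element rather than doing a plain substring search avoids false hits if 'travel.ian.com' ever appears elsewhere in an entry, at the cost of requiring well-formed XML.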
|
|
|
 |
29/09/2010 07:56:57
|
major.tal
Joined: 26/09/2010 17:25:03
Messages: 18
Offline
|
Could you please answer?
The RDF dump does NOT contain any booking URLs!
|
|
|
 |
03/10/2010 14:53:27
|
major.tal
Joined: 26/09/2010 17:25:03
Messages: 18
Offline
|
Marc,
Can you please reply, even if you are not planning to fix the faulty RDF dump, so that I can make progress on my project?
Thanks,
Tal.
|
|
|
 |
04/10/2010 10:22:19
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4501
Offline
|
I have already answered that they are off-topic and I don't want to spend any time debugging this. I will remove the links from the web service to make it consistent with the dump.
Updating the hotels.com data would have far higher priority, but it requires a lot of work to clean the addresses and avoid duplicates. I don't have time and I don't have the impression you are willing to help and contribute anything.
Marc
|
 |
|
|
 |
05/10/2010 00:12:55
|
major.tal
Joined: 26/09/2010 17:25:03
Messages: 18
Offline
|
Hi Marc,
I'm sorry you got that impression, but it is quite wrong. I'm 100% willing to do the renewal, cleaning, matching and merging of these ~100K locations (there are a lot of duplicates!). In fact, I have already started this task.
Please let me know how we can communicate directly.
I'm sure I can contribute on many other issues as well.
My contact details are here: http://www.google.com/profiles/major.tal#contact
...and please don't erase the data from the web service.
Thanks,
Tal.
|
|
|
 |
13/10/2010 08:22:09
|
marc
Joined: 08/12/2005 07:39:47
Messages: 4501
Offline
|
Thanks for offering to help. I will prepare an export with addresses and IDs so that you can do the cleanup and merge.
Best
Marc
|
 |
|
|
 |
|
|