GeoNames Forum
How to process the GeoNames RDF file?
wouter
Joined: 25/03/2020 16:54:32
Messages: 1

I tried to use the GeoNames RDF file, but it does not seem to be valid RDF.

This can be tested with the following command:

```
$ curl 'http://download.geonames.org/all-geonames-rdf.zip' | gunzip | head
```

This shows that the GeoNames RDF file contains snippets of RDF/XML interspersed with loose URLs (see below). Since this is a non-standard format, I assume that there is a common procedure or script to transform this file into a valid RDF file.

```
https://sws.geonames.org/3/
<?xml version="1.0" encoding="UTF-8" standalone="no"?><rdf:RDF xmlns:cc="http://creativecommons.org/ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:gn="http://www.geonames.org/ontology#" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#"> <gn:Feature rdf:about="https://sws.geonames.org/3/"> <rdfs:isDefinedBy rdf:resource="https://sws.geonames.org/3/about.rdf"/> <gn:name>Zamīn Sūkhteh</gn:name> <gn:alternateName xml:lang="fa">زمين سوخته</gn:alternateName> <gn:alternateName xml:lang="fa">Zamīn Sūkhteh</gn:alternateName> <gn:featureClass rdf:resource="https://www.geonames.org/ontology#S"/> <gn:featureCode rdf:resource="https://www.geonames.org/ontology#S.CRRL"/> <gn:countryCode>IR</gn:countryCode> <wgs84_pos:lat>32.45831</wgs84_pos:lat> <wgs84_pos:long>48.96335</wgs84_pos:long> <gn:parentFeature rdf:resource="https://sws.geonames.org/3202991/"/> <gn:parentCountry rdf:resource="https://sws.geonames.org/130758/"/> <gn:parentADM1 rdf:resource="https://sws.geonames.org/127082/"/> <gn:nearbyFeatures rdf:resource="https://sws.geonames.org/3/nearby.rdf"/> <gn:locationMap rdf:resource="https://www.geonames.org/3/zamin-sukhteh.html"/> </gn:Feature></rdf:RDF>
https://sws.geonames.org/4/
<?xml version="1.0" encoding="UTF-8" standalone="no"?><rdf:RDF xmlns:cc="http://creativecommons.org/ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:gn="http://www.geonames.org/ontology#" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#"> <gn:Feature rdf:about="https://sws.geonames.org/4/"> <rdfs:isDefinedBy rdf:resource="https://sws.geonames.org/4/about.rdf"/> <gn:name>Rūdkhāneh-ye Āb-e Zālek</gn:name> <gn:alternateName xml:lang="fa">رودخانه آب زالک</gn:alternateName> <gn:alternateName xml:lang="fa">Āb-e Zālakī</gn:alternateName> <gn:alternateName>Rūdkhāneh-ye Āb-e Zālek</gn:alternateName> <gn:alternateName xml:lang="fa">Rūdkhāneh-ye Zākalī</gn:alternateName> <gn:alternateName>Rūdkhāneh-ye Āb-e Zālekī</gn:alternateName> <gn:alternateName xml:lang="fa">رودخانه زاکلی</gn:alternateName> <gn:alternateName xml:lang="fa">رودخانه آب زالکی</gn:alternateName> <gn:featureClass rdf:resource="https://www.geonames.org/ontology#H"/> <gn:featureCode rdf:resource="https://www.geonames.org/ontology#H.STM"/> <gn:countryCode>IR</gn:countryCode> <wgs84_pos:lat>32.93273</wgs84_pos:lat> <wgs84_pos:long>48.76505</wgs84_pos:long> <gn:parentFeature rdf:resource="https://sws.geonames.org/127082/"/> <gn:parentCountry rdf:resource="https://sws.geonames.org/130758/"/> <gn:parentADM1 rdf:resource="https://sws.geonames.org/127082/"/> <gn:nearbyFeatures rdf:resource="https://sws.geonames.org/4/nearby.rdf"/> <gn:locationMap rdf:resource="https://www.geonames.org/4/rudkhaneh-ye-ab-e-zalek.html"/> </gn:Feature></rdf:RDF>
```
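Before converting anything, it may help to confirm that the whole dump really follows the layout seen in the sample: a URL on every odd line and a complete RDF/XML document on every even line. The following is a minimal sketch under that assumption; `check_dump_layout` is a hypothetical helper name, and the check only looks at how each line begins, it does not parse the XML.

```shell
#!/bin/sh
# Sketch: verify that a dump alternates between URL lines and RDF/XML lines.
# check_dump_layout is an illustrative name, not something GeoNames provides.
# Reads the dump on stdin; prints a record count, or the first offending line.
check_dump_layout() {
  awk '
    # Odd lines are assumed to hold the feature URL.
    NR % 2 == 1 && $0 !~ /^https?:\/\// { print "line " NR ": expected a URL"; bad = 1; exit }
    # Even lines are assumed to hold one RDF/XML document.
    NR % 2 == 0 && $0 !~ /^<\?xml /     { print "line " NR ": expected RDF/XML"; bad = 1; exit }
    END {
      if (bad) exit 1
      if (NR % 2 == 1) { print "line " NR ": URL without a document"; exit 1 }
      print "ok, records: " (NR / 2)
    }
  '
}
```

The function reads standard input, so it can take the place of `head` at the end of the pipeline shown earlier. A non-zero exit status means the layout assumption does not hold for the whole file.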
zcw100
Joined: 28/09/2019 15:28:16
Messages: 17

It's a somewhat strange format: each URL is followed by the RDF/XML for that URL. I put together a short bash script that converts it to a single N-Triples file; I'll post it when I find it. The conversion takes a while, maybe a day, and results in a file of about 600 MB compressed.

I've asked several times for the mappings used to generate the RDF, but for some inexplicable reason they won't share them.
zcw100
Joined: 28/09/2019 15:28:16
Messages: 17

Here's that script. You'll need the raptor2 library for the rapper parser.

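#!/bin/bash
# Usage: pass the unzipped dump file(s) as arguments; N-Triples go to stdout.

# awk keeps the even-numbered lines, which hold the RDF/XML documents.
while IFS= read -r doc; do
  # rapper reads RDF/XML and writes N-Triples by default; parse errors are appended to ./errors.
  rapper --quiet <(printf '%s\n' "$doc") 2>> errors
done < <(awk 'NR % 2 == 0' "$@")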
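Since the conversion reportedly takes about a day, one option is to cut the dump into smaller pieces first and convert them separately. The sketch below assumes a POSIX `split`; `split_dump` and the `chunk.` prefix are hypothetical names chosen for illustration. The only important detail is that each piece must hold an even number of lines, so that a URL is never separated from its RDF/XML document.

```shell
#!/bin/sh
# Sketch: cut the dump into pieces made of whole URL + RDF/XML pairs.
# split_dump and the "chunk." prefix are illustrative names.
# Usage: split_dump DUMP_FILE PAIRS_PER_CHUNK OUTPUT_DIR
split_dump() {
  dump=$1
  pairs=$2
  outdir=$3
  mkdir -p "$outdir" || return 1
  # Two lines per record, so the line count per piece is always even.
  # -a 4 gives four-letter suffixes (chunk.aaaa, chunk.aaab, ...) to leave room for many pieces.
  split -a 4 -l "$((pairs * 2))" "$dump" "$outdir/chunk."
}
```

Each piece keeps the layout the script above expects, so it can be passed to that script as an argument. If several pieces are converted at the same time, each run should be started in its own working directory, because the script writes its error log to a fixed file name.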
 