Hi, I found the solution. I'm posting it in the hope that it helps someone.
As Marc mentioned a while ago (sorry, I can't find the link again):
alternateNames.txt lists the alternate names for everything, in all languages.
So, for each country, I take its geonameid and grep the matching lines in the alternateNames.txt file.
I focused on the languages of the country, so the Python code ends up like this:
Code:
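import subprocess
import pymongo

dbuniverse = pymongo.MongoClient()['UNIVERSE']
colcountry = dbuniverse['countries']
altcountryfile = '/Users/colin/Downloads/alternateNames/alternateNames.txt'

cursor = colcountry.find({'ISO': 'BE'})
for doc in cursor:
    cn = doc['Country']
    gid = str(doc['geonameid'])
    langs = doc['Languages']
    print(cn, langs)
    for l in langs.split(','):
        ll = l.split('-')[0]  # keep only the base language, e.g. 'nl-BE' -> 'nl'
        # pattern "<geonameid><TAB><language>"
        togrep = '%s\t%s' % (gid, ll)
        p = subprocess.run(['grep', togrep, altcountryfile],
                           stdout=subprocess.PIPE, encoding='utf-8')
        for r in p.stdout.splitlines():
            # columns: alternateNameId, geonameid, isolanguage, alternate name, ...
            data = r.split('\t')
            # grep matches substrings too (e.g. a longer id ending in gid),
            # so check both fields exactly
            if data[1] != gid or data[2] != ll:
                continue
            alt_n = data[3]
            print(alt_n)
            colcountry.update_one({'geonameid': doc['geonameid']},
                                  {'$addToSet': {'altNames': alt_n}})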
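
If grep is not available, or if spawning one process per language is too slow, the same filtering can be done in plain Python. This is only a minimal sketch: the function name alt_names is hypothetical, the sample rows and the geonameid 1000 are made up for illustration, and it assumes the tab-separated column order used above (alternateNameId, geonameid, isolanguage, alternate name).

```python
import io


def alt_names(lines, geonameid, languages):
    """Collect alternate names for one geonameid, grouped by base language code.

    `languages` is a comma-separated string such as 'nl-BE,fr-BE,de-BE'.
    """
    # 'nl-BE' -> 'nl', same reduction as in the grep version
    wanted = {code.split('-')[0] for code in languages.split(',')}
    gid = str(geonameid)
    found = {}
    for line in lines:
        cols = line.rstrip('\n').split('\t')
        # Assumed columns: alternateNameId, geonameid, isolanguage, alternate name, ...
        if len(cols) >= 4 and cols[1] == gid and cols[2] in wanted:
            found.setdefault(cols[2], []).append(cols[3])
    return found


# Made-up sample rows; with the real file, pass open(path, encoding='utf-8') instead.
sample = io.StringIO(
    "1\t1000\tfr\tBelgique\n"
    "2\t1000\tde\tBelgien\n"
    "3\t1000\ten\tBelgium\n"      # language not in the country's list, skipped
    "4\t21000\tfr\tAilleurs\n"    # different id that merely ends in 1000, skipped
)
result = alt_names(sample, 1000, "fr-BE,de-BE")
print(result)
```

Because the geonameid and language fields are compared exactly, a longer id that happens to end in the same digits is not picked up, which a plain substring grep would match.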
GeoNames is really great.