but there are lots of entries that don't show up. In particular, we have lots of historic houses here in Portsmouth, NH, that have Wikipedia entries, and they appear to have lat and lng as well. They show up fine in the Wikipedia layer of Google Maps.
Do you know what is going on, and if there's any way for me to get all my local geocoded Wikipedia entries?
The GeoNames Wikipedia layer has not been updated for quite some time. I assume those are entries that are new, or that did not yet have coordinates in a form understood by the parser when it was last run (many months ago).
A new extract is planned, but I cannot make any promises on when it will be available.
Could it be a related problem that many of the descriptions (within the 'summary' tag) start in the middle of a sentence and don't match the start of the Wikipedia article? Some of them can be quite confusing.
If what you say is true and the extracts are only done once every several months (really??) then maybe these descriptions will be automatically fixed on the next export? Or are there still known problems with the page parsing? Are you looking for help with the parsing?
I would have thought that such an extract would be trivial to do just by pushing a button once a week or even once a month. Is it so difficult or time-consuming that it can only be done two or three times a year?
Parsing Wikipedia is definitely not trivial. In fact it is nearly impossible. Wikipedia is not a structured data source: there are countless different templates for expressing coordinates, and the templates are changing constantly. This means that after you have invested a lot of work in writing a parser, you run it again some weeks later and miss tons of entries, because people have changed the templates (using robots) for existing articles, and you have to start messing around with the parser all over again.
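To make the template problem concrete, here is a minimal sketch of what parsing just one coordinate template looks like. It is not the GeoNames parser (which is not shown in this thread); the names `parse_coord` and `COORD_RE` are made up for illustration, the coordinates are sample values, and it only assumes two common forms of Wikipedia's `{{coord}}` template: plain decimals, and degrees/minutes/seconds with N/S/E/W letters.

```python
import re

# Hypothetical pattern: matches a single {{coord|...}} template, any letter case.
COORD_RE = re.compile(r"\{\{\s*coord\s*\|([^{}]*)\}\}", re.IGNORECASE)


def _to_decimal(parts, hemisphere):
    """Convert [deg, min, sec] (min and sec optional) to signed decimal degrees."""
    value = sum(p / 60 ** i for i, p in enumerate(parts))
    return -value if hemisphere in ("S", "W") else value


def parse_coord(wikitext):
    """Return (lat, lng) from the first {{coord}} template, or None if not understood."""
    match = COORD_RE.search(wikitext)
    if match is None:
        return None
    # Keep positional arguments only; drop named ones such as display=title.
    args = [a.strip() for a in match.group(1).split("|")]
    args = [a for a in args if a and "=" not in a]

    numbers, coords = [], []
    for arg in args:
        if arg.upper() in ("N", "S", "E", "W"):
            if not numbers or len(numbers) > 3:
                return None
            coords.append(_to_decimal(numbers, arg.upper()))
            numbers = []
        else:
            try:
                numbers.append(float(arg))
            except ValueError:
                break  # trailing parameters such as type:landmark

    if len(coords) == 2:
        return coords[0], coords[1]      # hemisphere form
    if not coords and len(numbers) == 2:
        return numbers[0], numbers[1]    # plain decimal form
    return None


print(parse_coord("{{coord|43.0718|-70.7626|display=title}}"))
print(parse_coord("{{Coord|43|04|18|N|70|45|45|W|type:landmark}}"))
# An infobox with its own latitude/longitude fields is silently missed:
print(parse_coord("{{Infobox building|latitude=43.07|longitude=-70.76}}"))
```

The last call returns `None`, which is the whole difficulty in miniature: every other template or infobox field that carries coordinates needs its own handling, and any of them can be renamed or restructured between runs.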
So what does that mean, that it's a hopeless task?
Are you looking for help with the parsing? Is the parsing technique or code available or documented anywhere? Do you work on the wiki source or the generated HTML? And is it really as exhausting a task to run as you make it sound?