Reverse Geo: Geonames.org - how are requests made to server from IM?

Started by Jingo, August 27, 2018, 04:57:53 PM

Jingo

Hi.. quick question Mario... I tried reverse geocoding 2 groups of images... one that contained 86 and another that contained 749 or so images using the geonames.org service.  The 86 images worked just fine but when I selected the 749, it ran for a bit and then IM produced an info box stating my 2000 hourly credit was expired and none of the 749 images were done.  I was able to subsequently perform the reverse lookup on a single image ok.

So - 2 questions...

1 - does IM group the requests together after sending a bulk request and this is why none of the 749 images had the reverse geo data applied before hitting the limit?
2 - Just wondering why 835 images would cause me to go over the hourly 2000 credit limit... assume it has a lot to do with the definition of credits:  ".... the hourly limit is 2000 credits. A credit is a web service request hit for most services."  Is IM sending a web service request "per-metadata" and not "per image file"?

Thx Mario... 

Mario

Each image is processed individually because it may contain different coordinates. GN takes one coordinate pair per request, and the reverse geo-coding dialog has no code to group images with "similar" or nearby coordinates together. Actually, it once did, but I removed that for reasons I don't fully remember right now. I think the problem was that it replaced individually set location metadata with the metadata of the last file processed with the same coordinates, which could wipe out data. It also caused issues with files having different sets of created/shown coordinates etc. I may need to look at this again, in light of the 2,000 credits per hour limit...

IMatch needs to call several GN endpoints to fetch all required data, e.g. address details or elevation data. The number of calls depends on the results from previous calls.

The 2000 limit per hour seems to be new for the unpaid free accounts on the GN side. We have never seen that error message before.

On their web site they say:

https://www.geonames.org/export/

30'000 credits daily limit per application (identified by the parameter 'username'), the hourly limit is 2000 credits.
A credit is a web service request hit for most services. An exception is thrown when the limit is exceeded.


If IMatch needs 2 or 3 requests per file to gather the required data, this limit is quickly reached. As I said, I don't recall that limit from before, must be new or just recently being enforced.
If you have to geo-code so many files, consider switching to a paid account.
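
For anyone scripting against GeoNames directly, here is a minimal sketch of how a client might recognise the credit-limit error in a JSON response. It assumes the JSON error shape `{"status": {"message": ..., "value": ...}}` and the status codes 18, 19 and 20 for the daily, hourly and weekly limits, as listed in the GeoNames web service documentation. The helper name and the sample message are made up for illustration, and this is not how IMatch handles it.

```python
import json

# Status codes assumed from the GeoNames web service documentation:
# 18 = daily, 19 = hourly, 20 = weekly credit limit exceeded.
LIMIT_CODES = {18: "daily", 19: "hourly", 20: "weekly"}


def credit_limit_hit(response_text):
    """Return 'daily', 'hourly' or 'weekly' if the response reports an
    exceeded credit limit, otherwise None. Hypothetical helper."""
    payload = json.loads(response_text)
    if not isinstance(payload, dict):
        return None
    status = payload.get("status")
    if not isinstance(status, dict):
        return None  # normal result, no error block
    return LIMIT_CODES.get(status.get("value"))


if __name__ == "__main__":
    # Invented sample response, shaped like the assumed error format.
    sample = '{"status": {"message": "hourly limit exceeded", "value": 19}}'
    print(credit_limit_hit(sample))
```

A client that sees this result could stop sending requests for the rest of the hour instead of burning further calls.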

Jingo

Quote from: Mario on August 27, 2018, 06:04:38 PM
IMatch needs to call several GN endpoints to fetch all required data, e.g. address details or elevation data. The number of calls depends on the results from previous calls.

The 2000 limit per hour seems to be new for the unpaid free accounts on the GN side.


I was afraid that might be the answer... seems these services are placing some major restrictions on getting map/geo/location data now... 2000 images per hour seems more than fair... 2000 token requests per hour is not good...  trying to process 100 images from a session and running out of requests seems a bit on the low side to me.  Since there is no "dashboard" to see how many requests you have left, nor any way to tell how many requests a reverse geo for a selection of images in IM will use, the userbase is left hoping it all works out.  Also, since the reverse geo fails for all images if you are even 1 request over the limit, we will need to make smaller image selections and hope that group works.

Thx for the info Mario....

Mario

The dialog currently caches all retrieved data and then updates all files in one transaction. Hence it either succeeds for all files or fails for all files.
As I said, the 2,000 requests per hour limit seems to be new. IMatch needs at least 2 requests per file, sometimes 3, if one of the other requests does not yield an answer. This code was written for IMatch 5 and has worked all these years. Maybe when I re-learn the GN API I can get away with two requests per file (there is no endpoint which retrieves all the data needed for the GPS location tags, at least none that I could find).
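
The all-or-nothing behaviour described above (cache everything first, then write in one step) can be illustrated with a small sketch. This is only a model of the idea, not IMatch code: `geocode_all`, `lookup`, `apply_update` and `LimitExceeded` are hypothetical names.

```python
class LimitExceeded(Exception):
    """Raised by the (hypothetical) lookup when the service refuses a request."""


def geocode_all(files, lookup, apply_update):
    """Look up every file first, then write results only if all succeeded.

    files: dict mapping file name -> (lat, lng)
    lookup: callable taking (lat, lng), may raise LimitExceeded
    apply_update: callable taking (file name, data)
    """
    pending = {}
    for name, coords in files.items():
        pending[name] = lookup(coords)  # an exception here aborts everything
    # Reached only if every single lookup succeeded.
    for name, data in pending.items():
        apply_update(name, data)
    return len(pending)


if __name__ == "__main__":
    budget = {"left": 2}
    written = {}

    def lookup(coords):
        if budget["left"] == 0:
            raise LimitExceeded("hourly credits used up")
        budget["left"] -= 1
        return {"coords": coords}

    files = {"a.jpg": (47.0, 8.0), "b.jpg": (47.1, 8.1), "c.jpg": (47.2, 8.2)}
    try:
        geocode_all(files, lookup, written.__setitem__)
    except LimitExceeded:
        pass
    print(written)  # nothing was written because the third lookup failed
```

This is why a selection that runs into the limit part-way through leaves none of its files updated.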

As I said above, I will see if I can re-enable the "same coordinates" optimization, even when dealing with files which have or don't have the same coordinate configuration. This should reduce the number of requests, if somebody files a feature request.

Do you get an error message when this happens?
I assume you use a free unpaid GN account?

Jingo

Quote from: Mario on August 27, 2018, 06:49:49 PM
Do you get an error message when this happens?
I assume you use a free unpaid GN account?

Thx Mario - here is the screenshot after a failure:



Yes... free (subscribed) account... their pricing structure looks like it would cost ~$180 USD for a 10,000 credit/hr - 5,000,000 max per year contract... 

Mario

OK, at least IMatch is handling this properly.
I can try to reduce the number of calls somehow to get more mileage out of this limit.

I don't know about the cost of premium GeoNames accounts. But in general, there is no free lunch. Not with Google, Microsoft, GeoNames or one of the other services.
Users have now been 'trained' to like these services (by many years of free access) and now it's time for revenue.

Google reverse geo-coding (paid) has a limit of 80 requests per second and costs 5.00 USD per 1000 requests when you exceed your monthly free limit. Microsoft is similar.

The only "free" option for my users would be for me to set up my own reverse geo-coding service based on the free GeoNames.org data.
This would require me to learn how to do it and then to set up and maintain yet another server with MySQL etc. And pay for it every month...

Jingo

Quote from: Mario on August 27, 2018, 07:33:39 PM
OK, at least IMatch is handling this properly.
I can try to reduce the number of calls somehow to get more mileage out of this limit.

Thanks Mario.. anything you can do to reduce the call number will help... found this chart as well which provides the credit # info:



Thanks for your time.... Andy.


Mario

That's pretty steep.

IMatch uses findNearbyPlaceName to find the core data. Then astergdem to get elevation data (Altitude). Then findNearbyPostalCodes and/or findNearbyStreetsOSM to fill in data not delivered by findNearbyPlaceName. I could not find a single API entry which reliably delivers all the data needed to fill Alt, Country, State/Province, City, Location. Maybe I need to revisit their APIs and see if there is any change. I doubt it. Google needs only one or two calls to fetch all the data that's needed. Their API is more efficient.
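
To make the call sequence above concrete, here is a rough sketch of how such a chain of requests could look. The endpoint names come from the post; the `JSON` URL suffix and the `lat`, `lng` and `username` parameters follow the public GeoNames web service conventions. The field names in `REQUIRED`, the `fetch` callback and the fill-the-gaps rule are assumptions for illustration, not IMatch's actual logic.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.geonames.org/"
REQUIRED = ("country", "state", "city", "location")  # hypothetical field names


def endpoint_url(endpoint, lat, lng, username):
    """Build the URL of the JSON variant of a GeoNames endpoint."""
    query = urlencode({"lat": lat, "lng": lng, "username": username})
    return f"{BASE_URL}{endpoint}JSON?{query}"


def reverse_geocode(lat, lng, username, fetch):
    """Return (fields, requests_made).

    `fetch(url)` is supplied by the caller and must return a dict of
    already-extracted fields; parsing real responses is left out here.
    """
    fields, made = {}, 0
    # Always called: core place data and elevation.
    for endpoint in ("findNearbyPlaceName", "astergdem"):
        fields.update(fetch(endpoint_url(endpoint, lat, lng, username)))
        made += 1
    # Called only while something is still missing.
    for endpoint in ("findNearbyPostalCodes", "findNearbyStreetsOSM"):
        if all(fields.get(key) for key in REQUIRED):
            break
        extra = fetch(endpoint_url(endpoint, lat, lng, username))
        made += 1
        for key, value in extra.items():
            if not fields.get(key):  # keep what we have, fill the gaps
                fields[key] = value
    return fields, made
```

With a `fetch` that already returns all required fields from the first call, this makes 2 requests; with gaps it makes 3 or 4, which is where the per-file request count comes from.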

As I said, this was never a problem before. And GN has been used in IMatch since version 5. 2000 credits per hour gives a user between 500 and 650 files which can be reverse geo-coded per hour with the free plan, and 7500 to 10000 files per day max. You can stretch this by reverse geo-coding one file and enabling the "nearby" option.
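
The arithmetic behind these numbers, as a quick sketch. It assumes every call costs exactly one credit (GeoNames says this holds "for most services") and 3 to 4 calls per file; the function names are made up.

```python
HOURLY_CREDITS = 2000
DAILY_CREDITS = 30000


def files_per_budget(credits, calls_per_file):
    """How many whole files fit into a credit budget."""
    return credits // calls_per_file


def credits_needed(file_count, calls_per_file):
    """Credits a batch uses, assuming one credit per call."""
    return file_count * calls_per_file


if __name__ == "__main__":
    for calls in (3, 4):
        print(calls,
              files_per_budget(HOURLY_CREDITS, calls),
              files_per_budget(DAILY_CREDITS, calls))
    # The two batches from the first post: 86 + 749 = 835 files.
    print(credits_needed(86 + 749, 3))
```

At 3 calls per file the 835 files from the first post already need 2505 credits, which explains why the second batch ran into the hourly limit.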

Jingo

Thx Mario.. guess, I'll switch over to Google and see how many I can get done via their service... I suppose one can always switch back and forth in the parameters as well to get a bit more mileage out of things.  Thx again!


BTW: just to show the API efficiency difference between Google and Geonames... I was able to reverse geocode 633 images via Google in about 4 minutes... but Geonames provided an estimate of 22 minutes...  Guess it will be Google for me!

Mario

Google is faster and often has better data  - unless you need to geo-code files taken during hiking, biking or in remote areas.
When you do an "All" in the dialog, IMatch throttles calls to the services, waiting about 0.1 second after each file. After 10 files it waits for 0.5 seconds, to stay well below the maximum number of requests. This limits the number of files you can code per minute, even if the service responds in zero time (unlikely).
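
A throttle of this kind can be sketched in a few lines. The pause lengths are the ones mentioned above; whether the longer pause replaces or adds to the short one is not stated, so this sketch assumes it replaces it. `throttled` and its parameters are hypothetical names.

```python
import time


def throttled(items, per_item=0.1, per_batch=0.5, batch_size=10, sleep=time.sleep):
    """Yield items one by one, pausing briefly after each item and a bit
    longer after every `batch_size` items (assumed to replace the short pause)."""
    for index, item in enumerate(items, start=1):
        yield item
        sleep(per_batch if index % batch_size == 0 else per_item)


if __name__ == "__main__":
    pauses = []
    # Record the pauses instead of really sleeping.
    for name in throttled([f"img{i}.jpg" for i in range(20)], sleep=pauses.append):
        pass  # the lookup for `name` would happen here
    print(len(pauses), pauses.count(0.5))
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting.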

I've had a look at the code and IMatch needs 3-4 calls to get all data from GN and 2 calls to get all data from Google.
In single-mode (when you look up one file) IMatch retrieves 5 addresses which means 15 or 20 calls with GN and 10 with Google.

I have re-optimized the "All" case, reusing already retrieved data if adjacent files have the same coordinates (and also the same combination of created/dest coordinates). This reduces the number of API calls further when you process many files from the same address. The "nearby" option is now also utilized for "All" lookups, so if you do a "lookup all" with nearby and 10 files have the same coordinates, only 1 file needs to be looked up.
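
The reuse of results for adjacent files with identical coordinates could look roughly like this. It is a simplified sketch of the idea only (it does not model the "nearby" option), and `lookup_all` and `lookup` are hypothetical names, not IMatch functions.

```python
def lookup_all(files, lookup):
    """Reverse geo-code files in order, reusing the previous result when the
    next file has the same created/destination coordinates.

    files: list of (name, created_coords, dest_coords)
    lookup: callable taking (created_coords, dest_coords), returning a dict
    Returns (results by file name, number of lookups actually made).
    """
    results = {}
    last_key, last_data, made = None, None, 0
    for name, created, dest in files:
        key = (created, dest)
        if key != last_key:
            last_data = lookup(created, dest)
            last_key = key
            made += 1
        results[name] = dict(last_data)  # copy, so files never share one dict
    return results, made


if __name__ == "__main__":
    same_spot = [(f"img{i}.jpg", (47.37, 8.54), None) for i in range(10)]
    results, made = lookup_all(same_spot, lambda c, d: {"city": "Zurich"})
    print(len(results), made)
```

Each file gets its own copy of the data, so editing one file's location later cannot silently change another's.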

I think I have improved this quite a bit (for both services).

Jingo

Quote from: Mario on August 28, 2018, 03:57:28 PM
I have re-optimized the "All" case, reusing already retrieved data if adjacent files have the same coordinates (and also the same combination of created/dest coordinates). This reduces the number of API calls further when you process many files from the same address. The "nearby" option is now also utilized for "All" lookups, so if you do a "lookup all" with nearby and 10 files have the same coordinates, only 1 file needs to be looked up.

Thanks so much for taking the time to review this for us... I think the nearby option for ALL lookups will work wonders for my workflow since I tend to take a number of photos from similar locations - especially during a holiday or photo centric trip.

Appreciate the insight! - Andy.


sinus

Quote from: Mario on August 28, 2018, 03:57:28 PM
Google is faster and often has better data  - unless you need to geo-code files taken during hiking, biking or in remote areas.

Yes, I can only stress this.
Google is far better for me. With GeoNames I was not even able to add a street at the border of another city; it always picked the wrong city! With Google: no problem.
Best wishes from Switzerland! :-)
Markus

Mario

GeoNames.org is an open source project created and maintained by volunteers.
People like us. Not by a company like Google that makes billions of dollars per quarter by showing us advertisements everywhere and collecting and monetizing our data.

You can help GeoNames.org to have better data, correct wrong data or even add your home town if it is not already there.
http://www.geonames.org/manual.html

Projects like OpenStreetMap and GeoNames.org try to create open source systems and public domain content for mapping and geocoding.
Without these efforts, all this data would solely be in the hand of big corporations like Google or Microsoft - and their only goal is to make money.
Which is not bad per se, but without alternatives they would take full control over this data and that's never a good thing. Then the only way to get mapping or reverse geocoding would be to pay Google or Microsoft, forever.

jch2103

Quote from: Mario on August 31, 2018, 06:29:09 PM
You can help GeoNames.org to have better data, correct wrong data or even add your home town if it is not already there.
http://www.geonames.org/manual.html

Projects like OpenStreetMap and GeoNames.org try to create open source systems and public domain content for mapping and geocoding.

As an occasional contributor to OpenStreetMap (www.openstreetmap.org), I'd urge everyone to check out data from both groups and help them fill in any gaps in coverage. Note that OpenStreetMap includes specialized data sets such as bike paths and hiking trails.
John