IMatch 2025 AutoTagger Teaser

Started by Mario, January 25, 2025, 10:32:35 AM


Mario

While working with IMatch every day in order to find all remaining bugs and "issues", I'm also updating the help system constantly. Re-reading the AutoTagger help topic (adding keywords and descriptions with AI), I've decided to add some extra screen shots right at the beginning, to show users what to expect from AutoTagger and current AI technology.

I thought it would be nice to post the screen shots here, so users see what soon will be available to them.

I've used both Mistral AI and OpenAI for this. I've let them produce a description and a set of keywords, without any additional editing or keyword manipulation.

My prompts were:

Description: Describe this image in the style of a news caption. Use factual language.
Keywords: Return ten to fifteen keywords describing this image.


app_25155.jpg

To get a shorter description, I've changed the prompt to

Description: Return a short headline for this image.

app_25156.jpg

And, since the Mistral model was trained on many European languages, I've changed the prompt to

Description: Return a short headline in German for this image.

app_25157.jpg

All of this with the basic AutoTagger settings. IMatch 2025's ability to use custom prompts, even prompts containing IMatch variables, makes the AI integration so much more powerful and convenient.

For example, if your images already have location data, helping the AI with a prompt like

This image was taken in {File.MD.city}. Return a short headline for this image.

can improve the returned description a lot. Because the AI now receives prompts like:

This image was taken in New York. Return a short headline for this image.

and it will consider the city when figuring out the best possible description for the image.
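The variable expansion step can be sketched in a few lines of Python. This is a toy illustration, not IMatch's actual template engine; the function name and the metadata dictionary are made up:

```python
# Hypothetical sketch of expanding a prompt template containing an
# IMatch-style variable like {File.MD.city} before sending it to the AI.

def expand_prompt(template: str, metadata: dict) -> str:
    """Replace {variable} placeholders with values from the file's metadata."""
    result = template
    for key, value in metadata.items():
        result = result.replace("{" + key + "}", value)
    return result

prompt = expand_prompt(
    "This image was taken in {File.MD.city}. Return a short headline for this image.",
    {"File.MD.city": "New York"},
)
print(prompt)
# This image was taken in New York. Return a short headline for this image.
```

The AI never sees the variable itself, only the already-expanded text.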

mopperle

Thanks for those examples. Very helpful.
As I had a discussion in another forum regarding "Autotagging", many people (including me, in the past ;) ) still think that these tools can derive a location (city etc.) from the GPS data in the picture. Maybe it's worth mentioning in the IMatch help that this is not possible.

Mario

Quote from: mopperle on January 25, 2025, 12:18:57 PM
As I had a discussion in another forum regarding "Autotagging", many people (including me, in the past ;) ) still think that these tools can derive a location (city etc.) from the GPS data in the picture

That's what reverse geocoding is for.
IMatch already supports Google, HERE, Bing and OpenStreetMap for this purpose.
These are highly specialized database systems, built for exactly that purpose. And they work really well.

The image the AI gets to see usually has no metadata, because IMatch uses the thumbnail or a small 300 to 500 pixel cache image created on-the-fly in memory. This is for privacy reasons and because LLMs don't handle metadata at all.
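The scaling arithmetic behind such a small cache image is straightforward: cap the longer edge and keep the aspect ratio. A sketch in pure Python (not IMatch's actual code; the function name and the 500 px cap are illustrative):

```python
# Hypothetical sketch: compute the dimensions of a small AI preview image
# whose longer edge is capped (here at 500 px), preserving aspect ratio.
# Re-encoding the pixels at this size into a fresh JPEG carries no
# EXIF/IPTC/XMP metadata over to the copy the AI sees.

def preview_size(width: int, height: int, max_px: int = 500) -> tuple:
    """Return (width, height) scaled so the longer edge is at most max_px."""
    scale = max_px / max(width, height)
    if scale >= 1:
        return width, height          # already small enough
    return round(width * scale), round(height * scale)

print(preview_size(4000, 3000))       # a 4000x3000 photo becomes 500x375
```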

If the image shows a well-known building or tourist spot, and you ask for it, the cloud-based AIs usually return the place name and even some trivia, because they have hoovered up millions of images taken at these locations. See the examples above for the Gherkin in London or the Reichstag building in Berlin. The AIs figured that out without GPS coordinates. Or the borough in London, based on the street name on the street sign!

Providing an AI model with GPS coordinates via the prompt (easy in AutoTagger) returns either nothing or a hallucination in the form of a non-existent or completely wrong place. Unreliable, and thus useless.

But things in AI change fast. Modern models allow for "hooks" (also known as tool or function calling), which can be implemented by the "calling" software, e.g. to determine a location from GPS coordinates in the prompt by looking them up in OpenStreetMap or Google Maps, and then to incorporate the result in the returned reply. Technically more complex and more expensive, but great for special uses in corporate environments, like allowing the AI to look up stuff in a company data warehouse.
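The flow of such a hook can be sketched like this. Everything here is a toy stub for illustration: the function names are made up, and a real implementation would call a geocoding service such as OpenStreetMap's Nominatim instead of the hard-coded lookup table:

```python
# Hypothetical sketch of a tool/function-calling "hook": the calling
# software resolves GPS coordinates to a place name, and the result is
# folded into the reply the model produces.

def reverse_geocode(lat: float, lon: float) -> str:
    """Stub: map coordinates to a place name (real code would query a geo API)."""
    known = {(52.5186, 13.3762): "Reichstag building, Berlin"}
    return known.get((round(lat, 4), round(lon, 4)), "unknown location")

def answer_with_hook(prompt: str, coords: tuple) -> str:
    """Pretend tool-calling loop: resolve the location, then build the reply."""
    place = reverse_geocode(*coords)
    return f"{prompt.strip()} (location resolved via hook: {place})"

print(answer_with_hook("Return a short headline.", (52.5186, 13.3762)))
```

In a real tool-calling setup the model itself decides when to invoke the registered function; the stub above only shows the data flow.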

Maybe some day Google extends their Gemini AI or Microsoft extends Copilot with access to their respective map databases, enabling users to fetch location information by including GPS coordinates in the prompt. Who knows. I keep an eye on all that, and when there is demand, I can add Gemini and Copilot to AutoTagger later. IMatch 2025 starts with support for 5 different AI models, paid and free, which is plenty.


Quote
Maybe it's worth mentioning in the IMatch help that this is not possible.

I don't think so. Listing everything the current AIs cannot do would be a list too long to handle ;)

Every new model generation made available for public consumption is better than the ones before. The latest fee-based models can even access the Internet to look things up. Let's see what's possible in a year.