Share Examples?

Started by Mario, March 10, 2025, 11:09:12 AM


Mario

It would be great to see some examples of results you produced with AutoTagger, including

  • a small version of the image
  • the AI used
  • your prompt
  • your results (keywords, description, landmarks, traits)

Most people, myself included, learn best from examples, so if you are satisfied with your AutoTagger results, let others know how you did it.
Note: Please do not upload images of people, or blur their faces before uploading.

Jingo

Great idea!

Here is mine: 
OM-1_03-01-P3015117.jpg

OpenAI Prompts:

Description prompt:

[[-c-]] Describe this image in a style to make it easily searchable. Use simple English, common words, factual language, and simple sentences.
Avoid describing anything not directly observable from the image.
{File.MD.city|hasvalue:This image was taken in {File.MD.city},}{File.MD.country|hasvalue: {File.MD.country}.}

Keyword prompt:

[[-c-]] Return ten to fifteen keywords describing this image.
{File.MD.hierarchicalkeywords|hasvalue:This image has the following keywords: {File.MD.hierarchicalkeywords}.}
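
(For anyone new to variables: the second line only adds text when the file already has keywords. Assuming a file that already carries the hierarchical keyword Animals|Bird, a made-up value just for illustration, the variable expands so the prompt ends with: "This image has the following keywords: Animals|Bird." For a file without keywords, the hasvalue part outputs nothing and only the first sentence is sent.)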


Results:

AI Description:
This image shows a white-throated sparrow on the ground among leaves. The bird has brown and gray feathers, with a distinctive striped head. It is in a natural outdoor setting, surrounded by fallen leaves. The sparrow appears to be foraging for food.

AI Keywords:
AI|American Sparrow; AI|Brown Feathers; AI|Foraging; AI|Ground; AI|Songbird; AI|Striped Head; Animals|Bird; Animals|Bird|White-throated Sparrow; colorful; Habitat; Objects|Leaves; outdoors; Places|Nature; small bird; wildlife

I haven't yet set up 1:1 keyword mappings for some of the returned hierarchical keywords, but will be doing that next.

Stenis

Thanks for your input, Jingo, and good to have an explicit topic for prompt examples, Mario. I'll contribute a little later.

One thing I didn't understand is this:

{File.MD.city|hasvalue:This image was taken in {File.MD.city},}{File.MD.country|hasvalue: {File.MD.country}.}

Were there no data in these "Location" elements of yours?


Mario

Quote from: Stenis on March 10, 2025, 03:08:00 PM
One thing I didn't understand is this:

{File.MD.city|hasvalue:This image was taken in {File.MD.city},}{File.MD.country|hasvalue: {File.MD.country}.}

Were there no data in these "Location" elements of yours?
This variable checks if the city tag has a value. If yes, it outputs something like "This image was taken in London, United Kingdom". If there is no city value, the variable outputs nothing.

Not all files in a database may have location data when the prompt runs. The hasvalue check ensures that the phrase is only added to the prompt when there is data.
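
A minimal worked example, assuming a file tagged with the made-up values city "London" and country "United Kingdom":

{File.MD.city|hasvalue:This image was taken in {File.MD.city},}{File.MD.country|hasvalue: {File.MD.country}.}

expands to "This image was taken in London, United Kingdom." For a file with neither tag filled, both hasvalue parts output nothing and the sentence is simply omitted from the prompt.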

Stenis

Thanks for your reply, Mario.

Below you can see the basic "static configuration" I have landed on for now, and a few F7 prompt examples.



  • I now use all three static prompts ("Description", "Keyword" and "Landmark") and never change that text.
  • Landmarks, if found, are merged into the "Description" data with a variable.
  • I always use flat keywords.
  • I even think the "Landmark" prompt works surprisingly well :-)

One other important part of my workflow is that I still use PhotoMechanic to batch-update about 20 data elements in IMatch.
For me it is much faster than doing it in IMatch.
AutoTagger updates the "Description" and "Keyword" elements with AI.
Compared to my earlier process this is much faster and far more efficient, because I don't have to write descriptions manually. It is like a weight has fallen from my shoulders, and the work has become so much easier.

I just have to read the ready-made texts now and sometimes adjust a few things, but most of the time the result is more than good enough as it is.
I'm also very happy with the job OpenAI is doing there in the background.
What a change this has brought to my processes!
It feels like getting a whole new photo-archiving life!

Fantastic job you have done, Mario!
I was afraid the system would feel like a beta and not be flexible enough, but it really is very well engineered and solid already.
Frankly, I did not really expect that, and I am glad to be proven wrong.



The Description prompt (static)

[[-c-]]
Always prioritize geographic data, the year, and people's names.
Always write text inside quotes but without the quote signs.
Describe this image in a style to make it easily searchable.
Use simple English, common words, factual language, and simple sentences.
Avoid describing anything not directly observable from the image.
Write the geographic data and year at the beginning of the text, together with names.



The Keyword prompt (static) (assumption: only non-hierarchical keywords)

[[-c-]]
Keywords never in plural form
Return five to seven keywords describing this image.
Use simple English, common words, factual language.
Max one word in each keyword.
No geographic data or time info as keywords.
No words with only capital letters.


The Landmark prompt (static) (assumption: merge with existing target (Description) values)

[[-c-]] Return the names of known landmarks and tourist spots in this image to {File.MD.description}
If you cannot detect any landmarks or tourist spots, return ''.
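
(To illustrate the merge: {File.MD.description} expands to the file's current Description, so the model sees the existing text and can return the landmark names merged into it. With a made-up Description of "Street market at the harbor", for example, the AI receives: Return the names of known landmarks and tourist spots in this image to Street market at the harbor.)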



The F7 prompt (ad hoc, non-static)

Template

Keywords never in plural form
Always write text inside quotes but without the quote signs.

Mandatory text: "Place Country Year" - "motif"
In foreground "Name"


Examples


1.

Keywords never in plural form
Always write text inside quotes but without the quote signs.

Mandatory text: Essaouira Morocco 2025 -
"Essaouira Fishing Harbor"

2.

Keywords never in plural form
Always write text inside quotes but without the quote signs.

Mandatory text: Eze Sur Mer Provence France 2012 -
"Bye the sea"
Ingrid Sändh in the foreground



Mario

To attach files, use the Attach command available below the full post editor. If you use Quick Reply instead, click "Preview" once to see all controls for adding attachments.
Copying and pasting images from somewhere does not work.

Stenis

Using OpenAI GPT-4o instead of GPT-4o-mini with AutoTagger is a completely different thing!!!

Earlier I complained about my pretty poor results using GPT-4o-mini and Gemma 3 4B for figuring out which species appear in my pictures. They both struggled to decide which antelope or bird was shown. GPT-4o-mini got a bit better after I tweaked the prompts and added the names of the animals, and then it often even managed to fetch the Latin names, but by that point the whole idea of speeding up these workflows was jeopardized, I thought.

Then I saw that I also had the option of using GPT-4o. I tried it, and my jaw dropped. It added the species name, the Latin name, and the family the animal belongs to almost 100% of the time, and that was the case for mammals, birds, and reptiles alike.

Topi.png

These were my prompt texts:

Mandatory: Write the species, its Latin name, and the family of species it belongs to into Description.
Write the species and family name into Keywords.


These were the results:

Description: A group of Damaliscus lunatus, commonly known as topi, graze in the savanna. The topi, part of the Bovidae family, is characterized by its reddish-brown coat and distinctive dark markings. The scene features scattered trees and open grassland, typical of the African savanna.

Keywords: Antelope, Grass, Savanna, Topi, Tree, Wildlife


dikdik.png


Here is another example, with the same prompt as in the first example above:
Neither Gemma 3 4B nor GPT-4o-mini managed to figure out this picture of the really small dik-dik antelope.

This is absolutely better than I expected. So, when I tag all my safari pictures, I will definitely use the larger OpenAI model. I have quite a few, since I have been to East and South Africa eight times in total (the first trip already in 1972), and I also have safari pictures from both Sri Lanka and India to take care of. So, OpenAI GPT-4o will help a lot. If it solves these problems as effectively as I have seen now, I don't really care about the cost, because what I gain is time and effort that I don't really have at my age. From what I have seen, it also hallucinates less than the two smaller models.

I've really struggled to get a handle on all of this before, and no matter how effective PhotoMechanic has been, even in my hands, two data fields have always remained: Descriptions and Keywords. On the other hand, a picture library just gets better and more useful the more metadata you manage to add to the pictures; that is the positive and motivating side of it.

There is a new OpenAI model coming soon (version 4.5, I think) that might be even better. It seems there is a real race now. Will we get access to that one too in the future?

Stenis

kostnad openAI.png

The larger model seems to cost about 16 times more to use.

Stenis

I wonder if a new trend is building where software companies try to squeeze a lot more out of us than before. Look at the example below.

On top of all the other sharpening models in Topaz AI, Topaz has now created a Super Focus function that processes our pictures in the cloud, and using it seems to cost a lot.

If I look at the list below:
Using Super Focus V2 on one of my 33 MP pictures costs 14 points per picture, and their recommended purchase is 400 points for US$39.99. At that rate, the 400 points cover 28 pictures, about US$1.43 per picture. I tested it and, luckily enough, the results on my pictures are better with the non-cloud models, but the trend is clear, I think: these companies will keep trying to charge us even for processing time in the cloud.

Compared to that, AI companies like OpenAI not only give us a number of options to choose between, but several of those options are very reasonably priced.


Topaz Sharpen cloud.png

Mario

Your point?

"Better" models cost more money. You rent "compute" and when the model is more expensive to compute, you pay more.

The new beta Google Gemini 2.5 model costs $10 per million output tokens, while Gemini 2.0 Flash-Lite costs $0.30, making the 2.5 model about 33 times more expensive.

If the free Gemma 3 model does not do what you need, AutoTagger offers OpenAI with two models (cheaper and more expensive), Mistral (two models, cheaper and more expensive) and soon Google Gemini (cheaper and more expensive models).
Thankfully, you have IMatch, which gives you many options and the freedom of choice.

Stenis

Yes, you have done a wonderful job, Mario, giving IMatch all that flexibility. It is really good that we have the possibility to "step up" when necessary, and my examples show very clearly, at least for me, that it sometimes is necessary, at least if you value efficiency and productivity more than the absolutely cheapest solution.

On the other hand, historically it has rarely been the most technically advanced software or processes that win. It is mostly the "good enough" that takes home the victory and the money.

For me, the small example with the African animals makes it very clear that a 16-times-higher cost will be worth it in that particular case, for several reasons, and maybe for others as well. The more expensive model might be what it takes to get many thousands of safari pictures done at all. With cheaper alternatives, the job might be so cumbersome that it never gets finished.

With OpenAI's mini model I might need to spend much more time adjusting my prompts and rely on a trial-and-error workflow, instead of just letting GPT-4o do the whole job with one single prompt. There is a Swedish word, "dumsnålhet" (I'm sure you have one in German too), which describes embracing a combination of stinginess and stupidity out of obsession with the money one pays for a service.

It could have been the same with the Topaz example, but the similarities end there today, because that cloud service just isn't good enough yet, at least for me. In the tests I have done it still doesn't deliver; it mostly produces terrible artifacts. But say you have a once-in-a-lifetime shot, and it had delivered, turning a technically mediocre image into something close to a fantastic shot for, in that case, 70 cents? I could have been the market for that, because I do have some historical images that I know are unique, taken at the Saur Revolution, when it all started in Afghanistan, leading to an over 40-year-long period of war.

Afghanistan1.png

Stenis

Sorry, there was no more space for pictures above.

Afhanistan 2.png

I took that picture around noon on one of the last days of April 1978. Thirty minutes later I had taken my pictures and was sitting in the Khyber Restaurant, which lies on the same gigantic roundabout, waiting for my lunch. First they started to shoot with Kalashnikovs, and after maybe ten minutes they fired the first cannon shot from one of the T-55 tanks parked just 10-15 meters from the restaurant. It was parked with its rear against the restaurant entrance, and the shock wave was enormous. If it hadn't been for the decimeter-wide aluminum blinds in the windows, we might all have been blinded by the glass that flew into the restaurant. The rest is history.

I have quite a few color slides that might need something more powerful than the local sharpening models in Topaz, and I know that many other older photographers have the same problem.

Stenis

I will test how well Mistral performs too. You never know what the EU will run into with the US, and how that will affect the American AI companies.

I would rather buy a German DAM like IMatch than PhotoMechanic, especially since it is superior to PM anyway, the way I use them. I ditched Lightroom many years ago because its image quality was inferior to DxO Optics Pro / DxO PhotoLab, and nothing else can compete with the Danish Capture One when it comes to tethering. ...and I would rather use a European AI system, if it performs decently, before an American one, or a Chinese one like DeepSeek, if that were a real option.

So what is your opinion of Mistral, Mario?

Mario

Quote from: Stenis on April 08, 2025, 07:02:36 PM
So what is your opinion of Mistral, Mario?
Give it a try. I have written what there is to know in the help: Mistral AI

Stenis

I will, thanks! ...even if OpenAI GPT-4o really impressed me when it comes to identifying animals and writing useful descriptions with a minimum of prompting effort.


Stenis

Yesterday I processed a lot of different animals with OpenAI GPT-4o, the bigger model, and it missed only once, with a serval cat half hidden in the tall savanna grass that GPT-4o mistook for a cheetah.

I have just moved this from the Ad Hoc prompt (F7) into the Description prompt:
 
Write the animal species, its Latin name, and the family of species it belongs to into Description.
Write the animal species and family names into Keywords.

With that, I almost never need to do anything other than press Run.
It is very, very effective.