AutoTagger Prompt Engineering

Started by monochrome, February 06, 2025, 11:31:51 PM


monochrome

I've run about 10k images through the AutoTagger, and while the results are good, three things make it less useful than it could be. This thread is an attempt to fix that by developing good prompts for the AI. (Note: I'm using OpenAI.)
  • It hallucinates. The AI makes up context that isn't there. For example, it assumes that any adult near any child is the child's parent. Sometimes right, mostly wrong. Also, it has an inner art critic.
  • Not very searchable language. I want "flowers" not "floral composition", so I can grunt something simple into the search bar and get results.
  • It does not use image metadata (time taken, GPS, people tagged) in an intelligent way.
I'll start off with a prompt aimed at points 1 and 2 (changes from the default in bold):

Quote: [[-c-]] Describe this image in a style to make it easily searchable. Use simple English, common words, factual language, and simple sentences. Avoid describing anything not directly observable from the image.

This turned:

Quote: A mother and her two young daughters are seen outdoors, kneeling beside a flower arrangement in a residential area. The children, dressed in bright, colorful clothing, appear engaged with the plants as they explore their surroundings.

(where the family relations are completely fabricated and what the children are doing is made up) into:

Quote: This image shows a woman and two young children near a flower cart. They are looking at colorful flowers. The setting is a sidewalk with a building in the background. The area appears peaceful and has some greenery.

And:
Quote: A close-up view of a still life arrangement featuring a variety of colorful flowers intertwined with clusters of dark blue berries, showcasing the beauty of nature's bounty.

Into:
Quote: This image shows a close-up of various flowers and fruits. There are purple blueberries and clusters of small, vibrant flowers. The flowers include a yellow flower, a pink flower, and a white flower with a yellow center. The background is green, indicating a natural setting.
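
To spot-check a large batch for the two problems above (invented family relations and "art critic" wording), something like the following could help. This is only a minimal sketch in Python, assuming the generated descriptions have been exported as plain strings. The word list and the function name `flag_unverifiable` are hypothetical illustrations and are not part of IMatch or the OpenAI API.

```python
import re

# Hypothetical word list: terms that assert a relationship or a judgement
# that the image alone cannot prove. Extend it to match your own collection.
UNVERIFIABLE_TERMS = {
    "mother", "father", "daughter", "daughters", "son", "sons",
    "parent", "parents", "family", "beauty", "bounty", "showcasing",
}

def flag_unverifiable(description: str) -> list[str]:
    """Return the unverifiable terms found in a description, sorted."""
    # Lowercase the text and split it into plain alphabetic words.
    words = set(re.findall(r"[a-z]+", description.lower()))
    return sorted(words & UNVERIFIABLE_TERMS)

before = "A mother and her two young daughters are seen outdoors."
after = "This image shows a woman and two young children near a flower cart."
print(flag_unverifiable(before))
print(flag_unverifiable(after))
```

A description that comes back with an empty list is not guaranteed to be free of hallucinations; the check only catches the words you put on the list.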

Mario

Very good, thanks for sharing.

Stenis

Brilliant, thanks a lot.
I'm also using OpenAI.
I have also seen that some pictures need some extra input.

It is very good that we can combine general input like yours in the Description and Keyword elements of the AutoTagger setup with more specific input in the AutoTagger dialog window.

Stenis

Thanks again for your input, monochrome. It helped a lot: the Description texts became clearly more precise and free of hallucinations.

What I did myself to get a better set of keywords was to add that I don't want place info or info about the year a picture was taken.

In Preferences\AutoTagger I added:

[[-c-]] Return five to ten keywords describing this image.
Use simple English, common words, factual language.
Don't save place data or time info. (It helped a lot.)


Stenis

The more I use iMatch and AutoTagger, the more I think this is a real game changer that will take metadata editing to a whole new level.
AutoTagger and OpenAI have great potential, being flexible enough to meet almost all my expectations and demands.
On the whole it is nothing but fantastic.
With the combination of general and specific prompting I use in AutoTagger, the quality of the results is far more than good enough for me now.

iMatch and AutoTagger are a great relief for me, since adding metadata to thousands of images manually is no small feat, even with efficient non-AI tools like PhotoMechanic.
I value my time a lot (I just turned 75), and who knows, I might even live longer if I no longer need to spend as much time on metadata management as I have done.
... and I will have far more fun too.

I will still have some use for PhotoMechanic too, because PM is still the most efficient tool for batch-adding basic static data. I see the two as complementary.
iMatch and PhotoMechanic work perfectly together now that I have extended the Default layout in iMatch with what I was missing.
So I can definitely say now that I'll migrate to iMatch when my 30-day copy runs out and use it as my main DAM.

I'm also looking forward to using iMatch for my Office documents and my PDF files.
I have long tried to convince Camera Bits to add support for text files as well, but without any success.
Now I can let that go, I guess, and focus on the job that needs to be done.

Exciting time it is! I love it, and it is very nice to learn a few new efficient tricks!

iMatch is one more example proving that software is often far more interesting to me these days than new cameras and lenses.
... and it gives so much more for the money I spend than more hardware I don't really need.
Strangely enough, a majority of photojournalists and photo enthusiasts seem to overlook the power of great software.

Mario


Quote: I will still have some use for PhotoMechanic too, because PM is still the most efficient tool for batch-adding basic static data. I see the two as complementary.
Have you worked with Metadata Templates and especially AutoFill yet?

Running a Metadata Template when new (or updated) files are added to the database usually takes care of the static data like copyright notices, legal notices, artist/author etc.

And with AutoFill you can fill any number of tags with any amount of data very quickly.

Quote: So I can definitely say now that I'll migrate to iMatch when my 30-day copy runs out and use it as my main DAM.
I'm happy to hear that IMatch does what you need. GIMME your money ;)


Quote: Strangely enough, a majority of photojournalists and photo enthusiasts seem to overlook the power of great software.
The problem is marketing and awareness. Adobe spent US$2 billion (!) in 2023 on marketing, sales and endorsements.
Bidding for any DAM-related keyword on Google is insanely expensive, because many big companies bid against you. Unaffordable for me.

When I write to a blogger or photo web site and ask nicely whether they might have a look at IMatch or review it, I often get no answer, or just a price list for ads on their site.

Sending out one (!) press release through a renowned press service can now cost up to $1,000, and the results are meager. They typically send out 30,000 press releases per day, and a press release for IMatch simply gets drowned out.

I do my best to get good search engine rankings. In many cases, when users with the right profile (actually in need of a good DAM and not a toy) try out IMatch, they purchase a license.


Quote: With the combination of general and specific prompting I use in AutoTagger, the quality of the results is far more than good enough for me now.
It might not be perfect or always right. But it's much faster than writing descriptions and keywords oneself. And we can always read and improve them when there is time. But using AI makes our images organized and searchable from day one.

Stenis

I made an update.
This seems to work for generating flat, non-structured keywords in a form that suits me.

[[-c-]] Return five to ten keywords describing this image.
Use simple English, common words, factual language.
Only single-word keywords.
Singular form.
No geographic data or time info as keywords.
Never add keywords with only capital letters.
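
The AI will not always obey every rule, so the mechanical rules in the prompt above can also be enforced afterwards. The following is a minimal sketch in Python, assuming the returned keywords are available as a list of strings. The function name `filter_keywords` is hypothetical and not an IMatch feature. The "singular form" and "no geographic data" rules would need a dictionary or place list and are deliberately not covered here.

```python
import re

def filter_keywords(keywords: list[str]) -> list[str]:
    """Keep only keywords that pass the mechanical rules of the prompt."""
    kept: list[str] = []
    for raw in keywords:
        word = raw.strip()
        if not word or " " in word:
            continue  # rule: single-word keywords only
        if word.isupper():
            continue  # rule: no keywords written only in capital letters
        if re.fullmatch(r"(19|20)\d{2}", word):
            continue  # rule: no time info (catches four-digit years only)
        if word.lower() in (k.lower() for k in kept):
            continue  # drop case-insensitive duplicates
        kept.append(word)
    return kept

print(filter_keywords(["flower", "NYC", "flower cart", "2024", "Flower", "child"]))
```

The count rule (five to ten keywords) is left out on purpose, since filtering may legitimately leave fewer than five.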

Stenis

I have looked into using Layout Templates, and that will certainly work, but I see a problem with it compared to what I'm used to from PM. In PM I design a form with all the elements I need (one form for a single record at a time and one for batching). It is much easier to get a quick overview in PM than in iMatch, which makes it much easier to avoid mistakes. Adding variables is as simple as putting the IPTC names in brackets, like {event} {country} {location}, in a field and running it. No scripting needed at all.

It is also possible to save the whole template as a so-called Snapshot, with all the fields included but only some of them active (via checkboxes). That is a very effective way to work, and it suits me very well. So from what I have seen, I prefer to work the way PM Plus lets me. I think I would make more update mistakes with the iMatch design. I'm sure some people prefer the way iMatch is designed, but for now I'm not convinced. The way I do it in PM has served me very well for more than four years. I have to study AutoFill more closely in the documentation.

Don't worry, Mario, I will throw my money at you in a few weeks!
It will be the best money I have spent on photo gear in years, I think.

I'm very impressed by iMatch, and it keeps growing in front of my eyes the more I see of it.
.... and I'm very impressed that you seem to do most or all of it by yourself, and so successfully that you can give any of the competition a very hard match and be many users' first choice today. That is impressive. The only comparable example I can recall is Hamrick's VueScan, which is practically a standard application for scanning today and far better than the crap the scanner manufacturers (Epson, for example) have managed to come up with.

I think it is a shame that reviewers of photo gear overlook all the magnificent photo software out there and focus entirely on hardware. Maybe the reason is that it is more demanding to review complex software like iMatch than most hardware. The dominance of Google and its algorithms is also sad, and there is very little to do about it in the short run, but I see very substantial support for iMatch outside the Adobe sphere. I have seen surprisingly strong support among DXO Photolab enthusiasts and other users.

I left the Adobe sphere more than 10 years ago, and today I also like to support the European software industry. I use Capture One (which is Danish, with some Swedish capital today), DXO Photolab, which is French, and your iMatch, which is German. It is also nice to be able to say that I have chosen these three not just because they are European but because I think they are the best in their respective market niches, for me and for many others.

I don't know how you do it; it is hard to believe that your days have just 24 hours. You manage both to develop top-notch software and to provide support that is even better than what Camera Bits provides for PhotoMechanic.

Good night! :-)

sinus

Stenis, 
thanks for your prompts.

Very helpful. 
Best wishes from Switzerland! :-)
Markus

Mario

Quote: I have looked into using Layout Templates, and that will certainly work, but I see a problem with it compared to what I'm used to from PM. In PM I design a form with all the elements I need (one form for a single record at a time and one for batching). It is much easier to get a quick overview in PM than in iMatch, which makes it much easier to avoid mistakes. Adding variables is as simple as putting the IPTC names in brackets, like {event} {country} {location}, in a field and running it. No scripting needed at all.
I haven't used PM in a decade, so I don't really follow what you mean by forms and variables and stuff.
Maybe show me a screen shot or a link to some documentation. I'm pretty sure IMatch has comparable features.

The feature in IMatch for automatically filling in metadata is Metadata Templates. They can add metadata, change metadata, delete metadata, copy metadata, assign collections and categories. Very powerful.

You can configure IMatch to apply Metadata Templates to new files, or updated files, which usually automates most of the initial tagging and captioning work.

You can also apply Metadata Templates via Favorites and run them by clicking them in the Favorites Panel or via a keyboard shortcut. Or add them to the File Window ribbon for even quicker access.

To maintain large chunks of static data (think: Copyright & Legal, Author, Complete Location records, Darwin Core metadata sets, ...) AutoFill is unbeatable. Without ever leaving the Metadata Panel you can fill the entire location block, keywords, Artist info, and large chunks of other metadata with minimal effort.

I think you also hijacked another thread with a post about how Copy & Paste works in the Metadata Panel. (I have moved your unrelated post to a new topic where it belongs: https://www.photools.com/community/index.php/topic,14940.msg104605.html#msg104605)

The Metadata Panel offers very sophisticated features to copy metadata between files in a controlled fashion. And you can create any number of Metadata Panel layouts (PM forms?) to fit different stages in your workflow; see Metadata Panel Layouts. It even supports the complex structures in modern IPTC metadata.

Stenis

The problem with the iMatch way of doing it in the interface is that I can't get an overview of the whole "Default" layout I have copied, and the data it currently contains, without clicking on each and every one of the data elements.

In PM the whole structure is unfolded and open, so I can see directly what will and will not be updated if I run it. Not so in iMatch. Even with the transparency in PM, I have made mistakes, and I just don't feel comfortable with how it looks in iMatch, as far as I have understood how it works.

I have no problem with using PM tools if they suit me better. I generally use DXO Photolab because of the superior image quality, but PL is just sooo ineffective and cumbersome when more detailed work is needed. Then Capture One is a far better choice with its smart, effective AI-driven masking tools. A job I might spend an hour on in PL can take just a few seconds or a minute in Capture One.

I'm very pragmatic about the tools I use. I guess I'm pretty damaged by decades of working on improving IT workflows as a developer. Using PhotoMechanic as a photo DAM has been no more complicated than using the pretty simple PictureLibrary in Photolab. When scaling is needed, interoperability gets really important. PM and PL also integrate totally seamlessly, in a way that Capture One and other converters with their own picture import processes can't.

Today Photolab is pretty unique in working directly on the files in the file system. In some respects iMatch cannot integrate as seamlessly with PL as PM does. In fact, I personally initiated that integration by asking Kirk Baker at Camera Bits to help us solve the integration problems we had earlier, and it has saved us a lot of time.


Mario


Quote: The problem with the iMatch way of doing it in the interface is that I can't get an overview of the whole "Default" layout I have copied, and the data it currently contains, without clicking on each and every one of the data elements.
I don't understand this. What do you mean by Default layout or data element? A screenshot perhaps?

Of course use whatever tool works best for you. I just want to make sure you're not missing some features in IMatch which would save you time.