Keywords and the Thesaurus

Started by rolandgifford, January 31, 2023, 01:48:07 PM

rolandgifford

Are there any utilities/reports that will tell me any Keywords which aren't in the Thesaurus?

I know that I can build/update the Thesaurus from Keywords, I'm looking for effectively a 'Report Only' option for that utility. I don't want to actually update the Thesaurus as the ones which don't match are wrong.

I'm currently doing some Keyword housekeeping, primarily for bird species. I have created a Thesaurus tree for all current/valid species based on the IOC taxonomy. The taxonomy changes twice a year and I intend to replace that part of my Thesaurus each time it does. The imported Thesaurus has some problematic characters (ones with umlauts and the like), which I am replacing with the plain-text version of that character as I come across them.

I'm looking for something to tell me whether I have assigned one of those changed values to a photo, so that I can manually change the Keyword to match the new value in the Thesaurus. I'm not changing these Thesaurus values one at a time; I'm doing a search/replace on the import file and re-importing. Often it is 20-30 values for each character I come across.
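
The search/replace on the import file could be scripted along these lines. This is only a sketch in Python: it assumes the import file is UTF-8 text with one Thesaurus entry per line, and that the "plain text version" of a character is its base letter with the diacritic dropped (ü becomes u). The names fold_diacritics and changed_entries are made up for illustration. Letters with no decomposed form, such as ø or ß, are left untouched and would still need a manual replacement.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining marks (umlauts, accents) removed.

    Assumption: dropping the mark is the wanted "plain text" form.
    """
    # NFD splits e.g. "ü" into "u" followed by a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep every character that is not a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def changed_entries(lines):
    """Return (original, folded) pairs for every entry that folding alters."""
    pairs = []
    for line in lines:
        folded = fold_diacritics(line)
        if folded != line:
            pairs.append((line, folded))
    return pairs
```

The list returned by changed_entries doubles as a checklist: each original value in it is a Keyword that may already be assigned to photos under the old spelling.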

Mario

There is no built-in feature to figure out keywords assigned to files but not in the thesaurus.

Such a feature would have to consider group and exclusion levels and probably flattening rules, under some conditions.
This can be quite challenging and I doubt many users will ever see a need for this.

Usually the thesaurus comes first, then keywords are added. The idea is to not assign keywords not in the thesaurus, which is the idea behind a controlled vocabulary.
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

rolandgifford

Quote from: Mario on January 31, 2023, 02:43:37 PM
Usually the thesaurus comes first, then keywords are added. The idea is to not assign keywords not in the thesaurus, which is the idea behind a controlled vocabulary.

Controlled Vocabulary is what I'm implementing. My problem is that the vocabulary changes twice a year, so something which is a valid keyword now may stop being valid in six months' time. There are also the umlaut-type problems, where I change the Thesaurus after importing it and may already have used one of the changed entries without noticing.

I wasn't really expecting an option to validate existing keywords against the Thesaurus but no harm asking. Is there any way to export the keywords used to a text file? I can easily compare that against the Thesaurus also exported to a text file outside IMatch.
 

Mario

The thesaurus can be exported to XML and text.
Export it to XML first to back it up.
XML is required to export all features and properties of thesaurus nodes and to restore your thesaurus.

Export it to text.
Then import all keywords from your files into the thesaurus.
Then export it to text again, to a second file.

Make sure to press CANCEL to undo all changes. Else you will have to restore it from the XML backup.

Use a merge tool to compare the two variants.
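
The comparison in the last step can also be scripted. A minimal sketch in Python, assuming both text exports are UTF-8 files with one entry per line (the actual layout of the IMatch text export is not shown in this thread, so treat that as an assumption); the function names are hypothetical. It uses a plain set difference rather than a real merge tool, and compares entry text only, ignoring indentation, so the same name under two different branches counts as one entry.

```python
from pathlib import Path


def entries_only_in_second(before, after):
    """Return entries present in `after` but not in `before`, in file order."""
    # Entry text from the first export, with indentation and blank lines dropped.
    known = {line.strip() for line in before if line.strip()}
    extra = []
    seen = set()
    for line in after:
        entry = line.strip()
        # Report each unknown entry once, keeping the order of the second file.
        if entry and entry not in known and entry not in seen:
            seen.add(entry)
            extra.append(entry)
    return extra


def compare_exports(before_path, after_path):
    """Read two text exports and list what the keyword import added."""
    before = Path(before_path).read_text(encoding="utf-8").splitlines()
    after = Path(after_path).read_text(encoding="utf-8").splitlines()
    return entries_only_in_second(before, after)
```

Called with the export made before the keyword import and the one made after it, compare_exports returns the keywords that are assigned to files but were missing from the thesaurus.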

To learn how to export and import Thesaurus data, open the Thesaurus help topic in the IMatch help by pressing <F1> while the Thesaurus Manager is open, or search for thesaurus in the Help search box.
Then press Ctrl+F in your browser to search for export, or open the table of contents on that page.

[Attached screenshot: Image4.jpg]

rolandgifford

Quote from: Mario on January 31, 2023, 06:14:00 PM
Make sure to press CANCEL to undo all changes. Else you will have to restore it from the XML backup.

Use a merge tool to compare the two variants.

That will work; my favourite text editor will be happy with the compare. I wouldn't have thought of pressing Cancel to avoid the updates and would simply have restored the backup instead.

Thanks as always

rolandgifford

Is there an option to prevent using a Keyword which isn't in the Thesaurus? I believe that there isn't but it would be useful if there were the option and many 'would be useful' options are already there.

I have found the option which adds new manually entered Keywords to the Thesaurus and have turned that off.

Mario


Quote: Is there an option to prevent using a Keyword which isn't in the Thesaurus?

No.

If this is a concern, best avoid typing in keywords manually and just pick them from the thesaurus in the Keywords Panel.
Or use only keywords which show in the auto-suggestion list offered based on thesaurus contents.

Quote: I have found the option which adds new manually entered Keywords to the Thesaurus and have turned that off.
This feature is off by default.

rolandgifford

Quote from: Mario on February 02, 2023, 01:21:21 PM
Quote: Is there an option to prevent using a Keyword which isn't in the Thesaurus?

No.

If this is a concern, best avoid typing in keywords manually and just pick them from the thesaurus in the Keywords Panel.
Or use only keywords which show in the auto-suggestion list offered based on thesaurus contents.


I'm selecting from the auto-suggestions presented after entering part of the keyword, but I find that I occasionally add the partially typed keyword by accident ("fat fingers"/accidental mouse click). Blocking that would be useful.

Quote
Quote: I have found the option which adds new manually entered Keywords to the Thesaurus and have turned that off.
This feature is off by default.

I'd obviously turned it on at some point thinking it was a good idea :-)

Mario

So far, no other user has ever requested a feature like this.

Such an option would have to cover the Keywords Panel, but probably also users creating and using new @Keywords categories, Metadata Templates, Persons, Locations and other features which allow keywords to be added.
This could quickly become complicated, expensive and a maintenance horror.

Feel free to open a feature request in the feature request board. We'll see how many other users have the same issue and if that number is substantial, I'll think about it.

rolandgifford

Quote from: Mario on February 02, 2023, 01:44:02 PM
Feel free to open a feature request in the feature request board. We'll see how many other users have the same issue and if that number is substantial, I'll think about it.

It isn't important enough to me for you to spend time developing something. My main query was whether this option already exists and I have missed it. I know you have the general guideline that if it doesn't exist in the help it doesn't exist in the software, but searching and reading the help effectively isn't always my greatest skill.

This matters primarily in the short term, as I am migrating my existing keywords for birds (which are sometimes incorrect) to the current/correct classification. I am about two thirds of the way through, with not many hundreds of species to go, so I will have completed this significant task within the next week. Something to protect me from my own carelessness would have been useful if it already existed.

Mario

There might be a feature in IMatch 2023 which helps you with your regular thesaurus updates. I shall talk about that in the Sneak Peek board at some later time.

Damit

Quote from: rolandgifford on February 02, 2023, 01:15:35 PM
Is there an option to prevent using a Keyword which isn't in the Thesaurus? I believe that there isn't but it would be useful if there were the option and many 'would be useful' options are already there.

I have found the option which adds new manually entered Keywords to the Thesaurus and have turned that off.

Where is this option? 

I ask because I just imported some files that had keywords and somehow they made it into my Thesaurus without my manually adding them. This is quite disconcerting as I try to keep my Thesaurus and Keyword structure as clean as possible. Any ideas why the imported keywords made it into my library?

Mario

Quote: Where is this option?

There is no option in the Keywords Panel to block adding a keyword that is not in the thesaurus.
Usually users pick keywords from the thesaurus and add new keywords not in the thesaurus only if they have to. Anything else would be pretty limiting. You are in full control of the keywords you add.

Quote: I ask because I just imported some files that had keywords and somehow they made it into my Thesaurus

I don't know how this could happen. You can, via the Thesaurus Manager, import keywords from your files into the Thesaurus.
You can also enable the option in the Keywords Panel to add new keywords to the thesaurus. Did you enable this option? See Configuring the Keywords Panel. But this option only affects keywords added in the Keywords Panel. IMatch does not copy keywords from newly imported files into the Thesaurus.

Damit

Quote from: Mario on February 24, 2023, 11:03:30 PM
You can, via the Thesaurus Manager, import keywords from your files into the Thesaurus.
You can also enable the option in the Keywords Panel to add new keywords to the thesaurus. Did you enable this option? See Configuring the Keywords Panel. But this option only affects keywords added in the Keywords Panel. IMatch does not copy keywords from newly imported files into the Thesaurus.

Yes, the option was enabled.  I don't know how or when, but I have disabled it. I really don't know how they made it in the thesaurus. Maybe when you click on the file, the keywords populate in the keyword panel and then to the thesaurus if I had the aforementioned option enabled?  I just added a folder with some pics that had keywords and all of a sudden the keywords were in my thesaurus when I opened it.  As you said, the user should be in full control, I fully agree.

Keyword management has been full of caveats for me.  I will read the sections yet again and try to figure this out along with how to move keywords already recorded when you move the location of the keyword in the thesaurus.  Unfortunately life happened for 2 months and I was taken away from my recreation work in IMatch. I must re-familiarize myself. Thanks, as always, for your response!

Mario


Quote: Maybe when you click on the file, the keywords populate in the keyword panel and then to the thesaurus if I had the aforementioned option enabled?
No. When this option is enabled and you add new keywords in the keywords panel, they are added to the thesaurus if they don't already exist.

Quote: Keyword management has been full of caveats for me.
It's usually pretty easy. Set up the thesaurus. Use the thesaurus. Occasionally, add new keywords to the thesaurus.
Unless you do complex or unusual things, that's basically it.
My personal thesaurus was created with IMatch 5 in 2015 and has been used and maintained ever since.


Quote: to figure this out along with how to move keywords already recorded when you move the location of the keyword in the thesaurus.
To rename or move keywords in files, use the @Keywords category hierarchy. Adapt your thesaurus manually as needed afterwards.

Damit

Thank you for your input. It should be easy, but I have gone through some growing pains, mostly due to user error. I am trying, though!

I re-read the @Keywords section and now realize that this is where I want to make all the changes to my keywords. It is great that all the moving and renaming you do there is reflected in the keywords of the files therein. For some reason I was trying to do this through the Thesaurus. Now I have to rethink whether I want categories with zero files displayed in the @Keywords category, as enabling this would let me see the whole structure at once but might be cumbersome.

Are the changes made in @keywords reflected in the thesaurus? I would think not, as the @keywords lists all keywords, not just those in your Thesaurus.

How would one achieve synchronization between the changes made with @keywords, and the Thesaurus? By manually changing both data sets, or possibly only changes made in @keywords for items already in the Thesaurus are reflected therein?

thrinn

Quote from: Damit on February 25, 2023, 08:52:29 PM
Are the changes made in @keywords reflected in the thesaurus? I would think not, as the @keywords lists all keywords, not just those in your Thesaurus.

No, they are not. See the last answer in Mario's post. @Keywords are only linked to the keywords in your files. They are not linked to the Thesaurus anymore. There are some special advanced options for Thesaurus entries (e.g. Links) that cannot be "mirrored" in the comparatively simple XMP keywords. The Thesaurus offers many more possibilities than keywords (and is, by the way, not restricted to the hierarchical keywords metadata).

This also means that there is no automatic "synchronization". You can fill the Thesaurus from data in your files, but I would recommend using this only for a first-time setup. Once you have created a carefully maintained Thesaurus, you don't want to pollute it with inconsistent data from your files. I am speaking from experience...
Thorsten
Win 10 / 64, IMatch 2018, IMA

Damit

Thanks Thrinn!

I appreciate your input. OK, so I guess I must be meticulous and if I move something in @keywords, I must always manually change the thesaurus to reflect that change.

I guess your last sentence indicates that you, too, have had trouble keeping your Thesaurus clean. I guess that is why it is good to keep lots of backups and to back up often, especially after changes.

I am still concerned about how those keywords made it to my thesaurus. I think I will keep another database solely for newly imported files so I can strip them of any keywords before importing them into IM. I will also try to do some tests so I can replicate what occurred.

Lastly, I cannot get the @keyword category to show the empty categories in my Thesaurus, even after refreshing the data-driven categories, doing a database maintenance and restarting IM.  Obviously I cannot drag and drop if they do not populate.  I am not sure why this is occurring.

Mario

Quote from: Damit on February 25, 2023, 10:47:02 PM
Lastly, I cannot get the @keyword category to show the empty categories in my Thesaurus, even after refreshing the data-driven categories, doing a database maintenance and restarting IM.

@Keywords is created from the actual keywords in your files. Dynamically. Independent from your thesaurus. Even if you have no thesaurus at all.

It does not know about empty groups or exclude levels or synonyms or links etc. you use in your thesaurus for keywords.
If you have 100 keywords in your thesaurus but never assigned them to a file, @Keywords does not know about them.

If you use group level keywords, they won't become part of the keywords stored in the files (that's their purpose, for organization only). Since group level keywords don't go into the file, they will never show in @Keywords either.

The thesaurus stores text for all metadata tags so the user can create lists of pre-made content to quickly access in the Metadata Panel and Keywords panel.

sinus

Quote from: Damit on February 25, 2023, 10:47:02 PM...
I think I will keep another database solely for newly imported files so I can strip them of any keywords before importing them into IM. I will also try to do some tests so I can replicate what occurred.

I can remember that when I created my thesaurus (loooong ago), I first worked in a test database with only 2-10 images, to understand the system behind it. I could see what happens if I delete a keyword category, or add a thesaurus entry, and so on.

And btw, do not forget, you can import and export the thesaurus.
And you can change/create the thesaurus list with a text-editor, this is what I have done at first, to create a good thesaurus for me.

Best wishes from Switzerland! :-)
Markus

Damit

Thank you, Mario, for clarifying. I see now that it is based on the files, not the Thesaurus, which explains why keywords created in the Thesaurus but not yet assigned to any files are not showing in the @Keywords listing.

This makes it a bit more difficult to manipulate and drag and drop, as the category will not be present in the tree. I have to make sure the names are the same and that I assign and delete keywords appropriately in the keyword and category panels, but I am getting the hang of it.

Sinus makes good points. I have spent time and money developing a good thesaurus, and some things are still astray, but thanks to Mario's clarification, the way things work in IM is starting to coalesce in my mind.