Importing Custom Thesaurus

Started by Pickwick Kelly, May 16, 2022, 10:41:30 PM

Pickwick Kelly

Hi. I'm trying to import my custom thesaurus into IMatch. I don't understand how to get my data into the right file format for import.

My data is a comma-delimited Excel sheet. Column 1 holds the destination word (the one populated in the images' metadata), and column 2 holds a list of comma-separated synonyms (including the word in column 1).

The Help section says it needs to be either the native XML-based IMTHS format (.imths file extension), an IPTC thesaurus created in IMatch 3 (.dat file extension), or an IMatch 3 category schema (.imcs extension).

I don't understand what I need to do to convert my data into a format that can be imported.

Can anyone offer any advice please? I would be very appreciative.

jch2103

The Help does also discuss importing text data: https://www.photools.com/help/imatch/thes_basics.htm?dl=h-41

You'll need to format your data into a csv text file. Unfortunately, it looks like the link in Importing Text Format may be broken. Mario may need to fix this.

The format looks like this:

[2 - "What"]
[Nouns]
animals
{animal}
{wild life}
{wildlife}
[1- number]
flock
herd
mates
amphibians
frog
horned frog
newt
salamander
spadefoot toad
bird
{birds}
{feather}
{feathers}
barking owl
barn owl
budgerigar
chicken
cockatiel
cockatoo
condor
dove
{doves}
{pigeon}
duck
eagle
emu
falcon
flamingo
geese
hummingbird


Hope this helps. Please let us know if you have more questions.
John
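
A minimal sketch of the conversion asked about at the top of the thread, assuming the layout suggested by the example above: each keyword on its own line, followed by its synonyms in curly braces. The tab indentation, the function name and the sample data are assumptions for illustration, not something confirmed by the IMatch help, so the output is worth checking against a text file exported from IMatch itself before importing.

```python
import csv
import io


def rows_to_thesaurus_text(rows, indent="\t"):
    """Turn (keyword, "syn1, syn2, ...") rows into keyword / {synonym} lines.

    The indent before each {synonym} line is an assumption; adjust it to
    match whatever an IMatch text export actually produces.
    """
    lines = []
    for row in rows:
        if not row or not row[0].strip():
            continue  # skip empty rows
        keyword = row[0].strip()
        synonyms_field = row[1] if len(row) > 1 else ""
        seen = {keyword.lower()}  # column 2 repeats the keyword, so drop it
        lines.append(keyword)
        for synonym in synonyms_field.split(","):
            synonym = synonym.strip()
            if synonym and synonym.lower() not in seen:
                seen.add(synonym.lower())
                lines.append(f"{indent}{{{synonym}}}")
    return "\n".join(lines) + "\n"


# Hypothetical sample in the two-column layout described in the first post.
sample_csv = 'car,"car, automobile, jalopy, auto"\nbird,"bird, birds, feathers"\n'
text = rows_to_thesaurus_text(csv.reader(io.StringIO(sample_csv)))
print(text)
```

For a real spreadsheet, save it as CSV from Excel and pass `csv.reader(open(path, newline="", encoding="utf-8"))` instead of the in-memory sample.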

Mario

The format is explained in the Thesaurus help under the Export to Text topic, and the Import Text refers to that.

If you are unsure about the correct format, just create a keyword with synonyms in the IMatch thesaurus, maybe a group level, and export it to text. The resulting format is what IMatch can also re-import.
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

Pickwick Kelly

Thanks! That's very helpful. Just one question: if I install the IMatch thesaurus and then also import my custom thesaurus, will I be able to select which one to use, or will my custom thesaurus data be merged with the IMatch one? I want my thesaurus to contain only the categories I've set up in my Excel spreadsheet, without additional thesaurus data added to it.

Mario

There is only one thesaurus. It can hold a virtually unlimited number of elements for each metadata tag (keywords just being one tag).

Pickwick Kelly

Thanks! Apologies, but I'm still not quite clear. What I'm trying to achieve is for the metadata I have applied to each image to be associated only with the synonyms I have selected in my custom thesaurus, without other synonyms from the IMatch thesaurus also linking to my metadata. Is that possible? In other words, the custom thesaurus data I import would be the only data present in the thesaurus.

Mario

I'm not sure that I follow.

When you assign a keyword to a file, all synonyms of this keyword are also assigned. That's how it must be.
Where do you select synonyms? You cannot select synonyms in IMatch...?

jch2103

Quote from: Pickwick Kelly on June 04, 2022, 10:24:12 PM
Thanks! Apologies, but I'm still not quite clear. What I'm trying to achieve is for the metadata I have applied to each image to be associated only with the synonyms I have selected in my custom thesaurus, without other synonyms from the IMatch thesaurus also linking to my metadata. Is that possible? In other words, the custom thesaurus data I import would be the only data present in the thesaurus.

It might be helpful if you could provide a short example from your spreadsheet of what you're trying to implement.

Do I understand correctly that you want to import only your custom thesaurus information and not include anything from the one that is available from IMatch (i.e., completely replace the IMatch thesaurus with your custom one)? I haven't done a thesaurus import lately, but I believe there's an option to either replace or merge thesaurus entries. Would this do what you want?
John

Pickwick Kelly

Thanks both! Re the last post: Yes. You do understand correctly. Is there an option to replace rather than merge thesaurus entries? That is, to remove all existing thesaurus data and replace it with the imported custom data. Because hopefully that will do what I want.

Mario

Quote: Is there an option to replace rather than merge thesaurus entries?

When you import a thesaurus, the Thesaurus Manager lists all tags contained in the thesaurus you import and asks whether you want to merge or replace. Does this not happen for you?

Pickwick Kelly

Thanks very much for previous help. Much appreciated. However I'm still not managing to get the import to work. I have managed to manually add a few synonyms, and exported them as .txt. Then added additional synonyms, and then re-imported the data. 

All good so far. Shows up as tree in screengrab Screen1, attached. Car is the new entry with its synonyms, added prior to re-importation. So I expected to be able to Search for 'jalopy' and for the results for 'car' to be displayed. Instead the Search doesn't produce any results. Then I tried searching for one of the synonyms that were manually entered to begin with. And that search also didn't work.

When the database was first created I unchecked the Thesaurus default box. I wondered if that was the problem. Do I need to enable the Thesaurus? Under the Edit / Preferences menu I rechecked the two items in the Keyword Lookup section to look up Thesaurus and Assign Synonyms in screengrab Screen2. But still the search for 'jalopy' doesn't produce the results for 'car'. Is there something else I need to do to get the Search to recognize the synonyms - because at the moment Search can't find any of the synonyms - either the manually input ones or the imported additions? 

Mario

Quote: When the database was first created I unchecked the Thesaurus default box.
Why? These options are on by default for a reason.
But these options don't impact the contents of your thesaurus, just how existing keywords imported from your files are mapped into hierarchical keywords. See Lookup keywords via thesaurus for detailed information and advice.

Quote: So I expected to be able to Search for 'jalopy' and for the results for 'car' to be displayed.
You search where?
In the Thesaurus Manager? It definitely also searches in synonyms.
Or the File Window search bar? The search bar searches the keywords in your files. If the synonyms were not added to the files, they cannot be found.

Note that changing the contents of your thesaurus does not affect existing keywords in your files. If you add synonyms to the thesaurus, they will be used from then on, not backwards in time.

Pickwick Kelly

Thanks! Apologies, I think I should have explained more clearly what I am trying to do.

I can search using the File Window Search Bar for, say, one of the metadata terms attached to the images, which then returns results for all those images with that comma delimited term - eg 'car'.

Then what I want to do is to import a custom thesaurus that is a text file containing a sub-set of frequently used metadata keywords (used in the image metadata) and also adds synonyms to these - eg 'car', but also 'automobile', 'jalopy', 'auto' etc. I think I may have worked out how to do this now. 

And then what I want my users to be able to do is to search using IMatch Anywhere for one of the words that is NOT in the individual image metadata, but which IS in the thesaurus, e.g. 'automobile'. They may not think to search for 'car', but might instead search for 'automobile'. None of the images have 'automobile' in their metadata, but they do have the word 'car'. I want them to be able to search in IMatch Anywhere for 'automobile' and have it bring back all the images with 'car' in their metadata.

Could you suggest a way I might achieve this?

Mario

Quote: Search using iMatch Anywhere for one of the words that is NOT in the individual image metadata, but which IS in the Thesaurus. - eg 'automobile'.
This cannot work. IMatch searches the metadata (in your case, keywords) in the database. It does not search in the thesaurus.

If you add the synonym "automobile" to the keyword "car" and you then assign the keyword "car" to a file, IMatch also assigns the keyword "automobile". That's the purpose of synonyms - to add multiple keywords by adding one keyword. The same is true for keyword links, but they are even more powerful than synonyms.

IMatch and IMatch Anywhere have no functions that include the thesaurus in searches performed by users. They can only search for actual keywords stored in the file (or database).

In almost all cases, the synonyms used in the thesaurus are also assigned as keywords, so this works automatically. If you have synonyms only in the thesaurus but not assigned as keywords to files, the search cannot find them.

This is not a search engine like Google which maintains extensive synonym lists and atomic word stems for returning results for words similar to the words a user has used to search.
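
The behavior described above can be pictured with a small toy model. This is only a sketch of the concept, not IMatch code: the dictionaries and the `assign_keyword` and `search` functions are hypothetical stand-ins for the thesaurus, the database and the search bar.

```python
# Toy model: the thesaurus maps a keyword to its synonyms, and each file
# stores the set of keywords that were actually written to it.
thesaurus = {"car": {"auto"}}
files = {}


def assign_keyword(filename, keyword):
    """Assign a keyword plus whatever synonyms the thesaurus holds right now."""
    keywords = files.setdefault(filename, set())
    keywords.add(keyword)
    keywords.update(thesaurus.get(keyword, set()))


def search(term):
    """Search only the keywords stored with the files, never the thesaurus."""
    return sorted(name for name, keywords in files.items() if term in keywords)


assign_keyword("a.jpg", "car")      # stored as: car, auto
thesaurus["car"].add("automobile")  # synonym added to the thesaurus later
assign_keyword("b.jpg", "car")      # stored as: car, auto, automobile

print(search("auto"))        # both files
print(search("automobile"))  # only b.jpg; a.jpg was keyworded before the change
```

In this model a search for 'automobile' misses a.jpg for the same reason a search for 'jalopy' misses the existing 'car' images: the word exists in the thesaurus but was never stored with the file.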

Tveloso

As Mario says, synonyms you add to your thesaurus are not automatically added to images that already have the keyword those synonyms belong to. But the DYK app contains an article describing the procedure for doing that very thing (it's topic #40 in the app's table of contents).

And Mario has announced that IMatch 2023 will deliver that functionality integrated within the Thesaurus Manager itself:
https://www.photools.com/community/index.php/topic,13150.0.html
--Tony
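
Conceptually, adding synonyms to files that already carry the keyword amounts to a one-time backfill. The sketch below illustrates that idea on plain Python data. It is not the procedure from the DYK article and not an IMatch API; `backfill_synonyms` and the sample data are hypothetical.

```python
def backfill_synonyms(files, thesaurus):
    """Add each keyword's synonyms to every file that already has the keyword.

    files:     dict mapping file name -> set of keywords stored with the file
    thesaurus: dict mapping keyword -> set of synonyms
    Returns the sorted names of the files that were changed.
    """
    changed = []
    for name, keywords in files.items():
        missing = set()
        for keyword in keywords:
            missing |= thesaurus.get(keyword, set()) - keywords
        if missing:
            keywords |= missing  # updates the set stored in `files` in place
            changed.append(name)
    return sorted(changed)


# Hypothetical sample data.
thesaurus = {"car": {"automobile", "jalopy", "auto"}}
files = {
    "a.jpg": {"car", "street"},
    "b.jpg": {"bird"},
    "c.jpg": {"car", "automobile", "jalopy", "auto"},
}
changed = backfill_synonyms(files, thesaurus)
print(changed)
```

After the backfill, a search over the stored keywords for 'automobile' would also match a.jpg, which is the result the original question was after.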