Thesaurus - to group/exclude or not to ....?

Started by DigPeter, July 19, 2013, 05:44:49 PM

DigPeter

Help page "Universal Thesaurus" explains Group and Exclude levels as follows:

Using Group Levels
This is where group levels come into play. When you enable this property for an element in the thesaurus, IMatch does not include this element anymore when producing keywords from thesaurus entries.

When we mark the WHERE element in our above example as a group level, several things change:

1. The WHERE element is displayed with a different icon in the Thesaurus Manager

2. WHERE is displayed with brackets in the Keyword Panel to indicate that it is a group level

3. When you insert a keyword from the thesaurus, WHERE is no longer included in the keyword. Inserting the keyword Daytona now results in the following keyword:

beach|Daytona

WHERE is a group level and ignored when creating hierarchical keywords from thesaurus entries.

You may want to use group levels when you use the classic 5 W's at the top of your thesaurus, or any other type of ordering which creates elements and levels you don't want to include in keywords.

Excluding in Flat Keywords
This property is only available for thesaurus elements for the special Keywords entry. It works similar to the group level property, but it is applied when IMatch copies (migrates) hierarchical keywords into XMP and IPTC metadata. When you enable this option, IMatch ignores the corresponding element when copying keywords from the internal hierarchical representation into the non-hierarchical IPTC and XMP keywords.

The difference between the group level property and the exclude in flat keywords property is when they are applied:

Group level elements are already excluded when a hierarchical keyword is assigned from the thesaurus. They don't show up in the Keyword Panel at all.
Excluded elements are skipped when IMatch copies hierarchical keywords into the flat IPTC and XMP keywords. They show up in the hierarchical keywords, but not in IPTC and XMP keywords.


In Thesaurus Manager, the note against Exclude states that this option is applied only when Group is not enabled.

My understanding is that:

- Group level applies only when entering keywords in the Keyword Panel.
- Exclude level is applied when importing/ingesting files which already have keywords in their metadata.
- BUT only one or the other can be in use at any time.  Thus, for creating a database by adding existing files with embedded keywords, Group must be off and Exclude on.  When these files have been ingested and write-back completed, Group must be on, but Exclude can be off or on.

Is this correct?

Ferdinand

Quote from: DigPeter on July 19, 2013, 05:44:49 PM
My understanding is that:

- Group level applies only when entering keywords in the Keyword Panel.
- Exclude level is applied when importing/ingesting files which already have keywords in their metadata.
- BUT only one or the other can be in use at any time.  Thus, for creating a database by adding existing files with embedded keywords, Group must be off and Exclude on.  When these files have been ingested and write-back completed, Group must be on, but Exclude can be off or on.

Is this correct?

First of all Peter, as we have discussed in other threads, there are bugs in the way that IMatch 5.0.102 looks up flat keywords in the thesaurus, and so you can't use existing behaviour to understand this.

I think it is a little simpler.  It is easiest to understand from how keywords are written.  The quote from the help file is all about writing keywords:

~ Both Group and Exclude affect how keywords are written
~ Group will mean that that node is NOT written to either the hierarchical keywords OR flat keywords.  It's only displayed in the thesaurus as an organisational device.
~ Exclude will mean that the node IS written to hierarchical keywords but NOT to flat keywords

So if you think about it, if you enable Group, then there's no point enabling Exclude, as you've already excluded the node from flat keywords, and then some.  Group is a stronger form of Exclude.  (It was I who suggested to Mario that if Group is enabled, then the Exclude option in the thesaurus should be disabled for that node, so that people don't get confused.)
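
To make the three rules above concrete, here is a minimal Python sketch. It is only a model of the behaviour described in this post, not IMatch code: the names Node and written_keywords are made up for illustration, and it assumes that every level that is not excluded is written as a separate flat keyword.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One thesaurus level (hypothetical model, not an IMatch type)."""
    name: str
    group: bool = False    # the "Group level" property
    exclude: bool = False  # the "Exclude in flat keywords" property


def written_keywords(path):
    """Return (hierarchical keyword, flat keywords) for a thesaurus path."""
    # Group nodes are dropped before the hierarchical keyword is built.
    hier_nodes = [n for n in path if not n.group]
    # Exclude nodes stay in the hierarchy but are skipped for flat keywords.
    flat = [n.name for n in hier_nodes if not n.exclude]
    return "|".join(n.name for n in hier_nodes), flat


print(written_keywords([Node("A"), Node("B")]))                # ('A|B', ['A', 'B'])
print(written_keywords([Node("A", exclude=True), Node("B")]))  # ('A|B', ['B'])
print(written_keywords([Node("A", group=True), Node("B")]))    # ('B', ['B'])
```

In this model, setting Exclude on a node that is already Group changes nothing, which is why Group reads as the stronger of the two.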

Now suppose that you import a file containing a flat keyword of "Daytona" in IPTC or XMP, but no hierarchical keywords.  If you enable the option to look up keywords in the thesaurus, then IMatch will find the thesaurus entry containing "Daytona" (not sure what happens if there is more than one).  The help file says:

"A file contains the keyword "Daytona" and your thesaurus contains an entry for the hierarchical keyword "Location|Beach|Daytona", IMatch maps the keyword in the file to this hierarchical keyword automatically."

There's no mention of Group or Exclude here, and so I assume that they don't matter in terms of the initial lookup.  The lookup should work in either case. My understanding is that they will matter in the subsequent write-out of the keyword at a later time, since they will determine what gets written where, as per above.
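
A rough Python sketch of that lookup, under stated assumptions: the thesaurus is a plain list of hierarchical paths, a flat keyword matches the last level of a path exactly, and all matches are returned because the behaviour with more than one match is not known here. lookup_flat_keyword is a hypothetical helper, not an IMatch function, and it deliberately ignores Group and Exclude.

```python
def lookup_flat_keyword(flat_keyword, thesaurus_paths):
    """Return every hierarchical path whose last level equals the flat keyword."""
    matches = []
    for path in thesaurus_paths:
        leaf = path.split("|")[-1]  # last level of the hierarchical keyword
        if leaf == flat_keyword:
            matches.append(path)
    return matches


thesaurus = ["Location|Beach|Daytona", "Location|City|Miami"]
print(lookup_flat_keyword("Daytona", thesaurus))  # ['Location|Beach|Daytona']
```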

But we can't test the import direction until that bug is fixed.

DigPeter

Quote from: Ferdinand on July 20, 2013, 09:59:29 AM
First of all Peter, as we have discussed in other threads, there are bugs in the way that IMatch 5.0.102 looks up flat keywords in the thesaurus, and so you can't use existing behaviour to understand this.

Thanks Ferdinand for the full response.  Yes - I was trying to understand the difference between 'Exclude' and 'Group', which I think I have now done.  My previously reported bug ( https://www.photools.com/community/index.php?topic=356.0 ) relates only to ingesting images with keywords already embedded (e.g. 'Exclude').

Quote
There's no mention of Group or Exclude here, and so I assume that they don't matter in terms of the initial lookup.  The lookup should work in either case. My understanding is that they will matter in the subsequent write-out of the keyword at a later time, since they will determine what gets written where, as per above.

Thanks again - your explanation is very clear.  A small test with a file already in the database gives these results with hierarchical keyword A|B, when applied through the thesaurus:

Applied to top node:

Exclude flat kw   Group level   Keyword   Category panel
no                no            A|B       A|B
yes               no            A|B       A|B
no                yes           B         B

Quote
But we can't test the import direction until that bug is fixed.

This test throws up another potential bug.  I think that in the last example, the top node should be retained in the Category panel and read A|B.  I do not know if this is a manifestation of the original bug.


The thought at the back of my mind when writing the OP was that having 2 situations, with either 'Exclude' or 'Group' as 'yes', would mean that these properties would need to be changed to fit whatever is being done at the time.  Assuming that the normal state is 'Group' for applying keywords directly onto the keyword panel from the thesaurus, the node properties would need to be changed to 'Exclude' whenever files with embedded keywords were imported.  This might not be too onerous if it only involved the top nodes of the 5-Ws, or if the need to import new files with embedded keywords were occasional.  In my case, I have at present 16 top and secondary nodes which need 'Exclude' or 'Group':  for instance Subject|Building with leaf nodes of different types of building.  I do not want Subject and Building in flat keywords, but I do want them in the Category panel.  (Perhaps I need to review my category structure!?)

I wonder why it is necessary to have both 'Exclude' and 'Group'.  I would have thought that people want the same result whatever method they use to apply keywords.

Ferdinand

Peter -  What appears in the keyword panel should correspond exactly with what appears under @Keywords in the category view or the category panel.  They are the same thing, or should be.  As your test shows.

Your test misses the point of Exclude, which is what happens to flat keywords (IPTC:Keywords & XMP:Subject).  You won't see these fields under keywords and probably won't see them in the category view.  You'll need to look in the metadata browser, or create your own metadata layout that shows these fields, or examine it in exiftoolgui.  My version of your test is

Applied to top node:

Exclude flat kw   Group level   Keyword   Flat keywords
no                no            A|B       A, B
yes               no            A|B       B
no                yes           B         B


In both Group and Exclude, "A" is not written to flat keywords.  In Group, it isn't written to hierarchical keywords either - it only appears in the thesaurus, as an organisation device.  That's the difference.

Given this, I haven't replied to the rest of the post.

Ferdinand

Quote from: DigPeter on July 20, 2013, 06:52:19 PM
I do not want Subject and Building in flat keywords, but I do want them in the Category panel.  (Perhaps I need to review my category structure!?)

I thought about this a little more.  I'm not sure I understand all your last paragraph, but the answer to this quoted question is that you want Subject and Building to be Exclude and not Group.  But two points to note:

1. This applies when assigning keywords from the thesaurus.
2.  It will mean that you get "Subject|Building|xxxx|yyyy" in the XMP:HierarchicalSubject field, even though you won't get Subject and Building in flat keywords.

I suspect you want to import files with just flat keywords and have the link made to the thesaurus immediately using the "lookup keywords in thesaurus" option, right?  Will this work?  I don't know and I can't find out until that bug in this option is fixed. 

If it won't work even then, the solution to your problem is to use my 3.6 to 5 keywords migration script in IMatch 3.6 before you import the files into IMatch 5, but I would need to give you some additional tips.

DigPeter

@Ferdinand
Thanks for the last 2 posts.  I have now done my test again with a new database, and the Exclude test now agrees with yours.  I suspect that the problem with the previous database was that it had been used for a number of different purposes and had become a bit of a mess.  This brings me back to the wording of the help file which I quoted in the OP.  The part concerning 'Exclude' is repeated below:

Quote
Excluding in Flat Keywords
This property is only available for thesaurus elements for the special Keywords entry. It works similar to the group level property, but it is applied when IMatch copies (migrates) hierarchical keywords into XMP and IPTC metadata. When you enable this option, IMatch ignores the corresponding element when copying keywords from the internal hierarchical representation into the non-hierarchical IPTC and XMP keywords.

The difference between the group level property and the exclude in flat keywords property is when they are applied:

Group level elements are already excluded when a hierarchical keyword is assigned from the thesaurus. They don't show up in the Keyword Panel at all.
Excluded elements are skipped when IMatch copies hierarchical keywords into the flat IPTC and XMP keywords. They show up in the hierarchical keywords, but not in IPTC and XMP keywords.

I read the words in red highlight as referring only to importing and ingesting files.  This would seem to be a misinterpretation.  As I was not absolutely clear about this, I asked the question at the end of the OP.  I think it is now clarified.

Ferdinand

My reading of this is that when you assign a (hierarchical) keyword using the thesaurus, IMatch also writes flat keywords to XMP/IPTC as well as the hierarchical one (if this option is set), and Exclude (and Group) control how these flat keywords are written.  But the phrase "copies (migrates)" might be ambiguous to some people.  If you think so, then perhaps you could make a suggestion to Mario about how to word this.