Assigning a category from 'xmp dc title'

Started by DigPeter, August 16, 2013, 11:53:15 PM


DigPeter

I would like my files automatically to be assigned to a category of the same name as that contained in the metadata tag 'xmp dc title'.  I do not think that this is possible with the present build and it would probably be too specialist a requirement for it to be implemented in the future.  It is possible that this might be the subject of a script.  In IM3.6 I have a script for this (kindly provided, if I remember correctly, by Ferdinand).

Explanation

A large proportion of my images is of plants.  In IM5, the metadata tag 'xmp dc title' for each plant image contains only the scientific species name. The tag  'xmp: lr hierarchical keywords' contains 'plant type|family|species name'.  So, if the species name in 'title' is "Bellis perennis", the hierarchical keyword  is "Taxa flowering plants|Asteraceae|Bellis perennis".  Species names are unique.   The thesaurus also includes all these keywords in hierarchical format.

I wish to adopt a workflow for new images as follows:
- In the metadata panel, write the species name into dc:title - e.g. "Bellis perennis"
- Automatically assign each selected image in a batch to a matching hierarchical category in @keywords - e.g. "Taxa flowering plants|Asteraceae|Bellis perennis"
- If there is no matching category in @keywords, thesaurus should be searched for a match.
- Failing a match in thesaurus, a new keyword would be created there and then assigned manually in the keyword panel.
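The lookup order in the list above can be sketched in Python. This only illustrates the logic and is not IMatch scripting: the function name `find_hierarchical_keyword` and the two plain lists standing in for @keywords and the thesaurus are assumptions made for the example.

```python
def leaf(path, sep="|"):
    """Return the last level of a hierarchical keyword path."""
    return path.split(sep)[-1].strip()


def find_hierarchical_keyword(title, keyword_paths, thesaurus_paths):
    """Find the full path whose leaf equals the title.

    The existing @keywords paths are searched first, then the thesaurus.
    Returns (path, source), or (None, None) when nothing matches and the
    keyword would have to be created by hand.
    """
    wanted = title.strip().casefold()
    for source, paths in (("keywords", keyword_paths),
                          ("thesaurus", thesaurus_paths)):
        for path in paths:
            if leaf(path).casefold() == wanted:
                return path, source
    return None, None


# Hypothetical sample data, using the species named in the post.
keywords = ["Taxa flowering plants|Asteraceae|Bellis perennis"]
thesaurus = keywords + ["Taxa flowering plants|Asteraceae|Achillea millefolium"]

print(find_hierarchical_keyword("Bellis perennis", keywords, thesaurus))
```

Because species names are unique, matching on the leaf alone is enough to recover the full path in this sketch.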

Justification

- To avoid entering the same information twice, particularly when there are many images to be processed.
- To ensure accuracy and integrity between title and keyword and between images of identical species.






Richard

Quote
The unique data-driven category concept of IMatch allows you to categorize your images based on EXIF, IPTC, XMP, ID3 and virtually all other metadata IMatch maintains in the database.

See "Data-driven Categories" in Help.

dcb

Richard's suggestion for Data-Driven categories is perfect for the data that's already there. I read something different from your post, in that you're talking about new photos.

My suggestion is to flip your process order around a little.

1. Assign the species via the @keywords hierarchical category
2. For new species that are not yet in your database, add them to a new keyword in the @keywords hierarchical category (possibly before step 1 on each day you categorise, then no need for the next step)
3. Import the @keywords into the Thesaurus for easier assignment and consistency (need only be done when you feel like it)
4. Create a Metadata template to copy the leaf key of the keywords into the DC Title using

    {File.Categories|filter:^@Keywords\|Taxa flowering plants;level:leaf}

Then, you could also use the Data-driven category on the DC Title field to check for consistent assignment, spelling, etc.
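As a rough illustration of what the variable in step 4 appears intended to do, the Python sketch below keeps only the categories that match the filter expression and returns their leaf level. How IMatch actually evaluates the variable is not shown in this thread; the function name `leaf_titles` and the second sample category are assumptions.

```python
import re


def leaf_titles(categories, pattern=r"^@Keywords\|Taxa flowering plants"):
    """Keep categories matching the filter and return their leaf level."""
    rx = re.compile(pattern)
    return [c.split("|")[-1] for c in categories if rx.search(c)]


# Hypothetical categories of one file; the second one is filtered out.
file_categories = [
    "@Keywords|Taxa flowering plants|Asteraceae|Bellis perennis",
    "@Keywords|Locations|Garden",
]

print(leaf_titles(file_categories))  # ['Bellis perennis']
```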

You might consider your given order as IM3.6 thinking, whereas this is IM5 thinking.
Have you backed up your photos today?

Mario

Quote
- Automatically assign each selected image in a batch to a matching hierarchical category in @keywords - e.g. "Taxa flowering plants|Asteraceae|Bellis perennis"

IMatch always applies the keywords you set in the Keyword Panel to all selected images "in batch".

Quote
- If there is no matching category in @keywords, thesaurus should be searched for a match.

When you add new keywords in the Keyword Panel, IMatch automatically creates matching categories under @Keywords.

Quote
- Failing a match in thesaurus, a new keyword would be created there and then assigned manually in the keyword panel.

If I understand you correctly, you want IMatch to have an automatic function which takes an arbitrary metadata tag (in your case dc:title), puts this text somewhere in the thesaurus, creates keywords in the file following some kind of "where to put the keyword in the hierarchy" logic, adds categories under @Keywords as needed, and so on? This is not possible.

You can cover most of it with a Metadata Template which appends your dc:title to the hierarchical keywords. This will automatically update @Keywords as well, because @Keywords is synchronized with the hierarchical keywords in the file. When you update your thesaurus from the database from time to time, these keywords will also be imported into the thesaurus.

But as dcb pointed out, you may do it the other way round, deriving the title from the keywords in a Metadata Template. This also avoids duplicate entries.

Check out the help on Metadata Templates for more details. Note that you can use variables to specify the values the Metadata Template fills in, which makes it easy to copy tags to other tags. And since variables support powerful formatting functions, you can manipulate the variable value before it is assigned to the output tag - which includes the ability to access only parts of your keyword hierarchy, like the leaf level (which is your title).

DigPeter

Quote from: Richard on August 17, 2013, 01:23:18 AM
Quote
The unique data-driven category concept of IMatch allows you to categorize your images based on EXIF, IPTC, XMP, ID3 and virtually all other metadata IMatch maintains in the database.

See "Data-driven Categories" in Help.
Thanks - this produces data driven categories.  I require the categories to be in @keywords.

DigPeter

Quote from: dcb on August 17, 2013, 03:20:19 AM
Richard's suggestion for Data-Driven categories is perfect for the data that's already there. I read something different from your post, in that you're talking about new photos.
Thanks - that is correct.

Quote
My suggestion is to flip your process order around a little.

1. Assign the species via the @keywords hierarchical category
2. For new species that are not yet in your database, add them to a new keyword in the @keywords hierarchical category (possibly before step 1 on each day you categorise, then no need for the next step)
3. Import the @keywords into the Thesaurus for easier assignment and consistency (need only be done when you feel like it)
4. Create a Metadata template to copy the leaf key of the keywords into the DC Title using

    {File.Categories|filter:^@Keywords\|Taxa flowering plants;level:leaf}

Then, you could also use the Data-driven category on the DC Title field to check for consistent assignment, spelling, etc.

You might consider your given order as IM3.6 thinking, whereas this is IM5 thinking.
Thanks for this idea, which I will investigate.  Not being a technical person, I might need more guidance in due course ;D

DigPeter

Quote from: Mario on August 17, 2013, 10:04:31 AM
If I understand you correctly, you want IMatch to have an automatic function which takes an arbitrary metadata tag (in your case dc:title), puts this text somewhere in the thesaurus, creates keywords in the file following some kind of "where to put the keyword in the hierarchy" logic, adds categories under @Keywords as needed, and so on? This is not possible.
Thanks, Mario, this confirms my thoughts.

Quote
You can cover most of it with a Metadata Template which appends your dc:title to the hierarchical keywords. This will automatically update @Keywords as well, because @Keywords is synchronized with the hierarchical keywords in the file. When you update your thesaurus from the database from time to time, these keywords will also be imported into the thesaurus.
I think that the problem here is that dc:title only has the leaf keyword, but I want it to fit into the hierarchy in @keywords and thesaurus.
Quote
But as dcb pointed out, you may do it the other way round, deriving the title from the keywords in a Metadata Template. This also avoids duplicate entries.

Check out the help on Metadata Templates for more details. Note that you can use variables to specify the values the Metadata Template fills in, which makes it easy to copy tags to other tags. And since variables support powerful formatting functions, you can manipulate the variable value before it is assigned to the output tag - which includes the ability to access only parts of your keyword hierarchy, like the leaf level (which is your title).
I will look into dcb's ideas. 

Richard

My suggestion was in response to "I would like my files automatically to be assigned to a category of the same name as that contained in the metadata tag 'xmp dc title'. " That sentence says nothing about keywords.

DigPeter

Quote from: Richard on August 17, 2013, 03:38:49 PM
My suggestion was in response to "I would like my files automatically to be assigned to a category of the same name as that contained in the metadata tag 'xmp dc title'. " That sentence says nothing about keywords.
Understood.

DigPeter

Quote from: dcb on August 17, 2013, 03:20:19 AM
My suggestion is to flip your process order around a little.

1. Assign the species via the @keywords hierarchical category
2. For new species that are not yet in your database, add them to a new keyword in the @keywords hierarchical category (possibly before step 1 on each day you categorise, then no need for the next step)
3. Import the @keywords into the Thesaurus for easier assignment and consistency (need only be done when you feel like it)
4. Create a Metadata template to copy the leaf key of the keywords into the DC Title using

    {File.Categories|filter:^@Keywords\|Taxa flowering plants;level:leaf}

@dcb
Yes, I think I can use this workflow, with a slight modification.  My primary source of species names is the thesaurus.  So, for me, the sequence will be:
1. Assign species from the Thesaurus.
2. For new species that are not yet in the thesaurus, add them in the appropriate place in the hierarchy, then assign them to the image(s).
3. Write back will automatically add the new assignments to categories in @keywords.
4. As above: a Metadata Template copies the leaf keyword into dc:title.

This last step is the fixer!  I had not played with IM5's metadata template, because it was not a function which I used in IM3.6, where I used splasher when I wanted to add common categories in batch mode.

However, this workflow will not allow categories to be assigned to a batch of images of differing species.  The IM3.6 script allows for this.  I am not sure yet whether this will be a problem.  In IM3.6, the time was spent entering the species names in the iptc object name field.  This will not be necessary with the proposed IM5 workflow, but grouping images by species, finding the name in the thesaurus and entering new names there will take time.  With the excellent search facility in the thesaurus, I am hoping this will not be too much of a problem.  I need to set up a test with a large number of images.
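The grouping step mentioned above can be pictured with a small Python sketch that collects the files of a mixed batch by species name, so that each group can then be assigned in one go. This is not the IM3.6 script and not IMatch scripting; the function name `group_by_title` and the file names are made up for illustration.

```python
from collections import defaultdict


def group_by_title(titles_by_file):
    """Group file names by their species name (surrounding spaces ignored)."""
    groups = defaultdict(list)
    for filename, title in titles_by_file.items():
        groups[title.strip()].append(filename)
    return dict(groups)


# Hypothetical batch containing images of two species.
batch = {
    "img_001.jpg": "Bellis perennis",
    "img_002.jpg": "Achillea millefolium",
    "img_003.jpg": "Bellis perennis",
}

for species, files in group_by_title(batch).items():
    print(species, files)
```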

DigPeter

After further testing I find that, for species which have a synonym, both the species name and the synonym are written to dc:title by the template.  Is there a method of limiting this to the species name?

Mario

When you assign a keyword in the Keyword Panel via the thesaurus, the synonyms are also assigned and become regular keywords (keywords have no concept of synonyms). When IMatch mirrors the keywords of the file in the @Keywords category, your synonyms hence get their own child categories. And when you use the variable which returns the categories of a file, you get all the categories. You cannot do something like "filter out all categories which have been created by adding keywords from the thesaurus but which are in fact synonyms".

Your requirements are really special and maybe not everything you need can be achieved directly.
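The effect described here can be illustrated with a short Python sketch. It is not IMatch code: the dictionary layout, the placeholder synonym name and the assumption that a synonym ends up under the same parent as the species are all invented for the example. Once the keywords are flattened, nothing marks which one was the synonym, so any filtering needs a separately maintained list.

```python
# Hypothetical thesaurus entry: a preferred name plus its synonyms.
thesaurus_entry = {
    "path": "Taxa flowering plants|Asteraceae|Bellis perennis",
    "synonyms": ["Example synonym"],  # placeholder, not a real name
}


def assign(entry):
    """Return the plain keywords that result from assigning the entry.

    Assumption: each synonym becomes a keyword under the same parent.
    """
    parent = entry["path"].rsplit("|", 1)[0]
    return [entry["path"]] + [f"{parent}|{s}" for s in entry["synonyms"]]


def leaves(paths):
    """Return the leaf level of each hierarchical keyword."""
    return [p.split("|")[-1] for p in paths]


assigned = assign(thesaurus_entry)
print(leaves(assigned))  # ['Bellis perennis', 'Example synonym']

# Filtering is only possible with outside knowledge of the synonyms.
known_synonyms = {"Example synonym"}
print([name for name in leaves(assigned) if name not in known_synonyms])
```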

DigPeter

Quote from: Mario on August 18, 2013, 12:58:04 PM
When you assign a keyword in the Keyword Panel via the thesaurus, the synonyms are also assigned and become regular keywords (keywords have no concept of synonyms). When IMatch mirrors the keywords of the file in the @Keywords category, your synonyms hence get their own child categories. And when you use the variable which returns the categories of a file, you get all the categories. You cannot do something like "filter out all categories which have been created by adding keywords from the thesaurus but which are in fact synonyms".

Your requirements are really special and maybe not everything you need can be achieved directly.

I accept that this is a special requirement.  So I am left with these options: have no synonyms; have two thesauri, one with and one without synonyms, importing and exporting as needed (how clumsy can one get?); or hope that some nice person offers to write a script.

Mario

IMatch is very flexible and can do a lot of stuff. But sometimes users want such a specific kind of functionality that it cannot be handled with all the tools at hand. We came up with so many approaches in this thread - if none of these fits your needs, a purpose-built script will allow you to do what you want.

DigPeter

Quote from: Mario on August 18, 2013, 09:05:30 PM
IMatch is very flexible and can do a lot of stuff. But sometimes users want such a specific kind of functionality that it cannot be handled with all the tools at hand. We came up with so many approaches in this thread - if none of these fits your needs, a purpose-built script will allow you to do what you want.
Yes - I do understand that.  Problem is I do not do scripting - somehow cannot get my head round it   :'(

Ferdinand

If it involves scripting and the thesaurus then it will have to wait, as we don't yet have scripting methods for the thesaurus.

DigPeter

Quote from: Ferdinand on August 19, 2013, 09:02:45 AM
If it involves scripting and the thesaurus then it will have to wait, as we don't yet have scripting methods for the thesaurus.
I can wait  ;D