The new @MetadataTag Category Formula

Started by Mario, November 05, 2017, 01:25:33 PM


Mario

A recent discussion gave me an idea of how I can improve the usefulness of categories for many users. Thanks to the new @MetadataTag[] function we can now create categories which do things like:

+ Show all files with/without a title
+ Show all files with/without a copyright notice
+ Show all files with/without keywords
+ Show files with ISO values within a given range
+ Show files with a metadata tag matching/not matching a given regular expression
...
(This of course works with all metadata tags!)

The new function works similarly to the metadata filter. It is also a bit like a data-driven category which produces only one result category.
Here is an example:

"@MetadataTag[title,novalue]"

This formula returns all files without a title. And this

"@MetadataTag[cropped,regexp,^True]"

returns all files which have a cropped tag (XMP) with the value True.

"@MetadataTag[iso,between,0,400]"

builds a category which contains files with an ISO value between 0 and 400.
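To make the three operators more concrete, here is a minimal Python sketch of what they select. This is not IMatch code: the FILES dictionary, the metadata_tag helper and the inclusive bounds for "between" are assumptions made purely for illustration.

```python
import re

# Hypothetical in-memory stand-in for a database: file name -> tag values.
FILES = {
    "a.jpg": {"title": "Sunset", "cropped": "True", "iso": 200},
    "b.jpg": {"title": "", "cropped": "False", "iso": 800},
    "c.jpg": {"iso": 100},
}

def metadata_tag(files, tag, op, *args):
    """Return the sorted names of files whose tag satisfies the operator."""
    result = []
    for name, tags in files.items():
        value = tags.get(tag)
        has_value = value not in (None, "")
        if op == "novalue":
            hit = not has_value
        elif op == "regexp":
            hit = has_value and re.search(args[0], str(value)) is not None
        elif op == "between":
            low, high = args  # assumption: both bounds are inclusive
            hit = has_value and low <= value <= high
        else:
            raise ValueError(f"unsupported operator: {op}")
        if hit:
            result.append(name)
    return sorted(result)

no_title = metadata_tag(FILES, "title", "novalue")
cropped = metadata_tag(FILES, "cropped", "regexp", "^True")
low_iso = metadata_tag(FILES, "iso", "between", 0, 400)
```

Each call yields a single flat list of files, which mirrors the "one result category" behavior described above.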

I used this formula to set up some basic workflow categories in my test database:



Performance

I first tried to model these workflow categories using data-driven categories based on variables. This worked but was very sloooow for databases with 100,000+ files.
It also always produced the results as child categories, because this is the nature of data-driven categories. They produce child categories.

With the new @MetadataTag formula I could solve the second problem (direct result) and improve the performance situation considerably.
@MetadataTag uses specialized database routines and an intelligent cache which reduces the number of database queries to a minimum. This should be good enough for most usage scenarios. Just don't overdo it if you use this new formula  ;)

Jingo

Wow... very nice Mario... are these persistent across sessions like the data-driven categories, and do they get updated each time new items are added?

Mario

Quote from: Jingo on November 05, 2017, 02:02:02 PM
Wow... very nice Mario... are these persistent across sessions like the data-driven categories, and do they get updated each time new items are added?
This is a normal category formula.  See the IMatch help for details on category formulas.
They are automatically re-calculated when metadata is changed (and during the database load).

thrinn

This sounds really useful, especially to model some kind of quality checks (in the sense of "Are all fields/tags I defined as required for me really filled?").

Again IMatch getting better and better... ;D
Thorsten
Win 10 / 64, IMatch 2018, IMA

JohnZeman

I agree, I can see myself using this quite often.  ;D

jch2103

Very nice! IMatch was already the best tool available for quality control work on one's metadata, but this will make it even easier.
John

mastodon

OMG, that is what I need for checking metadata completeness.  :)

Mario

Quote from: mastodon on November 05, 2017, 06:50:14 PM
OMG, that is what I need for checking metadata completeness.  :)
Precisely. Just don't overdo it (add too many). Performance is as good as I could make it, though.

ubacher

How is this different from setting up a filter to collect similar files?

I can think of the following:
- works on/selects from all files(?) while a filter works only on the current set.

   

Mario

Filters are dynamic; they work on the current scope.

This formula always works on the complete database. You create the category once and it will be always there.
When you load the category into the file window you can use the search bar and the filter panel to drill down further. Like with any other category.

The primary use cases for this new formula are workflow categories, quality control and "Where is there still work to do?"...

blackhead2

Currently I'm playing with this new feature and try to find files which show only one specific person (similar to this thread: https://www.photools.com/community/index.php?topic=7269.msg50493#msg50493). I already solved it with the @CatDistinct formula but tried to get the same result with the @MetadataTag formula.

My question is regarding the scope of the regular expression: Does IMatch apply the regular expression to each single tag value separated by ";" or to the whole term?

For example I have a picture with a hierarchical tag "people|max; people|moritz" and one with only "people|moritz". I tried the formula "@MetadataTagraw[hierarchicalkeywords,regexpraw,^people|moritz$]" as well as "@MetadataTag[hierarchicalkeywords,regexp,^people|moritz$]" but both return both files. Due to the anchors (^$) I would have expected that only the second picture is returned.

So is there a possibility to apply the regular expression to the whole tag text or am I missing something?

Regards

Jens

Mario

You are working with a repeatable tag (hierarchical keywords).
These are stored in the database as individual values:

people|max;
people|moritz

and the regular expression is applied to each one individually.
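A small Python sketch of this per-value matching, using the two example files from the question. It uses Python's re module as a stand-in; the FILES data and the files_matching helper are hypothetical, and whether the IMatch regular expression engine treats an unescaped | as alternation (as most engines do) is an assumption.

```python
import re

# Each file's hierarchical keywords, stored as separate values.
FILES = {
    "both.jpg": ["people|max", "people|moritz"],
    "moritz_only.jpg": ["people|moritz"],
}

def files_matching(files, pattern):
    """Return files where at least one individual value matches the pattern."""
    rx = re.compile(pattern)
    return sorted(
        name for name, values in files.items()
        if any(rx.search(value) for value in values)
    )

# Unescaped: "|" means "or", so this reads "starts with people OR ends with moritz".
unescaped = files_matching(FILES, r"^people|moritz$")

# Escaped: matches the literal value "people|moritz", but still per value,
# so a file that also has "people|max" is returned as well.
escaped = files_matching(FILES, r"^people\|moritz$")

# Only one of the files has a value that is exactly "people|max".
only_max = files_matching(FILES, r"^people\|max$")
```

With per-value matching, the anchors ^ and $ refer to the start and end of one keyword, not of the whole keyword list, which is why both files are returned for the "moritz" pattern.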

blackhead2

Ah, ok. That's a pity. If it could be applied to the "whole" tag value it would be easily possible to find pictures with various numbers of people using e.g. regexp repeats. Where can I find some information about which tags are repeatable and which are not? Or are all tags which can contain multiple values automatically repeatable?

Mario

Tags which can contain multiple values are repeatable tags (whether a tag is repeatable depends on the metadata standard it comes from).
IMatch must store each value individually - it does not store them as a comma-separated list or anything. The keyword panel just loads all values and shows them to you, separated with a ; typically.
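The difference between storage and display can be sketched in Python as follows. This is only an illustration under assumptions: the function names are hypothetical, and the "people|" prefix is taken from the example keywords earlier in this thread.

```python
def display_string(values):
    """Join individually stored values the way a panel might show them."""
    return "; ".join(values)

def shows_only(values, person):
    """True if the only people keyword among the values is the given person."""
    people = [v for v in values if v.startswith("people|")]
    return people == [f"people|{person}"]

stored = ["people|max", "people|moritz"]
shown = display_string(stored)

alone = shows_only(["places|berlin", "people|moritz"], "moritz")
not_alone = shows_only(stored, "moritz")
```

A question like "only this person" needs to look at all values of a file together, which a regular expression applied to one value at a time cannot express.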

Keywords is an example. See the official IPTC, EXIF, GPS, XMP, ID3, PDF, Office etc. standard documents to find out more.