Grouping numeric values in data-driven categories

Started by anmue, May 30, 2014, 09:05:10 PM


anmue

Hi,

I looked at the different options of the data-driven categories and I wonder if it is possible to group numeric values, e.g. the ISO values. I only found the 'auto-group' feature, but as this works on the literal string it would, for example, group 10, 100, 133, 1200, and 1600 under the 1 etc. For numerical values I would prefer to have groups that summarize a numerical range, like all between 100 and 200, all less than 50, or all greater than 3200.
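To make the difference concrete, here is a small Python sketch (not IMatch code; the function names and the bucket edges are made up for illustration) that compares grouping by the first character of the text form with grouping by numeric range:

```python
from bisect import bisect_right
from collections import defaultdict


def group_by_first_char(values):
    """Group values by the first character of their text form."""
    groups = defaultdict(list)
    for v in values:
        groups[str(v)[0]].append(v)
    return dict(groups)


def range_label(value, edges):
    """Return a label for the bucket that value falls into.

    edges is a sorted list of bucket boundaries (hypothetical values).
    """
    i = bisect_right(edges, value)
    if i == 0:
        return f"< {edges[0]}"
    if i == len(edges):
        return f">= {edges[-1]}"
    return f"{edges[i - 1]} to < {edges[i]}"


def group_by_range(values, edges):
    """Group values into numeric buckets defined by edges."""
    groups = defaultdict(list)
    for v in values:
        groups[range_label(v, edges)].append(v)
    return dict(groups)


iso_values = [10, 100, 133, 1200, 1600]
print(group_by_first_char(iso_values))
print(group_by_range(iso_values, [50, 100, 200, 3200]))
```

The first call puts all five values into one group because they all start with "1"; the second one separates them by magnitude, which is the behaviour asked for here.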

How can I achieve this kind of numerical grouping?

Regards Andreas.

Erik

When I tried to do something similar, I think I found I needed to create multiple Data Driven Categories under the root, use the Replace and Filter option for each, and enter a range of values to filter for...

For example, to create a category for ISO 100 to 800, I would enter 100,800 in the Numeric Range option.

I would then create a similar category as a sibling and change the numerical range, repeating the process for the number of ranges you'd want.

This could get tedious if you want a lot of categories, but it wouldn't be too tough if you are looking to use 5 or so.  And, if you like it, you could export the category scheme, so you wouldn't have to recreate them again.

jch2103

Note that you can include multiple ranges. From the Help:

Quote: Numeric Ranges. If you force the data type for the values to a numeric value (see Data Type above) you can specify one or more numerical ranges to include only certain values. The format is: lower,upper. You can specify multiple ranges using ; as a separator.

That said, I'm having problems getting the numeric range grouping to work myself... :(
John

Erik

Quote from: jch2103 on May 31, 2014, 12:20:49 AM
Note that you can include multiple ranges. From the Help:

Quote: Numeric Ranges. If you force the data type for the values to a numeric value (see Data Type above) you can specify one or more numerical ranges to include only certain values. The format is: lower,upper. You can specify multiple ranges using ; as a separator.

That said, I'm having problems getting the numeric range grouping to work myself... :(

Entering multiple numeric ranges combines them all into a single filter. So, if I were to enter 100,200;800,1600, all the images with ISO 100-200 and 800-1600 would be filed under the parent category, but the images outside those ranges (e.g. between 200 and 800, below 100, or above 1600) would not be included.
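A rough Python sketch of how such a filter string could be read (this is only an illustration of the lower,upper and ; format quoted from the Help, not how IMatch implements it; the function names are hypothetical and inclusive endpoints are an assumption):

```python
def parse_ranges(spec):
    """Parse a string like "100,200;800,1600" into (lower, upper) pairs."""
    ranges = []
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing separator
        lower, upper = (float(x) for x in part.split(","))
        ranges.append((lower, upper))
    return ranges


def in_any_range(value, ranges):
    """True if value lies in at least one range (endpoints assumed inclusive)."""
    return any(lower <= value <= upper for lower, upper in ranges)


ranges = parse_ranges("100,200;800,1600")
iso_values = [50, 100, 200, 400, 800, 1600, 3200]
print([v for v in iso_values if in_any_range(v, ranges)])
```

All ranges feed one yes/no test, so the result is a single filtered set rather than one group per range.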

Because the numeric fields seem to get "grouped" alphabetically, it seems one essentially has to set up the ranges manually, one category at a time.


Mario

Quote from: jch2103 on May 31, 2014, 12:20:49 AM
That said, I'm having problems getting the numeric range grouping to work myself... :(
I have fixed a number of issues with numeric ranges for data-driven categories for the next build.

See #2414 for details.

jch2103

John

anmue

Hi,

First, thank you for all the ideas, especially Erik, who pointed me in the right direction. I ran into a few things when I tried Erik's approach:

1) After I entered a range (e.g.: 0-99) and then reopened the data-driven settings, the range in the dialog had changed to "0(null)319450352". What's that?

2) The ranges cannot have open endpoints. But sometimes it would be nice to express a range like "all less than 100". A possible new feature for IMatch would be to define this as ",100".

3) With Erik's approach I get every distinct ISO value as a subcategory of a given range. But I don't care much about how many pictures have ISO 73 or ISO 74 etc., so I expected no such subcategories. This point is a matter of taste, though.

4) While I was adding my ISO categories, IMatch suddenly froze and I had to use the Task Manager to close it. Unfortunately I don't remember exactly what I was doing when this occurred >:(. I have attached the log file here.

Kind regards

Andreas.

[attachment deleted by admin]

Mario

Quote from: anmue on June 01, 2014, 08:43:59 AM
1) After I entered a range (e.g.: 0-99) and then reopened the data-driven settings, the range in the dialog had changed to "0(null)319450352". What's that?

Please download the current version of IMatch, which includes a bug fix for this and for several other issues with the numeric ranges.

The log file shows a normal IMatch shutdown. It ends when IMatch is displaying the "Backup Reminder" dialog. To me this looks as if IMatch was shutting down, was presenting the Backup Reminder dialog to the user, and was then shut down hard via the Task Manager...?

anmue

Hi Mario,

The issue I mentioned under 1) is fixed with version 160.

As I said, I wasn't paying full attention when IMatch froze, so I can't say why. Maybe the problem was in front of the screen ;-).

Regards,

Andreas.

Erik

Quote from: anmue on June 01, 2014, 08:43:59 AM
Hi,

2) The ranges cannot have open endpoints. But sometimes it would be nice to express a range like "all less than 100". A possible new feature for IMatch would be to define this as ",100".

3) With Erik's approach I get every distinct ISO value as a subcategory of a given range. But I don't care much about how many pictures have ISO 73 or ISO 74 etc., so I expected no such subcategories. This point is a matter of taste, though.


From my own past experience: my "named" category would say "less than 100", but the actual range would capture 0-99 (or whatever you put in there). For the upper range, I would just put a really big number at the top of the range and name the category as you wish.

As for the individual value categories, that's just how Data Driven Categories work. However, if you read through the help file and the options, you can use some of the other replace and filter methods to unify some values. For instance, if you have files with ISO 73 and 74 as you mention, you could replace every 73 with 74, and then they'd all get filed under 74 (I think).

I've done something like that for Data Driven Categories I use to identify lens and camera, since firmware updates and various software have tweaked the names of lenses over time even when the same lens was used.

The Data Driven Categories as they are now are perhaps my favorite feature of IMatch.  They kept me with the program in IM3.6 and they are about as perfect as they can be now.

anmue

Hi Erik,

thanks for your hints. I found that it is not possible to use a regex filter with a numeric data type. If I force the data type to be "Text" then I can use the filter, but then I have to express the range as a regular expression. Hence ranges only work with numeric data types ;-)
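To show what "expressing a range as a regular expression" means for text values, here is a Python sketch (the patterns, labels and function name are only examples; the regex dialect and matching rules in IMatch may differ):

```python
import re

# Hypothetical patterns: each one matches the digit strings of one ISO range.
ISO_PATTERNS = {
    "< 100": re.compile(r"\d{1,2}"),        # 0 to 99
    "100 - 199": re.compile(r"1\d{2}"),     # 100 to 199
    "200 - 399": re.compile(r"[23]\d{2}"),  # 200 to 399
}


def bucket_for(text):
    """Return the label of the first pattern matching the whole string."""
    for label, pattern in ISO_PATTERNS.items():
        if pattern.fullmatch(text):
            return label
    return None  # no pattern covers this value


for iso in ["73", "100", "399", "1600"]:
    print(iso, "->", bucket_for(iso))
```

Every range needs its own hand-written pattern, and a boundary like 250 instead of 399 already makes the pattern noticeably more awkward, which is why numeric ranges are the easier tool here.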

Finally I managed to have all the different ISO values of a range replaced with a single value. But a slight glitch remains: now each range has a single subcategory containing this representative value, i.e.:

ISO
+--- < 100
            +---- 99
+--- 100 - 199
            +---- 199
+--- 200 - 399
            +---- 399
...

instead of

ISO
+--- < 100
+--- 100 - 199
+--- 200 - 399

BTW: I think ranges without an endpoint have a subtle advantage over using a very big or very small value as a pseudo endpoint: I do not have to care about the range of values a numeric image property might have, now or in the future. In computer programming, relying on a value that you think will never ;-) be reached is always a nice source of errors.
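As a sketch of the idea in Python (the ",100" syntax is only my proposal, not something IMatch supports; the function names are hypothetical, and treating the lower bound as inclusive and the upper bound as exclusive is an assumption):

```python
import math


def parse_open_range(spec):
    """Parse "lower,upper" where either side may be left empty.

    An empty lower bound becomes -infinity, an empty upper bound +infinity.
    """
    lower_text, upper_text = (s.strip() for s in spec.split(","))
    lower = float(lower_text) if lower_text else -math.inf
    upper = float(upper_text) if upper_text else math.inf
    return lower, upper


def contains(bounds, value):
    """True if lower <= value < upper."""
    lower, upper = bounds
    return lower <= value < upper


below_100 = parse_open_range(",100")
from_3200 = parse_open_range("3200,")
print(contains(below_100, 73), contains(below_100, 100))
print(contains(from_3200, 102400))
```

With an infinite bound there is no magic number that a future camera could exceed.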

Regards Andreas

Erik

I don't think you'll be able to remove the subcategory, but you don't really have to expand it either, right? If you don't expand the tree, it will look like what you are after. On an earlier beta, I had actually set up a tree like yours, but just chose not to care about the leaves (I didn't even try the replace option).

I'm not sure about the regular expression stuff. I know that to a regex, digits are just characters. I've seen regexes made to work well with numbers; it just requires a bit more effort. As you've found, I don't think you'd really need them. I've never really done much with them myself.

Lastly, as for the endpoints: in my own programming work in engineering, we find endpoints to be nice, but we always have "else" branches to capture errors when a value is out of range.

With ISO, I think the key would just be to set up a formula category to capture files that are not under the data driven category (and perhaps color code it, so when something ends up there, you'll know it).

Something like:  "@All" NOT "ISO"  (You might have to experiment to make sure it works properly).
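The same "else" idea as a Python sketch (the file names, labels and ranges are invented for the example, and this is not how IMatch evaluates formula categories): every file lands either in a range bucket or in a catch-all list.

```python
def split_by_ranges(files, ranges):
    """Sort files into range buckets and collect everything else.

    files:  dict mapping file name -> ISO value
    ranges: dict mapping label -> (lower, upper), endpoints inclusive
    """
    buckets = {label: [] for label in ranges}
    unmatched = []
    for name, iso in files.items():
        for label, (lower, upper) in ranges.items():
            if lower <= iso <= upper:
                buckets[label].append(name)
                break
        else:
            # No range matched: this is the "@All" NOT "ISO" bucket.
            unmatched.append(name)
    return buckets, unmatched


files = {"a.jpg": 50, "b.jpg": 150, "c.jpg": 6400}
ranges = {"100 - 199": (100, 199), "200 - 399": (200, 399)}
buckets, unmatched = split_by_ranges(files, ranges)
print(buckets)
print(unmatched)
```

If the catch-all list is ever non-empty, a range is missing or too narrow.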