Category filter

Started by rolandgifford, May 12, 2024, 02:29:15 PM

rolandgifford

Is it possible to have a category filter which persists after shutdown and is active the next time that IMatch starts? I can't find this in the help so assume it doesn't exist, but worth asking.


BanjoTom

Maybe set up a category bookmark?  The bookmark will persist, making it easy to quickly return to the category you want to display...
— Tom, in Lexington, Kentucky, USA

rolandgifford

Quote from: BanjoTom on May 12, 2024, 03:43:12 PM
Maybe set up a category bookmark?  The bookmark will persist, making it easy to quickly return to the category you want to display...

Quickly returning isn't the problem; I am returned to the correct category if I don't use a filter. I want to hide the very many categories I'm not interested in to avoid IMatch recalculating/rebuilding and the like.

Mario

You mean for the Filter Panel? Or the Category View?
Both reset automatically when the database is closed.

Quote:
I want to hide the very many categories I'm not interested in to avoid IMatch recalculating/rebuilding and the like
IMatch calculates categories on-demand, when they are needed.
If a category is not visible, it is not recalculated in most scenarios.

If you have very many top-level categories, try to organize them under a new top-level parent. This allows you to close the parent to hide the categories to avoid calculation.

If you have many categories expanded, close them.

If your @Keywords category is a very long list of keywords, enable the feature to automatically group them by the first or several characters (properties of @Keywords). This allows you to hide most keywords until you need them.
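
As a rough illustration of what grouping by leading characters does to a long flat keyword list (a sketch only, not IMatch's implementation; the function name `group_keywords` and the `prefix_len` parameter are made up for this example):

```python
from collections import defaultdict


def group_keywords(keywords, prefix_len=1):
    """Bucket keywords under a parent named after their first characters."""
    groups = defaultdict(list)
    # Sort case-insensitively so each bucket is listed alphabetically.
    for kw in sorted(keywords, key=str.casefold):
        groups[kw[:prefix_len].upper()].append(kw)
    return dict(groups)


keywords = ["Abbey", "Alps", "Bridge", "Beach", "castle"]
grouped = group_keywords(keywords)
for parent, children in grouped.items():
    print(parent, children)
```

With 20,000 keywords, only the handful of parent buckets need to be shown until one of them is expanded.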

Categories are updated in the background when needed - until they are needed "right now".
Unless your database is massive or you use many data-driven categories or complex formula-based categories, category updates should not interfere with the IMatch user interface or cause "locks" longer than a second.

Give me some details about your database (e.g. a log file) and some info about the situations where you encounter issues caused by categories. Category View? Category Panel?

rolandgifford

Quote from: Mario on May 12, 2024, 05:43:06 PM
You mean for the Filter Panel? Or the Category View?
Both reset automatically when the database is closed.

Startup debug log attached

I'm referring to the Category view and the Category Filter panel.

I have a permanently active Filter Panel which hides Buddy Files and Rejected Files. I sometimes expand that filtering to help with whatever I'm doing. I'm aware that this slows loading images but it is worth the delay.

The only data driven and formula based categories are those supplied with IMatch. I haven't added or deleted any. All of the day-to-day categories I use are under @Keywords. I have about 20,000 Keywords, all branches are collapsed apart from the "Waiting Sorting" branch that I use when updating and that has about 50 sub-keywords, 118,000 images currently in that branch.

My workflow is to work through the images in one of the sub-keywords of "Waiting Sorting" rejecting some images and deleting the current Keyword from the ones I may want to keep. I therefore see only the images I haven't looked at yet. At the end of this pass I add all remaining images to the keyword and go through again with the next pass.

I frequently delete rejected files and write back metadata, roughly once for every 20 images I keep. It is the write-back phase which causes some irritation, as the process takes longer than I would like. I assume that recalculating keyword counts is part of what I'm waiting for, as the categories list moves (jumps about). I have had bad experiences of clicking something which moved between the decision to click and the actual click, so I wait until Task Manager says that IMatch is idle.

I could of course update less frequently, but it appears that drastically reducing the number of categories shown, by using a category filter, speeds things up. I can enter the filter manually every time I start IMatch, but I will want it every single time until I've finished this trip, about 2 weeks. Having the setting "stick", as searches appear to do based on some recent forum issues, would help me.

Mario

IMatch logs a "long transaction" warning for the Collection Filter in the Filter Panel, which took a whopping 27328ms. That's bad, because it also delays other operations in IMatch which require the database.

This seems to be caused by the File Properties filter, which requires "expensive" collections to recalculate.

Can you show me which File Attribute filters you have enabled, and which other filters in the Filter Panel?
Which scope (File Window contents) was active when you made this test? Did you show a folder, category, all database....?

Your database has about 230,000 files, which is a good mid-sized database. Updating the filter panel should not stall everything for almost 30 seconds!

rolandgifford

Image of filter panel attached. I only hide Buddy and Rejected files. Nothing is selected in File Properties which isn't active anyway as you can see.

The scope selected for this log file is a single Keyword in the categories panel. Currently that Keyword has 7021 images associated, 3498 visible after filtering. There would have been perhaps 600/300 more when I did the test. The files are primarily JPG/ARW pairs with a small number of MP4 files.

Mario

The default "Hide Buddy Files" and "Hide Rejected Files" stored filters use the "File Properties" filter.
Since you have selected two stored filters, the filter panel runs the first filter, then runs the second filter and then combines the result.

If you always keep these filters on, you should see a performance improvement by creating a filter that hides buddy files and rejected files, storing it, and then activating this stored filter instead of these two.
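
To make the difference concrete, here is a minimal sketch of the two approaches. It is not IMatch code: the `FileRecord` type, the function names and the assumption that "combining" means intersecting the two result sets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    name: str
    is_buddy: bool
    is_rejected: bool


def run_filter(files, predicate):
    """One full pass over the scope; returns the set of files that pass."""
    return {f for f in files if predicate(f)}


def two_stored_filters(files):
    # Two separate passes, then the results are combined (assumed: intersection).
    not_buddy = run_filter(files, lambda f: not f.is_buddy)
    not_rejected = run_filter(files, lambda f: not f.is_rejected)
    return not_buddy & not_rejected


def one_combined_filter(files):
    # A single pass that checks both conditions at once.
    return run_filter(files, lambda f: not f.is_buddy and not f.is_rejected)


files = [
    FileRecord("a.jpg", is_buddy=False, is_rejected=False),
    FileRecord("a.arw", is_buddy=True, is_rejected=False),
    FileRecord("b.jpg", is_buddy=False, is_rejected=True),
]
print(sorted(f.name for f in one_combined_filter(files)))
```

Both functions return the same files; the combined version simply avoids the second pass and the merge step.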

rolandgifford

Quote from: Mario on May 13, 2024, 10:59:55 AM
If you always keep these filters on, you should see a performance improvement by creating a filter that hides buddy files and rejected files, storing it, and then activating this stored filter instead of these two.

Where in the help does it tell me how to do this?

Mario

It's in the Filter Panel help in the section labelled Stored Filters
Open File Properties Filter
Tick Hide Rejected Files
Tick  Hide Buddy Files
Store under a name of your liking.

rolandgifford

No improvement from combining the two into a new Stored Filter

I have run some tests using a stopwatch, rather than looking at log entries, for more accurate timings. Startup with the two filters (either the two standard stored filters or the new combined one) takes about 53 seconds. Hide Buddy on its own takes 25 seconds and Hide Rejected on its own takes 53 seconds. There are currently no rejected images to hide.

It is filtering just less than 7000 images.

Mario

I think the problem is the buddy files filter, which is very expensive.
Buddy files need not necessarily be in the database (XMP files, config files of applications etc.).
In order to find buddy files for a file, IMatch must access the file system, applying the buddy files rules. Assuming that of your 7,000 files only 1,000 match the buddy master, this means that IMatch has to perform 1,000 file system scans to look for buddy files. And that's expensive (aka slow).
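
A simplified sketch of why this costs one file system scan per master file. This is not IMatch's buddy file logic: the rule used here (same file name stem, extension in `BUDDY_EXTENSIONS`) and both function names are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical rule: a buddy shares the master's stem and has one of these extensions.
BUDDY_EXTENSIONS = {".arw", ".xmp"}


def find_buddies(master: Path) -> list[Path]:
    """Scan the master's folder on disk for files matching the buddy rule."""
    return sorted(
        p
        for p in master.parent.iterdir()
        if p != master
        and p.stem == master.stem
        and p.suffix.lower() in BUDDY_EXTENSIONS
    )


def find_all_buddies(masters):
    # One directory scan per master: 1,000 masters means 1,000 scans.
    return {m: find_buddies(m) for m in masters}
```

Calling `find_all_buddies` with the master files of a scope touches the disk once per master, which is the cost described above.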

Try without the buddy files filter and see if this makes a difference.

rolandgifford

Quote from: Mario on May 13, 2024, 12:59:21 PM
Try without the buddy files filter and see if this makes a difference.

Turning off the Buddy filter makes no difference, it is the Rejected filter which is slow (timings in my previous post)

I've attached three debug log files, Hide Buddy on its own, Hide Rejected on its own and Hide Both

Mario

28 seconds for the sample with the reject. That's very slow :-(

I've made a quick test with my largest database with almost one million files.
After loading the database, I selected a folder with 9,800 files.

I then enabled the "Hide buddies" and "Hide Rejects" filters in the File Properties filter and then enabled the filter to run it.

The time for CIMQueryAttr::Run is about 3 seconds. Not 28 seconds, as in your case. With more files in the scope and a database 4 times larger than yours.
 
Additional runs reduce runtime to about 1.5 seconds due to caching effects in the database and file system.

Your system reports 8 processor cores and 8 GB of RAM. 4.5 GB are available when IMatch starts.
This makes me think that your computer is older?
Is the C: disk containing the database a SSD or a spinning disk?

rolandgifford

Quote from: Mario on May 13, 2024, 02:19:14 PM
Your system reports 8 processor cores and 8 GB of RAM. 4.5 GB are available when IMatch starts.
This makes me think that your computer is older?
Is the C: disk containing the database a SSD or a spinning disk?

It isn't especially old. BIOS date 2022
Processor   Intel(R) Core(TM) i3-10105 CPU @ 3.70GHz, 3696 Mhz, 4 Core(s), 8 Logical Processor(s)

Database and images all held on SSDs. Virus filter exclusions as suggested

If I change the scope to the entire Keywords tree (228,000 images) it doesn't greatly affect the startup time where Hide Rejected is selected on its own. Debug log attached

Mario

Then maybe it's just the situation when starting up the database. Dozens of parallel tasks are running, each trying to load data from the database, competing for database and disk resources.

rolandgifford

Quote from: Mario on May 13, 2024, 03:39:37 PM
Then maybe it's just the situation when starting up the database. Dozens of parallel tasks are running, each trying to load data from the database, competing for database and disk resources.

It isn't that because the start-up delay only happens when the Hide Rejected filter is active and I experience these delays every time that I Delete Rejected/Write-back Metadata while culling.

Having an active Category filter so that I only see the Keyword I'm using appears to reduce the update/filter time (but I haven't done any timings, just an impression) in that circumstance so I was hoping that it would also improve start-up time, hence the question at the start of this thread.

I'm happy to accept that you can't reproduce this and therefore can't fix it. I'll continue to live with the irritant, it isn't any sort of show stopper.

Mario

Updating the rating collection is not special in any way.
Basically IMatch asks the database to group files based on the rating tag. Even for 230K files this is fast, unless the collection has to wait for something else to finish.
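
For readers unfamiliar with this kind of query, a generic sketch of grouping files by rating is shown here. The `files` table, its columns and the use of -1 for rejected files are assumptions for illustration and say nothing about the real IMatch database schema.

```python
import sqlite3

# In-memory stand-in database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT, rating INTEGER)")
conn.executemany(
    "INSERT INTO files (name, rating) VALUES (?, ?)",
    [("a.jpg", 3), ("b.jpg", -1), ("c.jpg", 3), ("d.jpg", 0)],
)

# One grouped query returns the file count per rating value.
counts = dict(conn.execute("SELECT rating, COUNT(*) FROM files GROUP BY rating"))
print(counts)
```

A single grouped query like this scales well, which fits the point that the rating collection itself should not be the slow part.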

I can only base my analysis on what I see in the log file and then try to reproduce this here. Which I could not so far, even with my largest database. I will try a few more things, but this is all I can say so far.

rolandgifford

Quote from: Mario on May 13, 2024, 04:47:32 PM
I can only base my analysis on what I see in the log file and then try to reproduce this here. Which I could not so far, even with my largest database. I will try a few more things, but this is all I can say so far.

I have written software for most of my working life so am very aware of the "can't reproduce/can't fix" pair and I'm sort of happy that this particular issue falls into that bucket.

My guess would be some sort of operating system call (network?) which returns immediately on your system but has to time out on mine. But I can't imagine any reason why filtering out Rejected Files would want to do that.

Mario

Filtering for rejects is database-only. Only buddy files require file system operations.

I've tried to reproduce this again with a normal 150K database, my own 400K database and my largest test database with 1000K files, on both my PC and my laptop.

For the largest database, the execution time for CIMQueryAttr::Run  was 1350 ms (laptop only 1100 ms because of super-fast SSD) when starting the one million files database and applying the two filters in the Filter Panel for a folder containing almost 10,000 files.

So this is not what's causing the long delay during startup you get on your system.
Try to hide or close (save your workspace before) all "expensive" panels except the Filter Panel: Category Panel, File Categories Panel (I've had only the Metadata Panel and Keywords Panel visible in my test). If the Category and/or Collection filters are visible in the filter panel, close them so they are not loaded initially.

Maybe this makes things faster. No idea, really. I don't see anything else running when the Filter Panel runs the two file properties filters in the background, and the update of the rejected collection is the only thing that's so slow. The file system will have none of the database cached yet, and the database system will have to read every page "cold" from the SSD.

Run a database Compact & Optimize (Database menu > Tools).

Double check virus checker exclusion for the "C:\IMatch Database\" folder and 'C:\Program Files\photools.com\imatch6\IMatch2023x64.exe' executable.

With such a rare case and no repro here, I did what I could. Moving on to the next user issue.

rolandgifford

I have a virus exception for C:\ProgramData\photools.com\imatch6\config which, given my hatred for folder exclusions due to the risk, is not the sort of thing I would have added unprompted. Should I remove this exclusion? I have exiftool.exe listed as an exclusion as well.

I hide and close already as you suggest with the possible exception that I have the "Waiting Sorting" branch of my Keywords expanded which requires counting 100,000 images over 50 Keywords. If I use a Category filter and close-down, the categories tree is fully expanded on start up requiring 20,000 category totals to be calculated, one of the reasons I was asking it to stick through shutdown/start-up.

Changing the scope on my database from 6000 images to 200000 images doesn't make a significant difference to start-up time so I expect that there is some fixed length/timeout delay involved. I also expect that it is unique to my system and it will hopefully go away following a Windows Update or whatever. It is certainly not an issue I would want you to spend any more time on.

I Compact & Optimize regularly


Mario

C:\ProgramData\photools.com\imatch6\config  is the folder that contains configuration files and the settings database. Unless your AV blocks IMatch's access to the folder, you don't need that exclusion. It is also irrelevant for performance.


Collapse @Keywords in the Category View.
Close the Categories Panel.
Close the Filter Panel.
(Save your workspace before for convenience)
Switch to the Media and Folders View. 
Close and reopen IMatch.
Any difference?
Updating @Keywords from 250,000 files or even only for 100,000 files for 50 keywords can be a drag.

rolandgifford

Closing the Filter Panel makes start-up very fast as there is no filter

Watching IMatch start (with a filter active) I notice that the collapsed Collections bar in the filter panel flashes a couple of times just before everything is finished.

In the Media & Folders panel the icons for most folders are dull rather than bright. There isn't an obvious pattern to which are dull and which are bright. What does this mean?

Mario

Quote:
notice that the collapsed Collections bar in the filter panel
It will receive notifications about modified collections as long as it's loaded.
What happens when you close (not only collapse, remove from Filter Panel) the Collections panel and do the startup routine. Faster? I don't think so, but while chasing this, who knows?

Quote:
In the Media & Folders panel the icons for most folders are dull rather than bright. There isn't an obvious pattern to which are dull and which are bright. What does this mean?
No idea what you mean. Show us a screen shot. Do you have experimental mode enabled and thus use the new icons?