Database Node Vs. @All

Started by Darius1968, March 02, 2020, 09:13:28 PM

Darius1968

I've attached a log file that represents single-term searches (Frequently Used Tags, Advanced Search, Ignored Diacritics), which were performed on two scopes of files in the File Window (the Database Node of Media & Folders, and the @All Category).  I'm hoping the log can help identify why the searches take roughly 30 sec. in Media & Folders, while the exact same searches in the Categories Tab (@All) take just a few seconds at most.

As you will see, in the case of searching under @All, I added at the end, a few more search terms to show that the computational time remained constant, given 'new' information to process. 

What follows are the search terms (numbers in parentheses are the number of files found) for each case (Database Node, @All Category):
Database Node Search Terms:   
Fish   (399)
Paris   (114)
Miami   (726)
Denver   (2)
Frankfurt   (2)
Toronto   (85)
   
@All Node Search Terms:     
Fish   (399)
Paris   (114)
Miami   (726)
Denver   (2)
Frankfurt   (2)
Toronto   (85)
Montreal   (7)
Calcutta   (0)
Minnesota   (12)
Philadelphia   (12)
Champagne   (10)

Mario

[7531ms] CMDQuery::Run fish (cold, query cache for current scope not loaded)
[1828ms] fish (with query cache)
[1907ms] CMDQuery::Run paris
[1937ms] CMDQuery::Run miami
[1953ms] CMDQuery::Run denver
...

1,178,548 data elements to search, 32 tags
405,010 files in scope (!)

IMatch has to search about 1.2 million data fields to find your search term.
This takes 7.5 seconds when the high-speed IMatch query cache has not been filled yet. Otherwise it takes about 1.9 seconds.

You are working with a scope of 400,000 files!
This means that the File Window has to process 400K files, remove files not matching the search result, deal with versions, file relations, stacks etc. and then load the final result into the File Window.
This is what takes so long for 400,000 files, not the actual search: about 30 seconds for a scope of 400,000 files.
Your File Window layout, file relation setup, maybe activated filters etc. also affect this.
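
To make the comparison above concrete, here is a minimal sketch that pulls the millisecond timings out of the quoted log lines and relates the search time to the 30-second wait reported in this thread. It assumes log lines shaped like `[123ms] text`, as in the excerpt above; the actual IMatch log file may be laid out differently, and the function names `parse_timings` and `search_share` are made up for this illustration.

```python
import re

# Log excerpt copied from the post above; the real log file may use a different layout.
LOG_EXCERPT = """\
[7531ms] CMDQuery::Run fish (cold, query cache for current scope not loaded)
[1828ms] fish (with query cache)
[1907ms] CMDQuery::Run paris
[1937ms] CMDQuery::Run miami
[1953ms] CMDQuery::Run denver
"""

# Assumed line shape: "[<digits>ms] <description>"
LINE_RE = re.compile(r"^\[(\d+)ms\]\s+(.*)$")


def parse_timings(text):
    """Return (milliseconds, description) for each line shaped like '[123ms] text'."""
    timings = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            timings.append((int(match.group(1)), match.group(2)))
    return timings


def search_share(search_ms, total_seconds):
    """Fraction of the observed wall-clock wait that the search itself accounts for."""
    return search_ms / (total_seconds * 1000)


timings = parse_timings(LOG_EXCERPT)
# Treat every line not marked "cold" as a search served from the query cache.
warm = [ms for ms, text in timings if "cold" not in text]
avg_warm_ms = sum(warm) / len(warm)

print(f"average warm search: {avg_warm_ms:.0f} ms")
print(f"share of a 30 s wait: {search_share(avg_warm_ms, 30):.1%}")
```

With these numbers the cached search accounts for only a small fraction of the 30 seconds, which matches the point that the remaining time is spent loading the result into the File Window.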

Darius1968

Okay, but what I still don't understand is the difference between the Database Node and the @All Node.  It's the same exact files that are considered in both cases, but the time to perform the search is shorter when the files are loaded into the file window via @All, and that's what I don't understand and would like clarified. 

sinus

Quote from: Darius1968 on March 03, 2020, 09:07:17 AM
Okay, but what I still don't understand is the difference between the Database Node and the @All Node.  It's the same exact files that are considered in both cases, but the time to perform the search is shorter when the files are loaded into the file window via @All, and that's what I don't understand and would like clarified.

Here the time is equal for searching.
For correct results you must use the same File Window layout, file relations, stack state (open or not) and maybe other things.

AND after every search you should close and reopen IMatch (because of the cache).

I tested searching quite some time ago and found that the times are equal, but it was not easy to search under exactly the same conditions.
Best wishes from Switzerland! :-)
Markus

Mario

I can see no timing difference for searching in the log file!
It's either 8 seconds (cache needs to be filled) or 2 seconds for each search.

When IMatch takes much longer to load the File Window in the Media & Folders View, check your settings.
For example, try to set the "Don't Group by Folder" option (File Window toolbar, Hierarchical View menu).
Your database has 10,665 folders, and when you select the Database node while grouping by folder is active, IMatch has to process the files in each folder individually, not all files in the database at once like @All does.
This does not affect the search, only how long the File Window needs to arrange and lay out the files.
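
The difference between handling a scope as one flat list and handling it folder by folder can be pictured with a toy model. This is only an illustration of the idea, not how IMatch is implemented; the paths and the function names `flat_view` and `grouped_view` are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def flat_view(paths):
    """One sorted list for the whole scope (the '@All'-style case in this toy model)."""
    return sorted(paths)


def grouped_view(paths):
    """One sorted list per folder (the 'group by folder' case in this toy model)."""
    groups = defaultdict(list)
    for path in paths:
        # The parent directory of each file decides which group it lands in.
        groups[str(PurePosixPath(path).parent)].append(path)
    # Every folder becomes its own unit that has to be sorted and laid out separately.
    return {folder: sorted(files) for folder, files in sorted(groups.items())}


# Hypothetical sample paths, not taken from the database discussed in this thread.
paths = ["/photos/2019/b.jpg", "/photos/2020/a.jpg", "/photos/2019/a.jpg"]

print(flat_view(paths))
print(len(grouped_view(paths)), "groups")
```

In the grouped case the number of units to arrange grows with the number of folders, so with more than ten thousand folders the per-group overhead can add up even when the set of files is identical.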

Darius1968

Okay, Markus & Mario:

Markus, I can say that the File Window Layout I use is the same for both cases, Media & Folders and Categories.  Mario, I have indeed activated the option to not group by folders when I'm in Media & Folders.

Given the ground rules I've laid out above, I've just verified that if I choose @All (Categories), the File Window pauses immediately (as dictated by "Pause if more than 10,000 files"), and when I dismiss the pause, it takes just 9 sec. to render all 400,000 files.  If, on the other hand, I choose the Database Node (Media & Folders), the File Window's Pause button kicks in only after roughly 30 sec.  Then, when I dismiss it, rendering all 400,000 files takes about another 35 sec.!  But, as I've already said, "Don't group by folders" is in effect.  So, any other ideas about what could be causing this?  I'm attaching another log file here for your evaluation.  Thanks!


Mario

No further ideas at this time.
I recommend not processing 400,000 files all the time or forcing the File Window to load that many files.
Or just use @All if that's faster on your machine.

Please understand that I'm currently much more interested in fixing IMatch 2020 bug reports and feature requests piling up than analyzing performance issues for one user who works with scopes of 400,000 files (which is more than many 'DAM' systems out there can even handle). We may revisit this in a couple of weeks, unless it solves itself.

Darius1968

Although my database has that many images, I seldom load that wide of a scope in my file window.  I was just 'testing', comparing performance to other benchmarks that I've seen on this forum. 

Mario

I can analyze only the log files you have included.
And there you have used 400,000 files in the scope. And the File Window takes 30 seconds to load all these files.
Search time is < 2 seconds for 1.2 million searched metadata tag values.