Updating Index ... must it be automatic?

Started by Gerd, September 11, 2013, 01:03:57 PM

Gerd

... or is there a setting for manual start of this?

What happens? I tried to delete some unnecessary categories. After deleting the first one: "Updating Index ... 1h 25 min" ... I was not sure whether this had to do with leftover activity from importing categories or with deleting the category ... so I waited 2h ...

Now I marked some pics in a category to check where they are assigned ... I decided to delete the category because I did not use it any longer. But I forgot that the white-marked item was not the category itself and pressed <Del>  ???  ... Oops, yes, then I remembered ... all pics (roughly 12,000) are marked as deleted ... normally no problem to unmark them ... but again: "Updating Index ... 1h 25 min" ...

In my Excel sheets I sometimes have the same thing with formulas. Sometimes I have to use special formulas for 450,000 cells. But in Excel I can set the calculation to "manual". So I can quickly build, insert and copy my formulas, and at the end I press the button to recalculate and go to the coffee corner ...

Is such a possibility conceivable in IM5?

An interesting side effect: at the moment when IM5 showed me that updating the index would take another 1h, I closed IM5 and started it again. After this new start, updating the index needed only 7 min ... Is there an explanation for this?


Regards
Gerd
_______
Regards
Gerd

Mario

Updating the index is normally an operation which runs in the background. And the index monitors the IMatch user interface and pauses when the user is busy doing other things. Does it block the user interface with a dialog box or why did you have to wait?

A typical index update after changing some files (e.g. a rating or label update) takes about 0.1 seconds...
How long does it take on your machine?

Many features in IMatch depend on the search engine index, e.g. filters, some data-driven categories, the search bar in the file window, the search features in the Metadata and Keyword panel etc.
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

Gerd

Hi Mario,

it blocks ... for several seconds up to one minute IM5 is totally busy and blocked (white windows, IM5 shows no reaction), and other actions in Windows are delayed as well.

Making all changes first (in a temporary space) and then starting an "Update to database" would be very helpful ... maybe with an internal marker for these changes that reminds me, when leaving the active window, closing IM5 or after a new start, that there is a pending update waiting ...

Maybe as a selectable function in the preferences?
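
For readers wondering what such a "manual update" mode could look like in principle, here is a minimal Python sketch. It is not how IMatch works internally; the class name, the marker file and the `manual` switch are all hypothetical and only illustrate the idea of collecting changes first and applying them in one pass later, with a marker that survives a restart.

```python
import json
from pathlib import Path


class DeferredIndex:
    """Toy search index that can defer updates until the user asks for them."""

    def __init__(self, marker_path, manual=True):
        self.marker_path = Path(marker_path)  # hypothetical "pending update" marker file
        self.manual = manual                  # hypothetical preference: manual vs. automatic
        self.index = {}                       # file id -> indexed text
        self.pending = {}                     # file id -> new text (None means "remove")
        if self.marker_path.exists():         # pick up changes left over from the last session
            self.pending = json.loads(self.marker_path.read_text())

    def change(self, file_id, text):
        """Record a change; in automatic mode the index is updated right away."""
        self.pending[str(file_id)] = text
        if self.manual:
            self.marker_path.write_text(json.dumps(self.pending))
        else:
            self.update_now()

    def has_pending(self):
        """True if there are changes the index does not know about yet."""
        return bool(self.pending)

    def update_now(self):
        """Apply all pending changes in one pass and clear the marker."""
        for file_id, text in self.pending.items():
            if text is None:
                self.index.pop(file_id, None)
            else:
                self.index[file_id] = text
        applied = len(self.pending)
        self.pending = {}
        if self.marker_path.exists():
            self.marker_path.unlink()
        return applied
```

In this sketch, `has_pending()` is what a program could check when it is closed or restarted in order to show the reminder, and `update_now()` plays the role of the Excel "recalculate" button.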

Regards
Gerd

Mario

Application logfile, please?

I need to see what IMatch is doing and what takes how long.
I assume this is again your 130,000 files database?

You know that we are working with a Beta version and that performance tuning is an ongoing task...
If Windows also feels "slow", the hard disk is the problem. It cannot read/write the data fast enough. I assume you are using a local hard disk for your database, not a NAS or remote server or external disk?

Richard

Hi Gerd,

I recently bought a new computer but I also bought a 32 GB USB-3 flash drive to put my database on. The difference in speed between my old and new computers is hard to believe. My guess is that investing in a 32 GB USB-3 flash drive would do wonders for you. Mine is a Patriot and has very good specifications. It was not easy to find a site that included specifications but I managed. While all flash drives are going to be solid state, some are not much faster than a hard drive so be careful.

Gerd

Hi Mario,

here it comes, attached! My last actions were in Rating - Reject: mark all files (roughly 300) and delete them. IM5 stopped at the last 5 with a message "Do you want to delete read only" and I had to confirm ... IM5 stopped at the last one; when I clicked, I got a beep ... nothing was possible ... I saw a disk symbol with an exclamation mark, but could do nothing ... so I killed the process and restarted IM. The log file was saved from that point. After the restart, no message from IM, all seems to work OK.
The last pic was still visible in the Rating - Reject view with this disk symbol, but now, when I pointed the cursor at it, I got the info that this file is off-line and that I should rescan the folder.
In the folder view there was nothing to see, so I rescanned the folder and the pic is gone from the Rating - Reject view.

I have now done a diagnostic run and got a warning; the entry in the log file means nothing to me. I have also attached this log file from after the diagnostics.

Regards
Gerd

[attachment deleted by admin]

Gerd

Hi Richard,

at the moment I have no money to invest in new hardware ... I must use what I have ...

Regards
Gerd

Richard

After investing in a new computer I too am short of money but I should have mentioned that good flash drives are just over a dollar per GB. Cheaper than a carton of cigarettes and easier to justify.  ;D

Gerd

Hi Richard,

... a dollar per GB ... hmm, I need 2 TB, that is 2,000 x 1 GB = 2,000 $  :o

and I have never heard of such big flash drives ...

Regards
Gerd

Gerd

Hi Mario,

after a second kill and restart the diagnostics show no errors or warnings. I found roughly 30,000 pics with pending metadata; I selected them all and flagged them with a yellow flag,
now IM is busy ... in ProcessLasso I can see that IM takes only 5 to 10% CPU time, less than when updating the index, where IM uses 25%.
And only IM is busy, no ExifTool ... I have attached what IM shows me ... I know, I have to wait ...

Regards
Gerd

[attachment deleted by admin]

jch2103

Storage prices keep coming down. I don't know if you're running a laptop or a desktop. If a desktop, a 2TB internal drive can be had on Amazon for about US$90. I recently got a USB3 card for my 5-year old desktop for about $20 and a 2TB portable drive that's now about $120. Much, much faster than USB2.
If you're running a laptop, your options are more limited unless it came with USB3. If you don't mind popping the case, replacing the internal disk with a new hard disk or SSD could be a (more expensive) option.
John

cytochrome

I agree with Gerd: something has to be done to render IMatch usable...

I have a lot of fun testing it, it is witty, intelligent, full of possibilities, but at this time I cannot use it to catalog my photos. With a low to medium size databank (under 50000 files), it is almost always active and rarely hands back control.

One hour ago I added 8 NEF and 5 jpg, innocently really, no provocation, no bad word, all polite, did not ask for anything. Et voila, it is still doing its thing (whatever it is). Just now it says Updating index 50%....

When this is over, I bet it will start something else as soon as I click on some files...

I think it is the background activity that goes wild, out of control. It decides at any time that now it has to do this or that, and it becomes almost impossible to get some work done. Just now I got the message that all is done, but the CPU is still at 80-100%.

It is frustrating. I am at the root of the D80 files, small folder with just 2 files, and still it does something. Well it is calming down.

What would be really nice is a button to switch off all background housekeeping activity so one can assign categories and fill in some metadata tags. And then, while I sleep, let IM do its thing.

I foresee that whatever the price of iM5, it will cost me some real money!! New computer, i7 (at least !!), solid state drives, much more memory (only 4gb now), second monitor to deploy all the nice panels.

Sorry for the rant, but some days it is unnerving: an afternoon to update the data in 2 folders..

Francis


Gerd

Hi,

IM5 is back to life! I started around 18:30 with adding the yellow flag and now, at 19:17, it is finished! I had hoped that my 33,215 pending metadata would also have been written, but that was only a hope ... still pending ...
I'm still wondering why there is pending metadata for these pics ... I have not changed metadata ...
So I will now take on the challenge of writing back all pending metadata ... just started! At the moment IM5 is busy and not usable; it is not a background process with 2 to 16% CPU time ...
The icon in the taskbar shows me how far the process is, and it looks like it will take some time ... Oops, it's done, but no changes are written   ...

log file attached!

Maybe it would be better to create a problem post?

Regards
Gerd



[attachment deleted by admin]

Mario

I have not yet had time to analyze your log files (working at the bottom of the bug report list right now).

But this much for now:

If by 'yellow flag' you mean the yellow triangle: it indicates off-line files, i.e. files which are in the database but not found on disk.

Quote: I'm still wondering why there is pending metadata for these pics ... I have not changed metadata ...

If you make IMatch map hierarchical keywords to flat keywords on import (Edit > Preferences > Metadata) this can cause new keywords to be created and result in write-backs. Or maybe you run a metadata template on import?

Tip: When you point the mouse cursor at the pen, IMatch will tell you the metadata tags which have been updated.



Gerd

Hi Mario,

I could not look in IM during that process, but I mean a yellow label!

Now it gets more curious: at 19:23 I marked only 5 of the 33,210 files and selected Commands - Metadata Write Back - For selected Files ...
After drinking coffee I came back to my computer at 19:56; IM5 was back to life ... and nearly all of the roughly 33,000 files had disappeared from Collections - Pending Metadata Write Back, only two were left ... maybe it's a background process? Another coffee and back at 20:23 ... no background process, still 2 pics are pending ...
In the attached screenshot the two are marked and you see the yellow label with 33,210 pics.
Switching to the view of the yellow labels shows all of them, as far as I have scrolled, again with the yellow pencils for unwritten metadata ...

Regards
Gerd

[attachment deleted by admin]

Richard

Quote: I never heard about such big flash-drives

When you get into that size it would have to be an SSD, and they are expensive but not $2000. I find it hard to believe that you have a 2 TB database. That would be enough to choke any computer.

Gerd

Hi Richard,

the database is only 8.2 GB, but the pics ... they are located on a 1 TB external hard disk, connected via USB 2.0 (it's the only possibility), and when IM5 is active I can also see the activity indicator of the hard disk flashing. My notebook is 650 GB ...

Regards
Gerd

Richard

Hi Gerd,

Hopefully one of the folks more knowledgeable will reply but my understanding is that it is the database that needs a fast drive the most.

Another thought is that your 2 TB drive might be fragmented. That could slow things down if one part of an image file is separated from another part and the head has to travel a lot.

Richard

Another thought. Is your database on the internal drive? It should be on the fastest drive available. I am guessing here but I believe it is best if image files are not on the same drive as the database. It would seem to me that reading on one drive and writing on another would be faster than reading and writing on the same drive.

jch2103

To supplement Richard's comments:

1. I assume you're following the recommendations in the Help file under 'The IMatch Database', especially the part about virus checkers.

2. I suspect your USB 2 setup is a bottleneck, especially with your large database.
John

Gerd

Hi Richard,

just checked the fragmentation ... 0.39% is fragmented, no action required. But at the moment I have to use what I have ...

Yes ... I know that my USB is the bottle-neck ... but with IM3.6 I never had these waiting-periods ...

Regards
Gerd

Richard

Quote: with IM3.6 I never had these waiting-periods

Sure but 3.6 doesn't do anywhere near as much with metadata. Another factor is that Mario had time to fine tune 3.6 for speed. One of the things I had noticed is that Mario will often improve the speed while he is fixing bugs in an area.  I am sure that the IMatch 5 that gets released to the public will be a whole lot faster than the version that began Beta testing.

Mario

Quote from: Gerd on September 11, 2013, 10:29:24 PM
Yes ... I know that my USB is the bottle-neck ... but with IM3.6 I never had these waiting-periods ...

When all files are read in and processed and the index is up-to-date, the performance you achieve in IMatch 5 should be as good as or even better than in IMatch 3 - although IMatch 5 has to do a whole lot more work to make all the additional features work. Tip: Closing some panels will also speed things up.

Using a USB 2.0 disk for the database is a bad idea. It will cut the performance at least in half. For disk-intensive tasks like search engine updates or rebuilds, it will even be 3 to 5 times slower than on a fairly modern internal hard disk. Only the database needs to be on the fast disk, not the images. Which is why I gave you the tip about investing maybe 50 € for a USB 3.0 interface card and a USB 3.0 32 GB stick. A lot faster than your built-in disk and maybe 20 to 50 times as fast as a USB 2.0 disk.

You are combining a rather slow hard disk with a very large IMatch 5 database (8 GB database file size, 130,000+ files). This probably creates a worst-case scenario. I like your reports because they show me where performance is lacking. But I regularly test with a 120,000 files test database and I don't see execution times nearly as bad as you see them. But I run the database on my built-in RAID disk or my USB 3.0 high-speed USB stick. That's the difference.
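
For anyone who wants a rough number for how fast the disk holding the database really is, a small Python sketch like the following can help to compare two drives. It is not part of IMatch; the function name and the default sizes are arbitrary choices for illustration. It writes a temporary file into the given folder, forces it to the device, reads it back and reports MB/s. The read figure is usually flattered by the operating system's cache, so the write figure is the more telling one.

```python
import os
import tempfile
import time


def write_read_throughput(folder, size_mb=64, chunk_mb=4):
    """Return (write_MBps, read_MBps) for a temporary file created in `folder`."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = max(1, size_mb // chunk_mb)
    fd, path = tempfile.mkstemp(dir=folder)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # force the data to the device, not just the OS cache
        write_s = max(time.perf_counter() - start, 1e-9)

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(len(chunk)):
                pass  # read the whole file back in chunks
        read_s = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)  # always clean up the temporary file
    total_mb = chunks * chunk_mb
    return total_mb / write_s, total_mb / read_s
```

Running it once with a folder on the internal disk and once with a folder on the external disk gives a first impression of the gap between the two; it says nothing about random access, which matters a lot for databases.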

Please do this:

1. Restart IMatch.
2. Under Help > Support make sure that Debug logging is on.
3. Change the rating or label for some of your files (maybe 10 files at once, then 50 files, then 100 files).
4. Go to Help > Support again and save the log file to your disk.
5. Attach this log file.

This will give me some important info about how slow the index is on your system, and which parts. Maybe I can tune it a bit.

Tip: Set the log file to Normal logging again afterwards. This produces smaller files and much less disk activity while you are working with IMatch 5.


Gerd

Hi Mario,

I did what you asked, in steps of 10, 50, 100, 500, 1000 and 5000 pics. The first 5 batches were selected from the top of the list, the last one was a selection somewhere in the middle of the 30,000 yellow-label pics.
The first 5 were processed quite normally, but the last one showed me several "white screens".

Regards
Gerd

[attachment deleted by admin]

dcb

Quote from: Richard on September 11, 2013, 10:03:34 PM
Another thought. Is your database on the internal drive? It should be on the fastest drive available. I am guessing here but I believe it is best if image files are not on the same drive as the database. It would seem to me that reading on one drive and writing on another would be faster than reading and writing on the same drive.

Just found another reason to do this other than speed. My external USB 3.0 drive is known to drop out. Just did it after a large database update. Took the database out. That's why I back up and am now moving the database off that drive. With IM3 it was never a problem.
Have you backed up your photos today?

Mario

Quote: My external USB 3.0 drive is known to drop out.

"Drop out" means your drive just disconnects?
This should never happen because it can cause a lot of damage. It's a textbook case for "how to lose data".
When the drive drops off before Windows can write the disk cache, the data will be lost forever. And this can cause problems now, or at some later time. Not only with IMatch databases, but with all files and the file system itself.

I would suggest replacing the drive. Or at least try another (short) cable. Many problems with USB 3.0 are caused by cheap or too long cables!


dcb

Quote from: Mario on December 24, 2013, 10:36:55 AM
"Drop out" means your drive just disconnects?

Yes, disconnect/disappear. I agree absolutely with your comments. Have been searching for months and have been unable to find the cause. Plenty of backups to compensate at the moment. It only occurs when plugged into the USB3 card which I suspect is the issue. Plugged into USB2 it's flawless but much slower. Now that I've moved my IM5 database to the internal drive, I'll shift it back to USB2 and see if the performance is still ok.

DigPeter

Quote from: cytochrome on September 11, 2013, 07:01:17 PM
I agree with Gerd: something has to be done to render IMatch usable...

I am most sad to have to agree with the above.  Despite its superb facilities, IM5 is for me unworkable.  I have recently created, for the first time (in build 1.30), a full database of some 35000 images.  I am having the same problems as described in other posts.  Almost permanent "background" metadata reading/writing and index updating, but in fact not entirely "background", as IM5 is often completely unresponsive for minutes at a time.  This is even when I have deselected all the automatic activities I can find.  I do not think that it is equipment-related.  I have a reasonably powerful computer with 6GB memory and 500GB HDD.  The database and files are in separate folders on the internal drive.

Ferdinand

I have a strong suspicion that this is caused by a combination of the keywords that you wrote to your files in IMatch 3.6 (using a script) plus your metadata preferences in IMatch 5.  It's possible that you've set IMatch 5 to read keywords from your files and you've also set IMatch 5 to write both flat and hierarchical keywords to your files.  In this case, if what you read in from IPTC keywords (or supp cats or whatever) is only part of what IMatch 5 thinks should be there, then it's going to want to write the rest out to the files.  All of them.

This was precisely the point of that keywords migration script that I wrote.  You write all of it there in advance to avoid this problem. 
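
To make the mismatch described above a bit more concrete, here is a rough Python sketch. It is not IMatch's actual logic: the `|` separator, the "every level becomes a flat keyword" rule and both function names are assumptions for illustration only. The point is just that a file whose flat keywords are a subset of what the hierarchical keywords imply ends up with something left to write.

```python
def expected_flat_keywords(hierarchical, separator="|", all_levels=True):
    """Derive flat keywords from hierarchical ones (the mapping rule is an assumption)."""
    flat = set()
    for path in hierarchical:
        levels = [level.strip() for level in path.split(separator) if level.strip()]
        # Either every level becomes a flat keyword, or only the leaf does.
        flat.update(levels if all_levels else levels[-1:])
    return flat


def needs_write_back(flat_in_file, hierarchical_in_file, **options):
    """Return the flat keywords that are missing from the file, sorted."""
    expected = expected_flat_keywords(hierarchical_in_file, **options)
    return sorted(expected - set(flat_in_file))


# A file that only carries the leaf keyword is "incomplete" under the all-levels rule ...
print(needs_write_back(["Germany"], ["Places|Europe|Germany"]))
# ... but complete if only the leaf level is expected.
print(needs_write_back(["Germany"], ["Places|Europe|Germany"], all_levels=False))
```

With 30,000 files in the first situation, every single one of them would be queued for a write-back, which matches the symptom reported in this thread.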

But for all this to work, you need to understand what's already in your files and you need to understand how to configure IMatch 5 metadata preferences accordingly to suit your own files.  When the 2014 lockout bug is fixed, post a screen grab of your metadata preferences plus a typical file with pre-existing keywords.

It's possible that this also explains Francis' problem, except that updating the index is something that seems to happen to all converted databases immediately after conversion.

DigPeter

Quote from: Ferdinand on January 01, 2014, 11:31:58 AM
When the 2014 lockout bug is fixed, post a screen grab of your metadata preferences plus a typical file with pre-existing keywords.
Ferdinand - thank you.  v132 now installed and 2014 lock is off.  Happy New Year.

The attached zip contains 2 images.  They both have DC subject flat keywords and LR hierarchical keywords.  The older file also has supplemental categories.  I used your script to write the hierarchical keywords and to produce the thesaurus.

The zip also contains a screen shot of my metadata prefs. 

Metadata 2 is set to default - should I set Allow create IPTC/EXIF/GPS to 'Yes'?

In Background processing I set Background indexing and Writeback metadata to 'Off'

Thanks for your interest, again.

Peter



[attachment deleted by admin]

cytochrome

Quote from: DigPeter on December 31, 2013, 03:03:08 PM
..... I have a reasonably powerful computer with 6GB memory and 500GB HDD.  The database and files are in separate folders on the internal drive.

Hello Peter and Ferdinand,

Three weeks ago I changed my 2006 Dell (Duo core, 4 Gb  Ram) for a more up to date machine (i7, 16 Gb Ram, SSD etc..) and since then I have not had a single hang or crash (knock on wood). There are still some white screens when importing big chunks of photos or starting IM, but never over 20-30 sec, and IM and the rest of the PC keep working fine while IM does its background thing. At least it really is background now...

I suspected this and wavered between switching to another DAM and buying a new PC. And it is a sad conclusion: IM "can" run on old PCs but it is no joy and quite unnerving at times. Each crash cost a lot of time, particularly for running the database diagnostics; it was frustrating. I hope Mario has some performance tweaking tricks up his sleeve for the final release..

Francis

DigPeter

Quote from: cytochrome on January 02, 2014, 11:46:30 AM
Three weeks ago I changed my 2006 Dell (Duo core, 4 Gb  Ram) for a more up to date machine (i7, 16 Gb Ram, SSD etc..)
Francis - that is some machine.  I do not aspire to that, so must hope that there is a procedural software solution.  Meanwhile, I am continuing to use IM3 happily.

cytochrome

It is not just IM: DxO (which I have started to use because ASP-Corel development seems stalled) also requires a lot of memory and CPU power for its newest denoise treatment.

Let's hope the next IM releases handle the foreground/background thing better. I had learned to ABSOLUTELY NOT click anywhere on the screen while IM was reading/writing metadata and updating the index ==> fast track to a hang in an infinite loop.

Surprisingly, I could do quite a lot of work outside of IM without any harm. But it is irritating.

Francis

icepic

I'm also facing the same issue. I use an internal 7200 RPM HDD with a 2.4 GB database.

IM becomes non-responsive every now and then, sometimes because of updating the index, sometimes because of something else. I understand that dealing with such large files is not an easy task, yet the user experience is terrible. It would be great if there was an option to interrupt/cancel background tasks.

Mario

Quote from: DigPeter on January 02, 2014, 12:13:45 PM
Quote from: cytochrome on January 02, 2014, 11:46:30 AM
Three weeks ago I changed my 2006 Dell (Duo core, 4 Gb  Ram) for a more up to date machine (i7, 16 Gb Ram, SSD etc..)
Francis - that is some machine.  I do not aspire to that, so must hope that there is a procedural software solution.  Meanwhile, I am continuing to use IM3 happily.

I develop and test IMatch on a four year old PC with four processors, 8 GB RAM (upgraded from 2 GB to 4 and then to 8 recently) and medium-fast hard disks. I test on a very old laptop, and in Virtual machines which have only 1 processor and 1 GB RAM. And these systems are really, really slow.
I have no SSDs, but I have a high-speed USB 3.0 adapter and some high-speed USB 3.0 32 GB sticks. Using these sticks to store IMatch 5 databases results in much better performance!


The performance of the computer is never responsible for a crash. It may be a specific timing issue that causes a crash which happens only on slower (or faster!!!) PCs, but that's another issue.

What users often forget is that the PC and the disk are utilized to 100% when IMatch ingests files, extracts thumbnails, extracts metadata, produces cache images, extracts visual query data, updates collections, categories, the timeline and a ton of other stuff. IMatch 3 just disabled the user interface to prevent users from interfering. I pondered doing that for IMatch 5 as well, but that would have been old-fashioned. And only needed when IMatch ingests bulk data, e.g. during the initial processing of an entire collection.

The user is now warned about the consequences when IMatch starts to process data in the background, but can continue to work. Still, running WIC codecs to extract image data can utilize a system to 100% (all CPUs if the WIC codec  is written for performance). And when IMatch extracts metadata from files it may run several copies of ExifTool which cause a lot of disk traffic when reading the data, and then IMatch also causes a lot of disk traffic when moving all that data into the database.

If the background processes do their job properly, they will use as many resources as possible to get the work done quickly.

If the user works with IMatch during that time, IMatch has to lower the priority of the background processing, and interrupt it often in order to release CPU and disk resources to do other stuff. This will not affect WIC codecs which always run at high CPU to work fast, so the availability of CPU cycles may still be low.

Furthermore, a user browsing the database in file windows may trigger re-calculations of categories, collections, metadata panel access etc. All this requires a lot of database resources and also interferes with the background processing. The database needs to be locked frequently in order to ensure proper updates, and when the background processing locks the database, the IMatch user interface needs to wait. And vice versa.

Consider this: Each file added/updated in the background invalidates all data-driven categories, all formula categories, all collections, and many caches used for metadata and intermediate results. IMatch delays updates of these structures while background processing is running, unless the user is working with the UI and the UI requires up-to-date categories, collections etc. Then IMatch has to pause background processing, re-calculate everything (while the user is waiting) and then re-enable all background operations. Do that every few seconds and performance will slow to a crawl and the IMatch UI feels sluggish or even non-responding...
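
The effect described here can be illustrated with a small Python sketch. This is only a toy model, not IMatch code: `LazyCategory` and `ingest` are invented names, and the "category" is just a filtered list. Background ingest merely marks the category as stale; the expensive recalculation happens only when the (simulated) UI asks for the members.

```python
class LazyCategory:
    """Derived file list that is recalculated only when someone reads it while stale."""

    def __init__(self, files, predicate):
        self.files = files          # shared list of file records
        self.predicate = predicate  # rule that decides category membership
        self.dirty = True
        self.recalculations = 0
        self._members = []

    def invalidate(self):
        """Called by the (simulated) background ingest after each added file."""
        self.dirty = True

    def members(self):
        """Called by the (simulated) user interface."""
        if self.dirty:
            self._members = [f for f in self.files if self.predicate(f)]
            self.recalculations += 1
            self.dirty = False
        return self._members


def ingest(new_files, category, ui_reads_every=None):
    """Add files one by one; optionally let the UI read the category every N files."""
    for count, record in enumerate(new_files, start=1):
        category.files.append(record)
        category.invalidate()
        if ui_reads_every and count % ui_reads_every == 0:
            category.members()
    category.members()  # final read once the ingest has finished
    return category.recalculations
```

Ingesting 1000 files with nobody looking costs a single recalculation at the end. If the UI reads the category after every 10th file, the same ingest costs 100 full recalculations, each over a growing list, which is the "every few seconds" situation from the paragraph above.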

None of this is a problem when IMatch just processes a few changed files in the background after writing metadata or after a folder scan revealed some new or updated files. But when there are 1000, 20,000 or 100,000 files in the queue to process, this will bring down performance badly.

IMatch tries to adapt the background processing performance to the computer on which it runs. It limits the number of parallel processing threads based on the number of CPUs in the system. It monitors user activity. If the user is working on the system, especially in IMatch, it uses fewer resources for background processing, pauses the background processing for short periods, uses smaller batches when ingesting files in order to release database locks quicker etc.
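
As a very rough illustration of this kind of balancing, the following Python sketch picks the number of worker threads, the batch size and a pause depending on how recently the user did something. The function name, the thresholds and all numbers are invented for the example and have nothing to do with the real values or logic used by IMatch.

```python
import os


def plan_background_work(seconds_since_user_input, cpu_count=None,
                         idle_after=5.0, full_batch=500, busy_batch=25):
    """Return (worker_threads, batch_size, pause_seconds) for the next round of work.

    All thresholds and sizes are made-up illustration values.
    """
    cpus = cpu_count or os.cpu_count() or 1
    if seconds_since_user_input >= idle_after:
        # Nobody is working: use the machine, but leave one CPU free if there are several.
        return max(1, cpus - 1), full_batch, 0.0
    # User is active: one worker, small batches (locks are released sooner), short pauses.
    return 1, busy_batch, 0.25
```

On a four-CPU machine this sketch would plan three workers and batches of 500 files while the user is away, and fall back to one worker, batches of 25 files and a short pause between batches as soon as the user touches the program again.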

The user can control and override the resource consumption via the settings under Edit > Preferences > Application: Process control. If a system really goes down despite IMatch's efforts to balance resources, using 1 and 1 for both process control properties usually helps. See the corresponding help topic for details.

I am also considering re-introducing the complete shut-down of the user interface (you see only a progress dialog but you cannot work with IMatch) for large bulk ingest operations, e.g. while adding thousands of files. Similar to what we had in IMatch 3. This would be the default, with an option to remove the progress dialog and work like we work now.


DigPeter

Mario, thanks for the comprehensive explanation.  I do understand the reason for high resource use. For me, the problem does seem to be related to database size.  With fewer than, say, 2000 files, the delays are minimal or non-existent.  With 30000+ files, the system is inoperable, even after the ingest process has finished.

Mario

Quote: With 30000+ files, the system is inoperable, even after the ingest process has finished.

30,000 files is about the size of my smallest test databases. I work with that daily.
There are IMatch 5 users with databases of 100,000 or more files. And apparently they can work with the system just fine.

Can you attach a log file from a session? It contains performance data which may tell me something.
Did you disable your virus checker for the folder holding the database?
Is your database on a slow external disk, or even a network drive?

DigPeter

Quote from: Mario on January 04, 2014, 01:12:18 PM
Can you attach a log file from a session? It contains performance data which may tell me something.
Did you disable your virus checker for the folder holding the database?
Is your database on a slow external disk, or even a network drive?
Thanks Mario.  I have deleted the large database, because I could not work with it.  I will reconstitute it, but this takes time, so I will let you have a log file in a day or so.

Virus checker is disabled for IMatch.

Database and files are in separate folders on an internal 500GB HDD and I have 6GB memory.

Ferdinand has a view about this - https://www.photools.com/community/index.php?topic=818.msg9204#msg9204 - to which I replied in the following post.  I am awaiting his response.

Ferdinand

It's on my list but I've been a bit busy.  Soon.

DigPeter

Quote from: Ferdinand on January 04, 2014, 02:22:42 PM
It's on my list but I've been a bit busy.  Soon.
No problem - when you have time.

cytochrome

Thanks for the answer and all the explanation, Mario.

I don't want to argue, it gets us nowhere, but your old computers were probably younger than my Dell (2, then 4 GB RAM and a dual-core Pentium). And I had a lot on it besides IMatch. And maybe my OS (Windows 7 64-bit) was badly configured. Anyway, I had some crashes and a lot of freezes that amounted to crashes, since I had to kill IM to get out. As I said, I did not dare click in the IM workspace during metadata and index updating. It is true that total memory usage was never more than 2.8-3 GB. And only one core was at 100%, with the second around 60% (I had set the process control to use only one processor). I had no problem running the internet or whatever while IM was running; it was working inside IM, trying to do something while it was processing a long list of files, that froze it.
Something else: it sometimes happened that while updating metadata for just a few selected files, IM suddenly decided to update entire folder(s).

Since moving to another machine with more power and far fewer programs and applications, I have experienced just a few short white screens. No freeze longer than 20-30 seconds. Maybe a coincidence.

Of course IM also becomes more and more stable with each revision so this may explain it also.

Francis

Mario

Did you report each of the crashes you experienced and supply the DUMP file?
If you have some DUMP files from crashes, please upload them to my FTP server and file a bug report for each crash.
Did you try the performance tips given in the help?
Did you reduce the number of parallel execution threads to ease the load on your system?

If you can reproduce a crash/non-responding problem by just clicking somewhere in the UI, I want to know about it and get all the info, log files, dump files etc. Finding such issues is the purpose of this Beta test. And so is getting performance info; the log file reports important performance data everywhere.

My tests and profiling runs show me that IMatch 5 is as fast as IMatch 3, or faster. And that despite the fact that IMatch 5 has a much more comfortable user interface and a vast feature set, which always comes at a price.


Richard

Quote
IMatch 3 just disabled the user interface to prevent users from interfering.

Hi Mario,

Maybe you should do the same for IMatch 5. Many IMatch 3 users knew that some tasks would take time and would plan ahead when to perform those tasks. Like starting the process just before they went to bed. Some complained about how long it took but nobody complained about not being able to use IMatch while IMatch 3 worked in the background.

Of course if you do disable the user interface to prevent users from interfering, the result will likely be a thread as long or longer than this one. With some of the same people complaining.

cytochrome

Mario,

As I said in my 3 preceding messages, it is a new PC and I have had no crash or hang, so no log or dump. And in Edit/preferences/Application/Process Control I had set the process control to one. (On the new PC it is 0 as recommended.)

I still have the old one, there are log files that I cannot relate to a precise crash or hang.

I also have a long list of files named (an example) "C:\Users\Francis\AppData\Local\Microsoft\Windows\WER\ReportArchive\AppCrash_IMatch5.exe_333c3cf8f32e9a4e6bfb8fc49ecdd92511530_13df8c3d"

They contain report.wer files

Again, at present IM works fine and fast, so let's wait for a crash or hang.

Francis


Mario

I think that's a dump file produced by Windows, which is of no use to me.
When IMatch crashes it usually shows a dialog box informing the user about the crash and the name and location of the created DUMP file. The files produced by IMatch contain a DUMP file and the IMatch log file, combined in a ZIP file with a unique name. These are the files I need in order to see where IMatch crashed and hopefully also why.

Even if IMatch did not offer a DUMP file, securing the application log file before restarting IMatch is important. See also the Beta Tester guide in the help for more info.

DigPeter

Quote from: Mario on January 04, 2014, 01:12:18 PM
Quote
With 30000+ files, the system is inoperable, even after the ingest process has finished.

30,000 files is about the size of my smallest test databases. I work with that daily.
There are IMatch 5 users with databases of 100,000 or more files. And apparently they can work with the system just fine.

Can you attach a log file from a session? It contains performance data which may tell me something.
Did you disable your virus checker for the folder holding the database?
Is your database on a slow external disk, or even a network drive?

@Mario
Virus checker is disabled for IM.
The database and files are in separate folders on a 500GB internal HDD.  I have 6GB of memory in a reasonably powerful computer.

I have created a new database with some 32000 images.  Almost all the image files had LR hierarchical subject KWs.  I used your converting script to create the IM5 database from the IM3 version.  This took about 3.5 hours.  Two things needed attention:

- Some 30 files which did not have LR hierarchical subject KWs had flat @keywords categories outside the hierarchical structure.  I corrected this by creating hierarchical subject KWs and deleting the flat @keywords categories.  While doing this there were frequent interruptions while metadata was being read and the index updated, causing periods of minutes when there was no response from IM.  I closed the database and copied the log file.  This is 0401 in a 4MB zip file which I am sending to you by email.

- A full set of regular categories had been created, but the @keyword categories were incomplete.  The reason for this is probably that after the conversion, there were still some 10000 files that needed metadata write-back. Automatic background processing is not selected in preferences. I set the write-back in motion.  This took about 3 hours.  After this there was a lengthy period of metadata reading followed by index updating.  During this time there were again frequent unresponsive periods, including a forced closing of IM. During the index updating, IM was almost continually unresponsive and unworkable.  I closed IM when the progress message showed that there was still over 15 hours remaining.  The two log files 0501_12 and 05_15 refer.

Mario

Did you forget to attach the log files or did you send them by email?

I assume you have automatic write-back disabled under Edit > Preferences > Background Processing?

When IMatch flags all files for write-back, it has created XMP metadata during the ingest, most likely because your Edit > Preferences > Metadata settings produce new hierarchical keywords from the flat keywords in your files. Can you send a screen shot of these settings and a sample image file? Maybe your settings, in combination with the keywords in your files, send IMatch into an infinite loop: hierarchical keywords are created on import; on write, the keywords are mapped to flat keywords in a way which causes IMatch to again create new hierarchical keywords on import, which again causes a write-back, ...
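
A small sketch of what I mean by such a loop. This is a toy model, not IMatch code: the two write mappings, the "imported|" prefix and the "|" separator are all hypothetical and exist only to show how a write mapping that does not match the import mapping never settles.

```python
def write_leaf(hier):
    """Well-behaved write mapping (assumed): each hierarchical keyword
    is flattened to its leaf, e.g. "plants|rose" -> "rose"."""
    return {kw.split("|")[-1] for kw in hier}


def write_full_path(hier):
    """Mismatched write mapping (hypothetical): the whole path becomes
    one flat keyword, e.g. "plants|rose" -> "plants.rose"."""
    return {kw.replace("|", ".") for kw in hier}


def import_flat(hier, flat):
    """Import step (assumed): every flat keyword that is not already the
    leaf of a hierarchical keyword creates a new hierarchical keyword."""
    leaves = {kw.split("|")[-1] for kw in hier}
    return set(hier) | {"imported|" + kw for kw in flat if kw not in leaves}


def round_trips_until_stable(hier, write, limit=5):
    """Count the write-back/import round trips that still change the
    keywords before they settle; None means they did not settle within limit."""
    hier = set(hier)
    for n in range(limit):
        updated = import_flat(hier, write(hier))
        if updated == hier:
            return n  # this round trip changed nothing, no further write-back needed
        hier = updated  # keywords changed, so the file is flagged for write-back again
    return None


if __name__ == "__main__":
    print(round_trips_until_stable({"plants|rose"}, write_leaf))
    print(round_trips_until_stable({"plants|rose"}, write_full_path))
```

With the matching mapping the keywords are stable immediately. With the mismatched one every import adds another keyword, so the file is flagged for write-back again on every pass.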

Quote
I corrected this by creating hierarchical subject KWs and deleting the flat @keywords categories.  While doing this there were frequent interruptions while metadata was being read and the index updated,

Deleting a category under @Keywords deletes the corresponding keywords from the metadata of all files in that category. This is a very fast process because it involves only database updates, plus an index update in the background. No data is written to or read from the files on disk, unless you have automatic write-back enabled. That forces IMatch to flush the metadata changes immediately to all files on disk.

Quote
A full set of regular categories had been created, but the @keyword categories were incomplete.

@Keywords is created solely from the hierarchical keywords in XMP. IMatch does not create "normal" categories when you ingest files. How are the regular categories created on your system?





DigPeter

Quote from: Mario on January 05, 2014, 06:30:37 PM
Did you forget to attach the log files or did you send them by email?
Sent by email at 1609hrs with subject as per the heading of this thread.

Quote
I assume you have automatic write-back disabled under Edit > Preferences > Background Processing?
Yes

Quote
When IMatch flags all files for write-back, it has created XMP metadata during the ingest, most likely because your Edit > Preferences > Metadata settings produce new hierarchical keywords from the flat keywords in your files. Can you send a screen shot of these settings and a sample image file? Maybe your settings, in combination with the keywords in your files, send IMatch into an infinite loop: hierarchical keywords are created on import; on write, the keywords are mapped to flat keywords in a way which causes IMatch to again create new hierarchical keywords on import, which again causes a write-back, ...
See attached.  The image (like the majority in my IM3 database) has LR\hierarchical subject KWs and DC\subject flat keywords.

Quote
Quote
A full set of regular categories had been created, but the @keyword categories were incomplete.

@Keywords is created solely from the hierarchical keywords in XMP. IMatch does not create "normal" categories when you ingest files. How are the regular categories created on your system?
I am not intending to have many regular categories.  In the case of this database, they were created by the conversion process from IM3, but I have deleted them.  I plan to use @keyword categories generated from LR\hierarchical subject KWs.  That is why I have set metadata preferences as shown in the attachment.  I will be creating data-driven categories by combining @keywords categories.

QUESTION: I have a list of some 4700 assigned categories (mostly botanical taxa).  Is this likely to make processing very prolonged?

[attachment deleted by admin]

Mario

Ah, automatic write-back enabled. That explains a lot. Every change you make to metadata in your database will cause an immediate write-back and re-import. Worst-case scenario. Disable automatic write-back temporarily and see how IMatch behaves. It should be much faster and responsive, especially while you do major re-arrangements and work with your keywords.

DigPeter

No Mario - I answered "yes" to whether it is disabled.