Background processing allocation

Started by HansEverts, October 31, 2013, 07:07:25 AM


HansEverts

Is it possible, and if not, would it be possible, for the user to set the amount of memory used by the background processes? IMatch has just been reading metadata for a couple of hours, during which I could sometimes do something quickly, but most of the time I could do nothing. The activity panel did not indicate anything special and the EXIF output panel was empty. I am confident those processes are necessary, but would it not be possible to allocate them less memory, or even postpone them and launch them before going to bed?

Mario

How much memory was used? Are you sure this was related to memory?
IMatch usually needs 300 to 600 MB in total (unless you run the Viewer or the Slide Show, which use a lot of memory for read-ahead caching).

You can control how many concurrent threads IMatch uses for background processing. By default, IMatch uses a number which depends on the number of processors in your computer and which does not use up all available CPU time. In most circumstances, the problem is not the CPU but the disk, which cannot keep up and slows everything down.

Have you tried reducing the number of parallel background threads under Edit > Preferences > Application: Process Control? See the doc in the help for details.

HansEverts

Thanks,

I tried a few different values in Process control, but you never know if it is really the same task you are measuring, and I don't really know which number to look at in the Windows control panel, especially because they all keep jumping up and down. So never mind, but since the start of the beta testing I have thought it would be useful if the user could somehow have better insight into ongoing processes, in particular the ones that block the rest of the application. I know I am vague, and I am sorry, but I think the issue is worth mentioning.

Gerd

Hi,

I have had the best experience with the process control settings 1 and 1, as Mario mentioned earlier.
_______
Regards
Gerd

Mario

Quote from: HansEverts on October 31, 2013, 07:32:42 PM
Thanks,

I tried a few different values in Process control, but you never know if it is really the same task you are measuring, and I don't really know which number to look at in the Windows control panel, especially because they all keep jumping up and down. So never mind, but since the start of the beta testing I have thought it would be useful if the user could somehow have better insight into ongoing processes, in particular the ones that block the rest of the application. I know I am vague, and I am sorry, but I think the issue is worth mentioning.

The IMatch log file includes counters which are dumped at the end of the log file when IMatch exits normally. These counters show the maximum and average times for certain critical tasks, including indexing files, reading/writing metadata, search engine updates etc.

Please note that the "time" needed to process a file varies greatly. It all depends on the file format, the amount of metadata contained in the file, the format of that metadata and more. The typical RAW codecs take between 0.5 and 20 seconds to read a file, depending on whether you make IMatch produce only thumbnails or also cache files at ingest time. The time needed for individual files may vary by 50%. Furthermore, IMatch processes files in batches, and varies the size of these batches based on how long the last batches took, in order to minimize the time the database is locked.
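To illustrate the idea of adapting the batch size to how long the last batch took, here is a minimal Python sketch. It is not IMatch's actual code: the function name `next_batch_size`, the 0.5 second target and the size limits are all assumptions made up for this example.

```python
def next_batch_size(current_size, last_duration_s, target_s=0.5,
                    min_size=1, max_size=500):
    """Pick the next batch size so that a batch should take about target_s.

    All parameters are hypothetical; they only illustrate the principle of
    shrinking batches when files are slow and growing them when files are fast.
    """
    if last_duration_s <= 0:
        # No usable measurement: grow cautiously by doubling.
        return min(max_size, current_size * 2)
    # Scale the size in proportion to how far off the target the last batch was.
    scaled = int(current_size * target_s / last_duration_s)
    return max(min_size, min(max_size, scaled))


# Simulated ingest: cheap files first (0.01 s each), then slow files (2 s each).
size = 10
history = []
for seconds_per_file in (0.01, 0.01, 2.0, 2.0):
    history.append(size)
    last_duration = size * seconds_per_file  # simulated time for this batch
    size = next_batch_size(size, last_duration)
```

With fast files the batches grow, and as soon as slow files arrive the batches shrink again, so a (hypothetical) database lock held per batch stays short.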

But during ingest, a lot of things have to be done. Previous IMatch versions just locked the user interface with a wait dialog and prevented the user from doing other things in IMatch. This is no longer the case in IMatch 5. You can work while IMatch is still ingesting data, but it all depends on how fast your system is. The speed of the disk is the most important point here, because IMatch basically reads/writes data 90% of the time.

In short: If your system becomes too slow while IMatch is ingesting huge amounts of files, just let it run in peace until it has finished. Trying to pile more work on top of a system that is already 100% utilized, e.g. by working in IMatch, will do no good.

If you set the background processes to 1 and 1, and your system is still too slow to work with it, I would like to see a log file so I can see which operations are the slowest. Maybe it's the WIC codec, or the disk is slow and the bottleneck.

HansEverts

Thanks Mario, I will send a log file.
But please understand that I am really enjoying working with IM 5 and am not being critical about how long processes take. I am simply trying to make the point that if I knew some processes were going to take several hours, I could launch them before going to bed. I don't know if the duration of these processes can be known in advance, and I am sure it can only be approximate, but there is a difference between being warned that something will take a few minutes and waiting for hours of writing metadata and indexing. Is there a possibility to create 2 or 3 categories, like less than 15 minutes, from 15 minutes to 1 hour, and more than an hour?

Of course, if I am the only user for whom this is an issue, forget it.

Have a nice weekend

Mario

Not sure that I understand. Is the Info & Activity Panel not showing estimates?
All long-term operations either use a wait dialog with a progress bar and an estimate, or if running in the background, show estimates in the Info & Activity panel.

I find them quite accurate, at least if you don't process vastly different file formats, e.g. mixing JPEG, RAW and PDF files during ingest. If this is the case, no time estimate can be accurate... if IMatch starts off by happily flying through JPEG files with almost no metadata (hundreds of files per minute) and then suddenly dives into RAW files which take 5 seconds to load per file, estimates will be off for a while until the estimate adapts...
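Why an estimate lags behind when the file mix changes can be shown with a small Python sketch. This is only an illustration under my own assumptions, not how IMatch computes its estimates: the class `EtaEstimator` and the smoothing factor `alpha` are hypothetical. It keeps a moving average of the time per file and multiplies it by the number of files left.

```python
class EtaEstimator:
    """Remaining-time estimate from a moving average of per-file times.

    Hypothetical illustration; the smoothing factor alpha is an assumption.
    """

    def __init__(self, total_files, alpha=0.2):
        self.remaining = total_files
        self.alpha = alpha          # weight given to the newest measurement
        self.avg_seconds = None     # no measurement yet

    def file_done(self, seconds):
        # Blend the newest per-file time into the running average.
        if self.avg_seconds is None:
            self.avg_seconds = seconds
        else:
            self.avg_seconds = (self.alpha * seconds
                                + (1 - self.alpha) * self.avg_seconds)
        self.remaining -= 1

    def eta_seconds(self):
        # None until at least one file has been measured.
        if self.avg_seconds is None:
            return None
        return self.remaining * self.avg_seconds


# 1,000 files: a run of fast JPEGs followed by slow RAW files.
est = EtaEstimator(total_files=1000)
for _ in range(50):
    est.file_done(0.2)              # JPEGs: estimate looks optimistic
eta_after_jpegs = est.eta_seconds()
for _ in range(20):
    est.file_done(5.0)              # RAWs: estimate climbs as the average adapts
eta_after_raws = est.eta_seconds()
```

In this simulation the estimate after the JPEG run is far too low, and it only rises gradually once the RAW files start to dominate the average.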

I often build databases with 10,000 to 50,000 files for testing purposes. I know that the 10,000 files database takes maybe 20 minutes, depending on the files. The 50,000 files database takes a couple of hours (RAW mostly, lots of data to process). Building cache files at ingest time will double or triple the ingest time, depending on the file formats and codecs.

HansEverts

Right now the info panel indicates 58 minutes left of reading metadata. But 1) can I interrupt that and let the application do it later, and 2) several times, after the 58 minutes, the color changed and IMatch started updating the index or something else, even though I had not even touched the computer. That meant that processes were already in the pipeline. Is it possible to show on the info panel all processes in the pipeline and not just the one currently running?

Mario

IMatch always (see also the help for details) performs four steps when ingesting files.

1. Ingesting the info about the file.
2. Reading metadata
3. Producing the thumbnails, cache files, visual query data, check-sum
4. Updating the search engine index.

Depending on the speed of your computer and the number of processors, 2 and 3 may run at the same time. 4 can only run when 2 is completed.
The Info & Activity Panel displays individual "bars" in different colors for each of these processes. Each bar shows its own estimate (58 minutes for reading metadata in your case).
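The ordering of the four steps above can be sketched in Python with a small thread pool. This is an illustration only and not IMatch's implementation: the function `run_ingest_pipeline` and the step labels are made up, and I assume here that step 1 finishes before steps 2 and 3 start.

```python
from concurrent.futures import ThreadPoolExecutor


def run_ingest_pipeline():
    """Run four dummy steps with the dependencies described above."""
    log = []

    def step(name):
        # Stand-in for real work; it only records that the step ran.
        log.append(name)
        return name

    step("1 file info")                              # assumed to run first
    with ThreadPoolExecutor(max_workers=2) as pool:
        metadata = pool.submit(step, "2 metadata")   # steps 2 and 3 may overlap
        thumbs = pool.submit(step, "3 thumbnails")
        metadata.result()                            # step 4 waits for step 2 only
        index = pool.submit(step, "4 search index")
        index.result()
        thumbs.result()
    return log


order = run_ingest_pipeline()
```

The relative order of steps 2 and 3 in `order` can differ from run to run, but step 4 always appears after step 2, which is why a new bar can show up in the panel after the metadata bar has finished.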

You cannot "stop" the background processing because the files are already in the database, but there is no metadata or thumbnails yet. This situation has to be resolved before the database can be of use.

How many files are you processing in 58 minutes? 2,000 or 20,000?
Especially while reading metadata, the database is very busy. If your files have about 150 metadata values each, adding only 1,000 new files adds 150,000 (!) new records to your database. This is a lot of disk activity. If you look at the "Performance" tab in the Task Manager (or the Resource Monitor in Windows), you can see how busy the processors and the disk are. I think the disk is running at 100%.
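The arithmetic is simple, but it scales quickly. A tiny Python sketch, assuming roughly 150 metadata values per file as above (the function name is made up for this example):

```python
def new_metadata_records(new_files, values_per_file=150):
    """Rough count of metadata records added for a number of new files."""
    return new_files * values_per_file


records_1k = new_metadata_records(1_000)
records_2k = new_metadata_records(2_000)
records_20k = new_metadata_records(20_000)
```

For 1,000 files this gives 150,000 records, for 2,000 files 300,000, and for 20,000 files 3,000,000.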

Maybe I should really block the user interface again (like in IMatch 3) while IMatch is busy ingesting files. It seems to be a problem for you and for some other users with slower computers, and trying to work in IMatch 5 while the computer is utilized at 100% anyway will do no good.


BenAW

Quote from: Mario on November 02, 2013, 08:30:41 AM
Maybe I should really block the user interface again (like in IMatch 3) while IMatch is busy ingesting files. It seems to be a problem for you and for some other users with slower computers, and trying to work in IMatch 5 while the computer is utilized at 100% anyway will do no good.
I proposed the same in this thread.

Perhaps later on a routine could be used that determines the "speed" of a system, its hard disk, etc., and on that basis decides whether to allow working with IM5 while background tasks are being performed.

HansEverts

I think my computer is fast enough: Intel Core(TM)2 Quad, processor speed: 2,5 GHz,
Built-in memory: 8191,2 MB

I think the idea of blocking the interface is a good one. I probably launch too many tasks.

HaSt

Importing roughly 40,000 images, I ran into a similar problem. What slowed down the process?
1) Preferences > Background Processing > Metadata Write-back > checked
2) Preferences > Indexing > checked for all new files (either collection or category)
As soon as anything is checked here, each image receives a new value and IMatch starts to write back while importing new images in parallel, resulting in a drastic performance drop.

Solution:
1) Preferences > Background Processing > Metadata Write-back > uncheck for the duration of the mass import.
2) Manually force the metadata write: Commands > Metadata write-back > For all pending files

Hans

Buster

To keep my USB HDDs alive, I wrote a little batch file which copies a 0-byte file to the drives every ten seconds. This helps to speed up write operations after longer breaks.

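rem Keep the USB drives awake by copying an empty file to each of them.
rem Assumes c:\windows\a.txt is an existing 0-byte file.
:copy
copy /Y c:\windows\a.txt d:\a.txt
copy /Y c:\windows\a.txt u:\a.txt
copy /Y c:\windows\a.txt i:\a.txt
copy /Y c:\windows\a.txt l:\a.txt
rem Wait 10 seconds (key presses are ignored), then repeat.
timeout /T 10 /nobreak >NUL
goto copy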
---
Best wishes,
Reiner