Hunting intermittent write back hangs

Started by lbo, November 18, 2018, 02:16:40 PM

Previous topic - Next topic

lbo

Hi all,

I'm looking for suggestions what I can still try to find the reason for intermittent hangs during writeback.

Sadly the problem vanished today time just after enabling verbose ExifTool output, so I have no valuable information from the ExifTool output panel.

As far as I see, the logfile (snippet attached) doesn't tell more than the missing response from ExifTool.

Maybe interesting: The raw file DSC01140.ARW has been written correctly to the disk one or two seconds after I clicked the yellow pencil (12:51:05 PTETWrapper::ProcessRun, 12:51:06 file timestamp).

So if I understand correctly, exiftool did write the file quickly, but PTWrapper didn't notice it's completion.

This happened now several times and after some time, write back worked again.

I'm not running 3rd party virus scanners (just Win 7 defender "light"), and the directory where I did the current exmeriment isn't even monitored by the Windows indexing service.

Any suggestions what I can try to find the reason?

Mario

IMatch monitors ExifTool instances. The execution time of a command may depend on many factors, from file size to file format, the amount of metadata to write and the file storage location (local SSD, local disk, external USB disk, remote disk). Typical write times for medium-sized files are between 0.5 and 5 seconds.

To determine if ExifTool just slow to process a specific file or actually hangs, IMatch monitors the CPU and file I/O consumption of every ExifTool instance.

If ExifTool fails complete a task within 60 seconds (!), IMatch aborts the instance and creates a new one. Usually this only happens when there is something wrong with the file or an external software interferes. Or if the system breaks down completely and the Windows file system becomes overloaded.

I assume you are testing this on your slow notebook (see the other thread). If you have problems with write-back it may be caused by your system getting too slow. Please follow the instructions in the release notes about the new parallel write-back and how to reduce the stress on your system by reducing the number of parallel write-backs to 1 or 2. I mentioned that already in your other question from this morning. Same settings page, same help page: Process Control (Advanced Setting)

A notebook with a 5400 RPM disk is rather medium-end hardware and may need some custom tweaking. Also temporarily close other applications, disable your virus checker for a test to make sure it does not interfere etc.

lbo

I already had set all three process control options to "1" before I made the write back tests.

According to the ressource monitor, other tasks did not load the system.
Only the IMatch related tasks showed relevant CPU, Disk and memory activity.

And keep in mind that the arw file has been written correctly after a second, only IMatch didn't get the notice about the successful writeback and terminated 120s (not 60s) later with a timeout.

Hard to catch since it's intermittent. Most times writeback works well (approx. 1s per file). All of a sudden the problem occurs and just as suddenly it disappears again after several minutes. This time even in the same IMatch session (last time I restarted IMatch).

Now I continue working with "debug" log file and verbose ExifTool output and wait until it happens again.

Anything else I can prepare?

Oliver

Mario

IMatch cannot know that ExifTool has "written the file after one second" (how did you determine this?) as long as ExifTool does not send the "completed" token over the pipe (IMatch communicates with ExifTool via a mechanism called pipes). If ExifTool does not respond, IMatch will terminate it after 60 second (or 120, not sure right now). I thing on systems considered slow IMatch increases the wait time automatically.

If this even happens when only one thread is writing to files at a time and ExifTool crashes or fails to commit the write-back for some other reason, this is either caused by a corrupted file, a file locked by something else or maybe a virus checker. Do you have the Viewer or quick View panel open while you write back the files? If so, maybe the file is locked by the WIC / DirectX system (happens sometimes with some WIC codecs). But in that case ExifTool will just time-out, but you said is has written the file after one second.

Looking at your log file, I guess that some software on your system causes this.

For example, when IMatch starts, it runs ExifTool to get the version information. And the log file reports "PTWrapper hangs in ProcessRun". Which means that not even this call succeeded, which means that either your installation is broken (unlikely) or that your virus checker or whatever blocked ExifTool from running. Check your virus checker settings and make sure IMatch is allowed to launch external processes. If your virus checker is the cause and it is good it should log the block into a log file somewhere...

dkorman

I, too, have experienced iMatch "intermittent hangs," which can be frustrating and I share your "pain."  The problem can be in Mario's code, iMatch's 3rd-party services, other 3rd party services, Window's services, and the hardware environment - not to mention the possibility of malware or "poorly behaved" installed programs.  Very complex.  Additionally, there are data packets being sent out and received between my machine, IoT devices, and to and from "The Cloud," by whom for what purpose?  It's the "Wild West" here.  It used to be that I believed that I knew what these machines were doing, and be able to control them - not so much anymore.  (Same with my car!)

So, here's a story about my generic iMatch performance/hanging issue and a fix which may relate to your problem (or maybe not?) ...

iMatch started behaving erratically on a machine that I use regularly (and heavily) - for several minutes, iMatch would act very smoothly, and then it would become "unresponsive" for a few to many seconds, doing seemingly the same type of operations, or when I switched to it after using other programs.  Was iMatch the cause?  I know that iMatch is doing a lot of processing, but this particular start/stop was getting particularly annoying on this particular machine (turning red in Task Manager).  While most open programs did not demonstrate this behavior, I started noticing that some memory-intensive programs did (eg. Acrobat, File Explorer, and a couple of database programs).  The iMatch "hangs" were seemingly random - not when I was always performing a specific iMatch task.  Thus, I started to monitor my memory usage with Resource Monitor and Task Manager - the 12gig memory was mostly in the 50% "In Use", a small amount "Modified," and the remaining "Standby" and sometimes with a small amount "Free."  Based on Resource Monitor's Memory Use information, it would seem that memory was NOT a problem.  Additionally, in monitoring, I noticed that Norton Security would seem to randomly and frequently pop up to the top of my CPU usage, particularly when iMatch was writing or updating files, consuming 20+% of CPU, as well as a large amount of page faults and presumably placing many memory pages in Modified state.  Otherwise, CPU usage was less than 30%.  I also noticed Windows 'Memory Compression' popping up randomly, which made me think that perhaps memory WAS a problem.  Chrome was not hanging, but I'm sure that the 10-30 open tabs were eating memory.  Could iMatch's random "hangs" be a memory issue?

I took a chance and added more memory (to 30gig - ~$200), and 95% of the issues with iMatch and other programs stalling have now seemingly disappeared (Norton is still obnoxious, but hopefully doing "important" work).  With 30gig, "In Use" memory is now 11-12gig, which I therefore assume meant that the programs I was running previously were "memory-starved," even though it didn't seem so from Resource Monitor. 

Clearly, even though iMatch consumes resources and is a "good neighbor" (not a 'resource hog' - thank you Mario), other programs do not always "play nicely," which can adversely impact one's iMatch performance experience (though, because of what iMatch processes, it will always have considerable work to perform - and it's why we use it!).  Since upgrading memory, I have not seen Windows Memory Compression service become active (but maybe I've just missed it).  We expect a lot from iMatch, but it can be burdened by its "neighborhood," over which Mario has limited or no control.  The additional memory has made a "world of difference" - iMatch's random "hangs" are now rare (and I hope it stays that way).  (Specific issue "unresponsivenesses," e.g. when conducting an intensive process, still do occur, though predictably.)  Has anyone else had a similar experience?

I also have a Samsung SSD C drive (data and paging file on other drives).  I was told that the Samsung Magician, which grabs available memory to cache the SSD, improves SSD performance but adversely affects memory availability for other programs.  I have yet to test disabling Magician - still more research to do. 

Surely, iMatch will have some "hiccups," which Mario has been open to hear about and attend to - a greater responsiveness than any other software developer I know (Mario, do you ever go on vacation?).  Any user of iMatch can attest to its continual improvement, with, I'm sure, more to come.  And our comments on problems and issues with iMatch will only make it better.  Will iMatch's new telemetry feature result in performance improvements?  I hope so.  (Hey, Mario, if you get the chance, please share some more about what you're learning from telemetry - we are all fascinated.)  We can, however, do what we can to make iMatch's operating environment as "friendly" as possible, despite what Microsoft and other software developers do.  And, as I have seen by increasing memory (a relatively easy fix - as long as you get the right memory and know your motherboard slot requirements), some changes can have a vast improvement on the iMatch performance experience. (An SSD for the database will come next, I think.)

If anyone has suggestions on how you have improved your operating environment with particular attention to how it has impacted iMatch performance (e.g. known "bad actors,"  problematic services or settings, etc.), I know that I would would appreciate learning about them.  (Maybe the discussion board should have a separate section on operating environment observations.)

May your "hangs" be few and the resolutions be quick and painless.

Thanks as always,
David





lbo

I think the important point is what I can do to get more diagnostic information if the problem arises next time.

Quote from: Mario on November 18, 2018, 07:02:58 PM
IMatch cannot know that ExifTool has "written the file after one second" (how did you determine this?)

I determined this via the file timestamp: "The raw file DSC01140.ARW has been written correctly to the disk one or two seconds after I clicked the yellow pencil (12:51:05 PTETWrapper::ProcessRun, 12:51:06 file timestamp)."

"Written correctly" means that the file had the expected contents, my edits had been written to the file.

After a rescan (even a restart), the yellow pencil stayed but hovering over it show no entries (likely the Pencil should vanish but this is a minor issue IMO).

Quote from: Mario on November 18, 2018, 07:02:58 PM
If this even happens when only one thread is writing to files at a time and ExifTool crashes or fails to commit the write-back for some other reason, this is either caused by a corrupted file, a file locked by something else or maybe a virus checker.

IMO unlikely, but to answer your question: Viewer or quick View panel were not open while I wrote back the files. Just metadata and map panel. Maybe the Filter panel.

Quote from: Mario on November 18, 2018, 07:02:58 PM
And the log file reports "PTWrapper hangs in ProcessRun". Which means that not even this call succeeded, which means that either your installation is broken (unlikely) or that your virus checker or whatever blocked ExifTool from running.

It's the plain Defender from Windows 7, no SRPs or other custom restrictions.

I use ExifTool heavily from the command line, with ExifToolGUI, with GeoSetter, and there are no indications that Defender or other parts of my system interfere with my ExifTool installation.

Mario

If ExifTool does not complete even the simple "What is your version" call IMatch runs during startup, something with the installation or system is wrong. This simple call usually completes in 0.2 seconds. On your machine, IMatch had to shut-down ExifTool after 120 seconds because it failed to respond. Or it was blocked (more likely) by something on your system.

Start by defining IMatch2018x64.exe, exiftool.exe (in the IMatch program files folder) as exceptions in your virus checker. Check the digital certificates embedded in these files in Windows Explorer to ensure they are original and untampered with. Maybe just reinstall IMatch and reboot your machine once completely.

lbo

Quote from: Mario on November 19, 2018, 09:36:06 AM
If ExifTool does not complete even the simple "What is your version" call IMatch runs during startup, something with the installation or system is wrong.

the startup was not contained in the log file. As I wrote, it was a snippet, which is also obvious from looking at the log file. It starts before the hang.

At the moment, I was just asking what I set to produce valuable debug information. I'll send it if it happens next time.

Mario

Always include the full log file.
Always switch IMatch to debug logging when attempting to diagnose problems.See log file
Document your changes, e.g. did you define exceptions for IMatch in your AV as I suggested above?