[UPDATED] Problem: Wrong folder timestamps in Windows!

Started by Mario, December 28, 2024, 07:18:10 PM

Mario

Similar issues were reported very occasionally, but without a repro case. Well, I was "lucky" today ;)

I did a bit of cleanup work in the Windows file system today, downloaded a bunch of images from my cameras and phones into existing folders. Made some edits, too.

When I started up IMatch, I expected it to detect the folders with new and modified files. And it did.
Except for two folders! But why?

When a database is loaded, IMatch scans all folders in the background to find folders that were modified since IMatch was last run. If a modified folder is found, IMatch adds it to a background processing queue to check for new and updated files.

IMatch did that alright, but considered the two folders as "current".
To make this decision, IMatch compares the "last modified" timestamp of the folder recorded in the database with the "last modified" reported by the Windows file system. If they match, the folder is considered as current.
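
For illustration, here is a minimal Python sketch of that comparison. It is not IMatch's actual code; the function name `folder_is_current` and the idea of storing the timestamp as nanoseconds are assumptions made for the example.

```python
import os

def folder_is_current(folder_path, stored_mtime_ns):
    """Return True if the folder's "last modified" on disk equals the recorded value.

    stored_mtime_ns stands in for the timestamp a database would have recorded
    (hypothetical; nanoseconds since the epoch).
    """
    return os.stat(folder_path).st_mtime_ns == stored_mtime_ns
```

A check like this is cheap, but it can only be as reliable as the folder timestamp itself.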

I checked the folder's "last modified" timestamp in Windows Explorer. It reported December 25, 2024, 10:00:05.
But the "last modified" timestamp of some of the files I had copied into the folder was December 28, 2024, 16:50:10.

Windows apparently did not update the "last modified" timestamp of some folders, even though files in them had changed. This should not happen, of course. I searched the Internet and found quite a number of similar reports, but no actual solution. This seems to be a long-standing bug in Windows.

What does this mean for IMatch?

Since IMatch relies (and has to rely) on the "last modified" timestamps of folders to figure out if it needs a rescan, it fails to detect folders with this issue. Unless the user performs a manual rescan or makes changes to the folder while IMatch is running, the changes remain undetected.

To deal with this problem, I've enhanced the "background folder sweeper" task in IMatch to not rely on the last modified folder timestamp alone for the initial scan after database open, but to also find the newest file in each folder. If the file is newer than the folder's "last modified" timestamp, IMatch does two things:

a) Enqueue the folder for rescan
b) Update the "last modified" timestamp of the folder on disk to the "last modified" of the newest file
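
The two actions above can be sketched roughly as follows, in Python and for illustration only. The names `newest_file_mtime_ns`, `sweep_folder` and `rescan_queue` are hypothetical and not taken from IMatch; the sketch looks only at files directly inside the folder.

```python
import os

def newest_file_mtime_ns(folder_path):
    """Return the newest "last modified" (ns) of the files directly in the folder, or None."""
    newest = None
    with os.scandir(folder_path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                mtime = entry.stat(follow_symlinks=False).st_mtime_ns
                if newest is None or mtime > newest:
                    newest = mtime
    return newest

def sweep_folder(folder_path, rescan_queue):
    """Enqueue the folder and repair its timestamp if a file is newer than the folder."""
    folder_stat = os.stat(folder_path)
    newest = newest_file_mtime_ns(folder_path)
    if newest is None or newest <= folder_stat.st_mtime_ns:
        return False  # no file is newer than the folder: nothing to do
    rescan_queue.append(folder_path)  # a) enqueue the folder for rescan
    # b) set the folder's "last modified" on disk to that of the newest file,
    #    leaving the access time unchanged
    os.utime(folder_path, ns=(folder_stat.st_atime_ns, newest))
    return True
```

Because step b) brings the folder timestamp in line with the newest file, calling `sweep_folder` a second time on the same unchanged folder returns False.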

This allows IMatch to deal with the problem and fix the wrong timestamp at the same time. Since the timestamp has been corrected, the folder will not be enqueued the next time the database is opened (unless it was really changed).

I've tried this with several of my databases, and all databases except one had folders where the folder timestamp on disk was older than the newest file in the folder. Strange.

What do you think? Is this approach viable?

thrinn

Quote from: Mario on December 28, 2024, 07:18:10 PM
What do you think? Is this approach viable?
Sounds like a good solution. It is still a pity that such a workaround is necessary. Ideally, Windows itself should take care of the issue, but hey, we don't live in an ideal world.
Thorsten
Win 10 / 64, IMatch 2018, IMA

JohnZeman

Given the situation to me it looks like the best solution to this potential problem.

sybersitizen

Quote from: Mario on December 28, 2024, 07:18:10 PM
To deal with this problem, I've enhanced the "background folder sweeper" task in IMatch to not rely on the last modified folder timestamp alone for the initial scan after database open, but to also find the newest file in each folder. If the file is newer than the folder's "last modified" timestamp, IMatch does two things:

a) Enqueue the folder for rescan
b) Update the "last modified" timestamp of the folder on disk to the "last modified" of the newest file
Seems to be necessary.

Will that update work even if I have IMatch set to NOT actually write changes to the disk but only to the database (which is how I currently have it set)?

Mario


Quote: Will that update work even if I have IMatch set to NOT actually write changes to the disk but only to the database (which is how I currently have it set)?
This is unrelated to writing metadata to images.

It's just an improvement / work-around for the problem that, sometimes, the last modified date of a folder is not reliable, preventing IMatch from detecting that something in these folders was changed while it was not running.

graham1

I assume that checking the file modified time as well as the folder modified time will at least double the time IMatch takes to carry out its check.  Presumably if the folder modified time is checked first and is found to have changed, it will not be necessary to check the files in the folder and this step can be skipped, because the folder will be updated anyway. 

Even if it does take longer, I agree it is a good idea to add this additional check.  I have been puzzled in the past that IMatch has not updated images in folders which I know have been changed, and if this is the price for making the process more reliable, so be it. 

Graham

Mario

Quote: I assume that checking the file modified time as well as the folder modified time will at least double the time IMatch takes to carry out its check.
This is done only once, after database start and runs in the background. You won't notice any difference.

Quote: Presumably if the folder modified time is checked first
Yes. A "deep check" is performed only when the folder's modified date matches the database.

Quote: I have been puzzled in the past that IMatch has not updated images in folders which I know have been changed,
Probably this was caused by exactly this problem. If the folder timestamp was not changed by Windows, IMatch had no way to tell!

Jingo

Yes.. a great solution.. the fact that you also clean up Windows bug and update the folder timestamp is an extra boon so the folder doesn't get hit for update over and over!

Mario

Quote: so the folder doesn't get hit for update over and over!
That's exactly why IMatch has to fix the folder timestamp. Otherwise, the folder would be flagged for rescan after every database load, which would be unnecessary.

I've checked all 10 of my real and test databases, and all but one had a few folders with this problem. So this is not a really serious problem, especially now that IMatch is able to detect it.

graham1

This may also explain why sometimes if I delete an image in another application, it does not show up as a missing file in IMatch (which often is the case).  This proposed fix will not remedy that issue, I suspect, if the deletion has not modified the folder timestamp.

Graham

sybersitizen

Quote from: Mario on December 29, 2024, 10:11:46 AM
Quote: Will that update work even if I have IMatch set to NOT actually write changes to the disk but only to the database (which is how I currently have it set)?
This is unrelated to writing metadata to images.
That's what I expected, but asked just to make sure.

dcb

An elegant solution. Can't see anything wrong with that approach.
Have you backed up your photos today?

Mario

Quote from: graham1 on December 30, 2024, 12:55:33 AM
This may also explain why sometimes if I delete an image in another application, it does not show up as a missing file in IMatch (which often is the case).  This proposed fix will not remedy that issue, I suspect, if the deletion has not modified the folder timestamp.

Graham
I think so. If IMatch considers the folder as "current", it does not rescan it.

New, modified or deleted files are thus not detected until you make a change to the folder while IMatch is running (and IMatch rescans it because it receives "folder modified" messages from Windows) or you perform a manual rescan.

Mario

I've had a feeling about this, and I was right.

After spending several hours, trying to figure out if this is a glitch in Windows, a glitch in IMatch, or something I've overlooked, I'm now smarter, and a bit humbled.

While IMatch is running, it monitors Windows file system events to detect new, deleted and modified files. This works fine.

When a database is opened, IMatch checked the "last modified" timestamp of folders to find folders modified while IMatch was not running. And when a modified folder is found, IMatch schedules it for a rescan. The rescan then checks for new and updated files. This does not always work. Blimey!

Because, as I have learned by browsing the web and from my own experiments, not all file system operations I considered actually update the "last modified" timestamp of folders.

Most importantly: modifying a file does not change the "last modified" timestamp of the containing folder.
Adding, deleting or renaming a file/folder does update the "last modified" timestamp of the containing folder.
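
This behavior can be observed with a small experiment. The following Python sketch is only an illustration, not something taken from IMatch: it pins a temporary folder's timestamp to an arbitrary value in the past, performs one operation, and reports whether the folder timestamp moved. All function and file names are made up for the example.

```python
import os
import tempfile

PAST = 1_000_000_000  # arbitrary fixed timestamp (seconds since epoch), well in the past

def folder_mtime_changes(folder_path, operation):
    """Run operation() and report whether the folder's "last modified" changed."""
    os.utime(folder_path, (PAST, PAST))  # pin the folder timestamp so any update is visible
    before = os.stat(folder_path).st_mtime_ns
    operation()
    return os.stat(folder_path).st_mtime_ns != before

def append_bytes(path):
    """Modify an existing file in place."""
    with open(path, "ab") as f:
        f.write(b" edited")

def create_empty(path):
    """Create an empty file (or truncate an existing one)."""
    with open(path, "wb"):
        pass

with tempfile.TemporaryDirectory() as folder:
    image = os.path.join(folder, "image.jpg")
    extra = os.path.join(folder, "extra.jpg")
    renamed = os.path.join(folder, "renamed.jpg")
    create_empty(image)
    operations = [
        ("modify", lambda: append_bytes(image)),
        ("add", lambda: create_empty(extra)),
        ("rename", lambda: os.rename(extra, renamed)),
        ("delete", lambda: os.remove(renamed)),
    ]
    for name, operation in operations:
        print(name, folder_mtime_changes(folder, operation))
```

On a file system that behaves as described above, only the "modify" line should report False.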

This means my approach to detecting modified folders during database load was faulty. My bad. Sorry.
This also explains my findings in the initial post above.

The only way to be sure to detect modified files would be to perform a full rescan of each database folder after database load - but that would be really slow and would interfere badly with the user experience.

My solution is now as follows:

During the initial folder scan, IMatch checks for the most recent (supported) file in each folder in the database, using the "last modified" timestamp in the file system.

1. If the file is not in the database, IMatch schedules a rescan for the folder since it is obviously modified.

2. If the file is in the database and has the same "last modified" timestamp, the folder is considered as "current".

3. If the newest file is current, but has a "last modified" timestamp newer than the folder in the database, the folder is updated with the current date and time. This marks it as current, and the next rescan on database load will consider it as current automatically (unless a newer file exists).

Step 3 is an "auto fix" that is performed during the first rescan. IMatch now uses the current date and time in UTC when setting the "last modified" of a folder in the database (e.g. when a rescan is performed, files are added/removed), instead of using the folder's "last modified" timestamp on disk. This is more robust.
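
To make the three steps easier to follow, here is a rough Python sketch. It is an illustration under stated assumptions, not IMatch's implementation: `FolderRecord`, `check_folder` and the list of supported extensions are hypothetical. The text above does not spell out the case where the newest file is in the database with a different timestamp, so the sketch assumes that case also triggers a rescan.

```python
import os
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class FolderRecord:
    """Hypothetical stand-in for what a database stores about one folder."""
    last_modified: datetime                    # folder timestamp in the database (UTC)
    files: dict = field(default_factory=dict)  # file name -> recorded mtime (ns)

def check_folder(folder_path, record, supported=(".jpg", ".png")):
    """Return "rescan" or "current" for one folder, following steps 1 to 3."""
    newest_name, newest_mtime = None, None
    with os.scandir(folder_path) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.lower().endswith(supported):
                mtime = entry.stat().st_mtime_ns
                if newest_mtime is None or mtime > newest_mtime:
                    newest_name, newest_mtime = entry.name, mtime
    if newest_name is None:
        return "current"  # assumption: no supported files means nothing to compare
    if newest_name not in record.files:
        return "rescan"   # step 1: the newest file is unknown to the database
    if record.files[newest_name] != newest_mtime:
        return "rescan"   # assumption: known file with a different timestamp
    # Step 2: the newest file is known and unchanged, so the folder is current.
    newest_dt = datetime.fromtimestamp(newest_mtime / 1e9, tz=timezone.utc)
    if newest_dt > record.last_modified:
        # Step 3: "auto fix", stamp the folder record with the current UTC time.
        record.last_modified = datetime.now(timezone.utc)
    return "current"
```

The sketch only covers the newest-file check from the three steps; anything else the initial folder scan does is left out.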

What You Have to Expect

When you open a database for the first time in IMatch 2025, IMatch may find several folders to rescan, because they contain previously unnoticed changes to files. This is a good thing, of course. But be prepared.