Metadata write-back requires running the command twice

Started by Pascal, December 19, 2023, 10:42:02 AM


Pascal

On my computer with iMatch 2023.4.6, writing back metadata cannot be done all at once (or only rarely). This is annoying because I have to run the Shift+Alt+S command twice before all modified metadata is saved in full (until "Pending write-back" displays "No files to write"). The problem occurs systematically as soon as a few dozen or more files are affected. N.B. I never saw this behavior with iMatch 2021, but I already saw it with 2023.4.5.

My database contains 93,000 files, maybe that's too many? My computer is a core i7 with 32 GB of RAM but it is from 2017... Maybe it's too slow for iMatch 2023? I sometimes have difficulty selecting an item or menu command because the focus keeps jumping from one place to another until all the background processes of iMatch have finished.

Thank you in advance for your help

Pascal

Mario

93,000 files is nothing. The average database size is about 150K files, with a large number of databases around 300K files.
The largest database I use daily has 980,000 files currently. My other 200K database runs very well on my laptop.

Write-back behavior and performance are unrelated to database size.

Quote: writing back metadata cannot be done all at once (or rarely).
What does that mean? Do you get error messages? Does IMatch stop writing back files?
Or do you have to write-back files twice because the existing metadata in the file is out of sync and requires multiple write-backs to fix? See Metadata Problems and Pitfalls for details and explanations.

Quote: I sometimes have difficulty selecting an item or menu command because the focus keeps jumping from one place to another
When does this happen? Where does the focus jump to and from what?
What is IMatch doing in the background? Face recognition? Trying to assign faces to persons?
What does the Dashboard or the Info & Activity Panel tell you in this case?

To start looking into this, we need a log file in debug mode (Help menu > Support > ) from an IMatch session where you've experienced this behavior.
This will show us what IMatch is doing, how long tasks take, the overall performance of the system, memory utilization and many other things that will be helpful in diagnosing this.
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

janb83

Quote from: Mario on December 19, 2023, 11:52:16 AM
When does this happen? Where does the focus jump to and from what?
What is IMatch doing in the background? Face recognition? Trying to assign faces to persons?
What does the Dashboard or the Info & Activity Panel tell you in this case?

Because I had a similar issue, being unable to work with IMatch while it was doing background tasks due to switching focus, let me describe what happened in my case:

I noticed this during two actions if I recall correctly: "Rescan Metadata" on a few thousand files, and write-back to a few thousand files. But I think during the write-back it only occurred AFTER it was done with the progress dialog and was only updating the database because the files had changed.

In either scenario, I think the main culprit was an active filter. I noticed a flickering (showing, not showing, showing, not showing etc.) message about this ("Applying filter", maybe?) in the status bar, and in one of the two occurrences this alternated with a message about updating the folder view I think. Once I managed to disable the filter despite the focus problem (quick clicking!), things got better. But it still felt like sometimes the focus was switching away to some other UI element.

Mario

Every time something in the database changes and the Filter Panel has at least one active filter, it schedules a re-apply of the filter, which in turn reloads the File Window.

Depending on which filter(s) you have enabled and the size of the scope (a folder with 500 files or a disk/category with 50,000 files), the Filter Panel will be busy for a short time. Especially when it has to "compete" with other background processes for database time. Or when the disk is already maxed out, performance-wise.

If this causes stress issues on your system, I recommend writing back or rescanning folders while the Filter Panel is paused or closed. It can be one of the most demanding panels, performance-wise, depending on what you filter for.

If IMatch is indexing files in the background or writing back, which also requires a re-ingest of all metadata, this will happen very often. I'm not sure why this would cause Windows to focus other windows or panels, though.

You can pause the Filter Panel with <Ctrl>+<F6> or close it with <F9>,<F> if this should happen again. This should be easier than trying to click the X icon or Pause button.

If you can produce a log file in debug mode (see above) I may be able to see what's going on.

janb83

Keep in mind that one might need the filter to select the files for which e.g. the Rescan should be called. So, the only way to avoid this (as a user) is to first use a filter to select the files one wants to process and then either make sure the selection is still active while disabling the filter or open the selection in an output window.

From a program perspective, this can be solved differently. A simple workaround would be to warn the user about this and offer to disable the filter. Another would be to reduce the frequency. As you say, it seems to schedule one re-apply PER individual db change. During operations like Rescan Metadata, this basically triggers thousands of re-applies in quick succession. One approach to solve this is to delay the re-apply by 1-2 seconds, and if another re-apply is triggered during that delay, delay it again. Only when no further re-apply is triggered during the current delay is it actually performed. Or, if you're afraid that could take too long, force a re-apply every 10 seconds or so regardless of the delay.
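To make the suggestion above concrete, here is a minimal Python sketch of such a delayed re-apply (a "debounce" with an upper bound). It only illustrates the idea and says nothing about how IMatch is actually implemented; the names `ReapplyDebouncer`, `quiet_period_ms`, `max_wait_ms` and `count_reapplies` are made up for this example, and times are plain integers in milliseconds so the behavior is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class ReapplyDebouncer:
    """Hypothetical helper: coalesces bursts of change events into few re-applies."""
    quiet_period_ms: int = 1500   # no new event for this long -> re-apply
    max_wait_ms: int = 10_000     # never defer a pending re-apply longer than this
    _first_event: Optional[int] = field(default=None, init=False, repr=False)
    _last_event: Optional[int] = field(default=None, init=False, repr=False)

    def notify(self, now_ms: int) -> None:
        """Record one database change at time now_ms."""
        if self._first_event is None:
            self._first_event = now_ms
        self._last_event = now_ms

    def should_fire(self, now_ms: int) -> bool:
        """Return True (and reset) when the pending re-apply is due."""
        if self._first_event is None:
            return False  # nothing pending
        quiet = now_ms - self._last_event >= self.quiet_period_ms
        overdue = now_ms - self._first_event >= self.max_wait_ms
        if quiet or overdue:
            self._first_event = self._last_event = None
            return True
        return False


def count_reapplies(event_times_ms: Iterable[int], end_ms: int, poll_ms: int = 100) -> int:
    """Simulate polling the debouncer every poll_ms and count the re-applies."""
    debouncer = ReapplyDebouncer()
    events = sorted(event_times_ms)
    fired = 0
    i = 0
    for now in range(0, end_ms + 1, poll_ms):
        # Deliver all change events that happened up to this poll.
        while i < len(events) and events[i] <= now:
            debouncer.notify(events[i])
            i += 1
        if debouncer.should_fire(now):
            fired += 1
    return fired


# 3,000 change events, one every 10 ms for 30 seconds:
print(count_reapplies(range(0, 30_000, 10), end_ms=35_000))  # prints 3
```

In this simulation the 3,000 individual changes lead to three re-applies, each one forced by the 10-second limit because the events never pause for 1.5 seconds.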

As for the focus problem itself. It's definitely happening, something is grabbing the focus during this scenario. Not sure what, candidates would be the status bar, the DB update icon, the filter panel, or the file window. As you rightly say, WHY it should focus one of these is a mystery.

Btw, it's not "a short time". Rescanning the Metadata for 30k files took about 3 hours. It could be that this re-apply "problem" that is causing the focus issue is also significantly slowing down the process itself due to the extra workload.

Mario

Quote: A simple workaround would be to warn the user about this and offer disabling the filter.
Agreed. But since this does not come up often (or at all), I wonder how many users are affected by this?


Quote: Or, reducing the frequency. As you say, it seems to schedule one re-apply PER individual db change. During operations like Rescan Metadata, this basically triggers thousands of re-applies which will happen in quick succession.
The filter panel schedules a re-apply. This means it accumulates change events and waits until the events stop, or until it has to refresh the File Window in order not to show invalid information. The same applies to categories, data-driven categories, collections, events, people, face recognition and all the other database contents that are affected when files are added or updated, metadata is changed, etc.

Quote: Btw, it's not "a short time". Rescanning the Metadata for 30k files took about 3 hours.
What? That's a lot.
Are these super-large RAW or TIFF files on a network server or NAS? Or JPG files on your local disk?
Maybe try to reduce the number of parallel threads if the system slows down under prolonged load? See Process Control (Advanced Setting). Do you use automatic face recognition and/or reverse geocoding? Do you apply Metadata Templates during ingest, or the Renamer? Complex versioning rules? Write-back set to immediately?
Which virus checker do you use? Have you configured an exception for the folder containing the database?

Please show us a log file so we can see what takes so long.
This should be way faster.

janb83

It's an external SSD, connected via USB 3.0 only, unfortunately. Just regular JPG files of below-average size (a lot of them are old files under 1 MB; only a small percentage are large, modern digital files). Do you log automatically or do I have to enable it? I have to run a similar task again soon, so I can certainly try to grab a log.

Mario

See The IMatch Log file and make sure to enable debug logging.
The typical performance for JPEG files is 200 to 400 files per minute.

Just made a small test. 1,500 JPG files from various cameras and smart phones on a normal hard disk. Database on SSD.
3:39 from the time I've dropped the folder into the Media & Folders View until the loader overlay dialog closed and the import is done. About 440 files per minute. 4 year old PC.

janb83

Well, 200 files per minute would already be 150 minutes for 30k files, so not far off from what I experienced. I'll take a look at the logging before I run the next action like that.
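The arithmetic behind this estimate is easy to check. The short Python sketch below uses only numbers quoted in this thread (30k files, 200 to 400 files per minute, roughly 3 hours observed); the function names are invented for the example.

```python
def minutes_needed(file_count: int, files_per_minute: float) -> float:
    """Expected duration in minutes at a given throughput."""
    return file_count / files_per_minute


def observed_rate(file_count: int, minutes: float) -> float:
    """Throughput in files per minute implied by a measured duration."""
    return file_count / minutes


for rate in (200, 400):
    print(f"{rate} files/min -> {minutes_needed(30_000, rate):.0f} min")

# About 3 hours (180 minutes) for 30k files:
print(f"observed: {observed_rate(30_000, 180):.0f} files/min")
```

At 200 files per minute the rescan would take 150 minutes and at 400 files per minute 75 minutes, while the observed 3 hours corresponds to roughly 167 files per minute.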

Mario

200 files would be on a slow notebook, I suppose. On a fairly modern PC, e.g. my new laptop, it's more like 400 to 600 JPEGs per minute. Depending on how much metadata the files contain and their MPs.
If the laptop becomes too hot and the CPUs throttle, things will slow down. In that case, dialing down the number of parallel threads (Process Control (Advanced Setting)) can actually improve performance.

DigPeter

I have this problem (but live with it) if I have entered new metadata AND deleted existing metadata. The file(s) concerned then need more than one write-back.

Mario

Run the Metadata Analyst on some of the files and see if it reports any problems.

Which tags did you change?
When IMatch needs a second write-back, which tags are listed when you hover the mouse over the write-back pen in the File Window?

Pascal

I shouldn't have brought up two different subjects in the same thread, sorry...

1) "writing back metadata cannot be done all at once (or rarely)."
Quote from: Mario on December 19, 2023, 11:52:16 AM
What does that mean? Do you get error messages? Does IMatch stop writing back files?

No error messages, no warning. When the write-back process ends completely, there are often randomly remaining files with pending metadata.
I will investigate further in order to be more precise and generate a log file.

2) "I sometimes have difficulty selecting an item or menu command because the focus keeps jumping from one place to another"

Quote from: Mario on December 19, 2023, 11:52:16 AM
When does this happen? Where does the focus jump to and from what?
What is IMatch doing in the background? Face recognition? Trying to assign faces to persons?
What does the Dashboard or the Info & Activity Panel tell you in this case?

I don't use face recognition. The focus is unstable for example when I am in the Categories view and I try to make changes or move a keyword in the @Keywords tree. In one case I even faced a catastrophic result: I deleted the wrong keyword but (maybe I wasn't paying enough attention) I only discovered the mistake later, and thousands of files were affected...

I've seen in the meantime that other people have described the same phenomenon. In the same vein, it seems that sometimes user events are lost. For example, if I select a hundred thumbnails and click on the red pin in one of the selected thumbnails, sometimes the mouse click has no effect. I have to click a second time for all the selected files to be assigned to the "red pin" collection.

I'll try to be more specific later.

Mario

Quote: The focus is unstable for example when I am in the Categories view and I try to make changes or move a keyword in the @Keywords tree.
Use the Keywords Panel to add keywords to your files. It is far more efficient and offers a lot of functionality for adding keywords to files.

When you select a @Keywords category in the Category View and then move or copy that file to other keywords, you trigger a recalculation of the @Keywords hierarchy. This will also require the File Window to reload, maybe even several times, depending on your keyword structure and the active File Window. This is not a good workflow and will effectively work against you. Changing the data on which the currently displayed File Window is based should be avoided. If you have to, select all files you want to process and press <Ctrl>+<G>, <R> to open the selected files in an independent result window. Now the selection is no longer dependent on the data you change and things will work smoothly.

Quote: I've seen in the meantime that other people have described the same phenomenon.
In this community? Then links, please.

rienvanham

Hi Mario,

I think it was me: I wrote in an earlier topic that I have to save the metadata twice for (almost all) PDFs. Today I made 3 copies of the same file:
1: Original
2: After first "ALT-SHIFT-S"
3: After second  "ALT-SHIFT-S"

I took a look at all the metadata in ExifToolGUI and saved the metadata to TXT-files (for step 2 and 3).
I compared the output and saw an interesting thing:
After the first write, most timestamps have no time zone;
After the second write, they have one!



I can send the files to you if you want (but only private for privacy reasons).

Rien

rienvanham

[Attachment: comparison of the two outputs. Left = after first write, right = after second write.]

Mario

When you look at file 1 in the MD panel, the "create date" and "date subject created" should display a time zone if you click into them. These are the only time zones IMatch deals with. It does not care or modify time zones in other tags, including proprietary PDF metadata tags.

Which create date would that be? There are many tags with that name.

rienvanham

I have after the first save:
---- PDF ----
Create Date                    : 2023:12:22 15:55:00
---- XMP ----
Date/Time Original              : 2023:12:22 15:55:00
Date Created                    : 2023:12:22 15:55:00
Create Date                    : 2023:12:22 15:55:00

After the second save:
---- PDF ----
Create Date                    : 2023:12:22 15:55:00+01:00
---- XMP ----
Date/Time Original              : 2023:12:22 15:55:00+01:00
Date Created                    : 2023:12:22 15:55:00+01:00
Create Date                    : 2023:12:22 15:55:00+01:00
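A comparison like this can also be done without a visual diff tool. The Python sketch below is only an illustration: it assumes the two saved TXT files look like the listings above (group headers such as `---- PDF ----` followed by `Tag : value` lines), and the function names `parse_dump` and `changed_tags` are made up for the example.

```python
from typing import Dict, Tuple

TagKey = Tuple[str, str]  # (group, tag name), e.g. ("XMP", "Create Date")


def parse_dump(text: str) -> Dict[TagKey, str]:
    """Parse a saved metadata listing into {(group, tag): value}."""
    tags: Dict[TagKey, str] = {}
    group = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("----"):
            group = line.strip("- ")           # "---- PDF ----" -> "PDF"
        elif " : " in line:
            name, value = line.split(" : ", 1)  # split at the first separator only
            tags[(group, name.strip())] = value.strip()
    return tags


def changed_tags(before: str, after: str) -> Dict[TagKey, Tuple[str, str]]:
    """Return the tags present in both listings whose values differ."""
    a, b = parse_dump(before), parse_dump(after)
    return {key: (a[key], b[key]) for key in a if key in b and a[key] != b[key]}


first_save = """---- PDF ----
Create Date                    : 2023:12:22 15:55:00
---- XMP ----
Date/Time Original              : 2023:12:22 15:55:00
Date Created                    : 2023:12:22 15:55:00
Create Date                    : 2023:12:22 15:55:00
"""
# The second listing is the first one with the offset appended to every value.
second_save = first_save.replace("15:55:00", "15:55:00+01:00")

for (group, name), (old, new) in changed_tags(first_save, second_save).items():
    print(f"[{group}] {name}: {old!r} -> {new!r}")
```

For the two listings above this reports all four timestamps, each one gaining the +01:00 offset on the second save.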

Mario

I produce a PDF file and then set the create date to now, without time zone offset.
IMatch fills "create date" and "date subject created" from that time stamp and adds the local time zone +01:00:

2023:12:13 06:05:04+01:00
I write back and get these timestamps in the PDF file:
[XMP-xmp]      Create Date                    : 2023:12:13 06:05:04+01:00
[XMP-xmp]      Metadata Date                  : 2023:12:23 06:58:01+01:00
[XMP-xmp]      Modify Date                    : 2023:12:23 06:58:01+01:00
[PDF]          Create Date                    : 2023:12:13 06:05:04+01:00
[PDF]          Modify Date                    : 2023:12:23 06:58:01+01:00

All XMP timestamps have the correct time zone offset and ExifTool has copied them also into the native PDF timestamps.
This is probably something specific to the timestamps in your PDF files, the PDF files themselves, or your File.DateTime settings in Edit > Preferences > Metadata. Since you can "fix" it by writing back twice, I think it is OK.

rienvanham

Hi Mario,

Thanks for your investigation and suggestion.

And, as I said earlier: I can live with it!


My workflow is:
For email I'm using "eM Client" and write every email to a PDF-file (which is a function in eM Client).

The only thing I can imagine I'm doing wrong: before the first write, I modify "Create Date" and "Date Subject Created" with the creation time of the email.

e.g.:
An email received at 2023-12-22 22:11:33 with the subject "test" is written out as "2023-12-22 22.11 - Test.PDF".
In iMatch I made a metadata template which writes "Create Date" and "Date Subject Created" based on the first 16 characters of the filename. After that I add "summertime (+02:00)" or "wintertime (+01:00)" to the times.

It looks OK in the metadata browser, but perhaps I'm doing something wrong there.






Mario

Ah. Metadata Template.
Important bit of information. You should have mentioned that earlier.

Do you use the correct formatting when you set metadata tags in your template? That's important.
ExifTool can "interpret" non-standard formats, but the outcome may not be what you expect.
See: Use the Correct Format

rienvanham

Hi Mario,

Your documentation states:
YYYY:MM:DD hh:mm:ss{optional:time zone +/- hh:mm or Z}

The template (summertime) fills 
XMP::xmp\CreateDate
XMP::photoshop\DateCreated

with:
{File.Name|substr:0,4}:{File.Name|substr:5,2}:{File.Name|substr:8,2} {File.Name|substr:11,2}:{File.Name|substr:14,2}:00+02:00

sample filename: 
2023-12-22 22.11 - Test.PDF
{File.Name|substr:0,4} -> 2023
Colon
{File.Name|substr:5,2} -> 12
Colon
{File.Name|substr:8,2} -> 22
[SPACE]
{File.Name|substr:11,2} -> 22
Colon
{File.Name|substr:14,2} -> 11
Colon
[Fixed text] --> 00+02:00

Combined:
2023:12:22 22:11:00+02:00
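The same string assembly can be reproduced outside IMatch to double-check the result. The Python sketch below mirrors the substring positions of the template and tests the outcome against the documented pattern YYYY:MM:DD hh:mm:ss with an optional time zone. It is only an illustration: `timestamp_from_name` and `is_valid_timestamp` are invented names, and the check says nothing about how IMatch or ExifTool parse the value internally.

```python
import re
from datetime import datetime

# YYYY:MM:DD hh:mm:ss, optionally followed by Z or +/-hh:mm
TIMESTAMP_RE = re.compile(
    r"^\d{4}:\d{2}:\d{2} \d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:\d{2})?$"
)


def timestamp_from_name(file_name: str, utc_offset: str) -> str:
    """Build the timestamp the way the template does (substr:start,length)."""
    n = file_name
    return (
        f"{n[0:4]}:{n[5:7]}:{n[8:10]} "   # year, month, day
        f"{n[11:13]}:{n[14:16]}:00"       # hour, minute, fixed seconds
        f"{utc_offset}"                   # fixed text, e.g. +02:00
    )


def is_valid_timestamp(value: str) -> bool:
    """Check the pattern and that date and time are real calendar values."""
    if not TIMESTAMP_RE.match(value):
        return False
    try:
        datetime.strptime(value[:19], "%Y:%m:%d %H:%M:%S")
    except ValueError:
        return False
    return True


ts = timestamp_from_name("2023-12-22 22.11 - Test.PDF", "+02:00")
print(ts, is_valid_timestamp(ts))
```

For the sample file name this produces 2023:12:22 22:11:00+02:00, which matches the documented pattern.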

I will check this with a newly created PDF.

Thanks for your help!

Mario

Open the ExifTool Output Panel via the View menu before you write back.
You can then see which tags and values IMatch sends to ExifTool for writing and if the values are correct.
Also check in the Default layout of the Metadata Panel if the two XMP timestamps were set correctly.

rienvanham

I will do that, but it will take some time to collect enough mails to output to PDF.

Thanks so far!