Reading and (pending) metadata write-back recycle

Started by mastodon, March 21, 2016, 08:00:24 PM


mastodon

I have some 500 pictures in a small database to learn the features of IMatch. I imported them, IMatch read the metadata, and then the files showed pending metadata write-back. I let IMatch write back the metadata for all pending files. This took a few minutes, and afterwards IMatch read the metadata again. When that finished, the files were marked as pending write-back again. And so on... The list of tags to write is: IPTC::ApplicationRecord\Keywords XMP::dc\Subject - I have IPTC records made with IMatch 3 and other software (IrfanView).
What can I do to end this cycle?

Mario

Please give us more details to work with.
Your Edit > Preferences > Metadata and ...> Metadata 2 settings (unless you use the defaults).
Your thesaurus setup etc.

It seems your settings together with your file contents create 'new' XMP keywords on every import, which then causes the write-back and re-import...

This may happen under obscure conditions only so we need to look at the data in your files, the settings you use, the thesaurus contents etc.

You may also try to set things straight by deleting the IPTC keywords in some of your files for a test, via the ExifTool Command Processor in IMatch.

1. Select some of the files which cause the problem in a file window.
2. Open the ECP via <F9>,<E>
3. Enter these statements

-G1
-IPTC:Keywords
-XMP:Subject
-XMP:HierarchicalSubject
{Files}

and then copy the data shown on the right into Notepad, save to your disk and attach to your reply. This will show us the keywords in your files.

4. Enter these commands to delete the IPTC keywords:

-overwrite_original_in_place
-iptc:keywords=
{Files}

5. Close ECP and check if the next write-back to that file 'sticks'.



-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

mastodon

The problem is with the hierarchical keywords. I had this problem last year; you helped, and I managed to solve it. :)
Now: I have all the default settings. I have keywords containing ".". I have cleared all hierarchy separators under Edit > Preferences > Metadata (keyword import) and unchecked the option to write hierarchical keywords.
1. After 3 cycles of metadata reading and write-back, nothing has changed.
2. If I delete one | separator in the metadata panel, the problem is solved for that picture.

I have to remove the "|" separator from about 400 files. How can I do that? Is there a search and replace function? I am a newbie in IMatch 5.

Mario

I'm not sure that I understand what you need to do. It would help lots if you can just attach one of your files so we can see the embedded metadata. Or send me a file to my support email.

When you import a file into IMatch, IMatch checks for embedded IPTC, EXIF and XMP data and imports it. During that process, it maps existing 'flat' keywords contained in IPTC and/or XMP into hierarchical keywords.

If you have used non-standard keywords in IPTC (you mention "." so I guess you have keywords with an embedded hierarchy, e.g., "location.beach.hawaii"), IMatch can use that hierarchy when creating the real hierarchical keywords. You need to configure this under Edit > Preferences > Metadata: set the hierarchy separator to "."
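
To illustrate the idea, here is a minimal Python sketch of how a flat keyword with an embedded separator can be split into hierarchy levels. This is not IMatch's actual code; the function name `split_keyword` and the use of `|` to join the levels for display are assumptions made for this example.

```python
def split_keyword(flat: str, separator: str = ".") -> list[str]:
    """Split a flat keyword into hierarchy levels at the given separator.

    An empty separator means "no hierarchy": the keyword stays one level.
    """
    if not separator:
        return [flat.strip()]
    # Drop empty levels caused by leading, trailing or doubled separators.
    return [level.strip() for level in flat.split(separator) if level.strip()]


levels = split_keyword("location.beach.hawaii")
print("|".join(levels))  # location|beach|hawaii
```

With an empty separator the keyword is left untouched, which is why a keyword that contains a separator character by accident can end up being interpreted differently depending on the configured settings.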

After the keyword import is complete, IMatch has one set of hierarchical keywords for your files. During write-back, it replaces the existing IPTC keywords with these keywords. There should be no problem at all, unless your files contain uncommon data or you have configured IMatch in non-standard ways.

1. Supply a sample file
2. Show a screen shot of Edit > Preferences > Metadata

mastodon

Mario, I have sent you an email with those files.

Mario

Thanks. I usually get between 30 and 50 emails per day and it may take a couple of days before I can look into your email.
Did you include a link back to this thread in your email?  If I need to figure out the thread to which an email belongs, it takes longer to process it.

mastodon

Sorry, Mario, I have re-sent that email with a link to this topic.

Mario

Hi,

I looked at the sample file you sent and at the metadata it contains: EXIF, legacy IPTC, Canon maker notes, and XMP.
When IMatch imports the file, it marks it for write-back and shows "XMP subject" as the tag to write back (in the tooltip of the pen icon in the file window).
I thus looked at the keywords in the file. It has three sets of keywords, but they are not in sync:

[IPTC]          Keywords                        : Vértényi Richárd, Vértényi Jenõné
[XMP-dc]        Subject                         : Vértényi Richárd, Vértényi Jenõné
[XMP-lr]        Hierarchical Subject            : dr| Vértényi Jenõné, Vértényi Richárd


The legacy IPTC keywords seem to have a character set issue. Maybe the legacy IPTC data was written in a local code page (Hungarian) but not marked as such. This is sometimes the case when files are processed by multiple applications.
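
A small Python sketch of how this kind of damage typically arises. It rests on two guesses that the file itself does not confirm: that the keyword was written in the Central European code page Windows-1250 and later read as Windows-1252, and that the intended spelling is "Jenőné" with a double acute ő.

```python
intended = "Vértényi Jenőné"  # assumed original spelling

# Bytes as a Windows-1250 (Central European) application would store them.
raw = intended.encode("cp1250")

# The same bytes interpreted as Windows-1252 (Western European).
misread = raw.decode("cp1252")
print(misread)  # Vértényi Jenõné

# Reversing the wrong interpretation recovers the text.
repaired = misread.encode("cp1252").decode("cp1250")
print(repaired == intended)  # True
```

Only the ő changes, because é and í sit at the same byte values in both code pages. That would explain why most of the name looks fine while one letter is wrong.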

Now I let IMatch write-back the file by clicking the pen. IMatch writes back the new metadata generated during the import and tries to straighten everything out. Now the keywords look like this:

[IPTC]          Keywords                        : Vértényi Richárd, Vértényi Jenõné, Vértényi Richárd, dr| Vértényi Jenõné
[XMP-dc]        Subject                         : Vértényi Richárd, Vértényi Jenõné, Vértényi Richárd, dr| Vértényi Jenõné
[XMP-lr]        Hierarchical Subject            : dr| Vértényi Jenõné, Vértényi Richárd


The 'mess' has propagated into the flat XMP keywords.
IMatch still flags the file as needing a write-back. This time both XMP and legacy IPTC keywords are listed as needing a write-back.
I write back again by clicking the pen.

Now the file is no longer marked as needing a write-back. The keywords now look like this:

[IPTC]          Keywords                        : Vértényi Richárd, dr| Vértényi Jenõné
[XMP-dc]        Subject                         : Vértényi Richárd, dr| Vértényi Jenõné
[XMP-lr]        Hierarchical Subject            : dr| Vértényi Jenõné, Vértényi Richárd

The IPTC data is fixed, and the keywords are synchronized between all three metadata formats. IMatch has done its best to fix things.
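
To check a batch of files for the same condition, the ExifTool listing can be compared mechanically. The following Python sketch parses output in the `-G1` layout shown above and reports whether all keyword lines carry the same set of keywords. The function names are made up for this example, it compares the strings literally (as in the listings above, where "dr| ..." appears as one keyword), and it assumes no keyword contains a comma.

```python
def parse_keywords(exiftool_output: str) -> dict[str, set[str]]:
    """Map "[Group] Tag name" to the set of keywords listed on that line."""
    result = {}
    for line in exiftool_output.splitlines():
        if ":" not in line:
            continue
        label, _, value = line.partition(":")
        key = " ".join(label.split())  # collapse the column padding
        result[key] = {kw.strip() for kw in value.split(",") if kw.strip()}
    return result


def in_sync(exiftool_output: str) -> bool:
    """True if every keyword line carries the same set of keywords."""
    sets = list(parse_keywords(exiftool_output).values())
    return all(s == sets[0] for s in sets)


after_second_write_back = """\
[IPTC]          Keywords                        : Vértényi Richárd, dr| Vértényi Jenõné
[XMP-dc]        Subject                         : Vértényi Richárd, dr| Vértényi Jenõné
[XMP-lr]        Hierarchical Subject            : dr| Vértényi Jenõné, Vértényi Richárd
"""
print(in_sync(after_second_write_back))  # True
```

Order is ignored on purpose, since the hierarchical line lists the same keywords in a different order.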

The problem seems to come from the legacy IPTC data in the file, which is incomplete and apparently written in a non-standard character set. This is always a problem with legacy IPTC data - there is no way to specify the code page in which the data is written. IMatch has options in Edit > Preferences > Metadata to handle this, but it's complicated and generally not worth the trouble. IMatch 3 (and IMatch 5, of course) has always written legacy IPTC data in UTF-8 encoding and marked it as UTF-8 encoded, because this is the proper way to do it. This avoids all the problems with country-specific character sets. But IPTC data is history anyway.


I've made another test, using the Delete IPTC data preset in the ExifTool Command Processor in IMatch. This removes legacy IPTC data from a file.
Then I made a forced rescan of the file using <Shift>+<Ctrl>+<F5>. This removes all traces of legacy IPTC data from the image file and the IMatch database. Since the file seems to have a proper and complete XMP record, no data seems to be lost by this step.

IMatch marks the file as pending, because the hierarchical XMP keywords and flat XMP keywords are not synchronized yet. They are:

[XMP-dc]        Subject                         : Vértényi Richárd, Vértényi Jenõné
[XMP-lr]        Hierarchical Subject            : dr| Vértényi Jenõné, Vértényi Richárd


The XMP subject has the same keyword two times, but the hierarchical keywords show one keyword with a hierarchy dr|...
I now write back the file by clicking the pen. IMatch writes back the metadata, synchronizes the keywords and re-imports the file. Now the keywords are in sync:

[XMP-dc]        Subject                         : Vértényi Richárd, dr| Vértényi Jenõné
[XMP-lr]        Hierarchical Subject            : dr| Vértényi Jenõné, Vértényi Richárd


and the file is clean. No more legacy IPTC data, only proper EXIF and XMP metadata.

Results:

1. You should be able to solve the problem by writing back each file at least twice. But the old IPTC data in your files may still be in the wrong character set. Look into the options under Edit > Preferences > Metadata and (important!) the corresponding help topic to learn more about legacy IPTC data and character set troubles.

2. You can remove legacy IPTC data from your images as shown above. Try with some test images. Then look at the metadata remaining in the file and make sure everything you want is still there (in XMP, not in the old IPTC data).

Important: Force a rescan with <Shift>+<Ctrl>+<F5> > Force Update to make IMatch load the new IPTC-free metadata from the file again. Otherwise IMatch will still remember that the file had IPTC data and re-create it on write-back.

This way you rid your files of old IPTC data, keep only one set of metadata and have far fewer problems.
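
For a batch of test files, the removal step could also be scripted outside IMatch. This is a sketch only, assuming ExifTool is installed and on the PATH. `-IPTC:All=` deletes the whole legacy IPTC group, which is not necessarily identical to what the IMatch preset does, and `-overwrite_original_in_place` is the option already used earlier in this thread. The helper name and the folder name are hypothetical. A forced rescan in IMatch is still needed afterwards.

```python
import subprocess
from pathlib import Path


def iptc_delete_command(files: list[Path]) -> list[str]:
    """Build an ExifTool command line that deletes all legacy IPTC tags."""
    return [
        "exiftool",
        "-overwrite_original_in_place",  # no *_original backup copies
        "-IPTC:All=",                    # delete every tag in the IPTC group
        *[str(f) for f in files],
    ]


if __name__ == "__main__":
    test_files = sorted(Path("test-images").glob("*.jpg"))  # hypothetical folder
    if test_files:
        subprocess.run(iptc_delete_command(test_files), check=True)
```

Because no backup copies are kept with this option, run it on copies of a few images first and check that the XMP data is complete.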

mastodon

Thanks for your detailed answer, Mario. I understand. You have also answered my second question, about the character encoding of IPTC fields.  :D I want to keep the IPTC fields, and that causes many problems. I might change my mind.
I wonder whether it is possible to set the character set for IPTC fields in IMatch. That would possibly solve the problem in cases where accented characters are not shown correctly after the import. I would select the images, select keyword/subject or any other field, and set the IPTC character encoding. After that, IMatch would re-import and show the accented characters correctly.

Mario

1. As I wrote, look under Edit > Preferences > Metadata and read the corresponding help. Legacy IPTC data and non-UTF-8 character sets are a minefield, now and even more so in the future. IPTC was killed 10 years ago, and I wonder why users still want to use it for new files. The Metadata Working Group (MWG) is pretty clear: New files should not contain legacy IPTC data. This is also the default in IMatch 5.

If you really need to retain legacy IPTC data in your files: think twice. And if you are really sure, at least convert it to UTF-8. This at least solves the problem with your local code page. There is no way to record a local code page in IPTC, only UTF-8. See also what I wrote about this in the IMatch 5 help.
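
A sketch of such a conversion with ExifTool, driven from Python. It assumes that the existing IPTC data is in Windows-1250 (ExifTool's `Latin2` character set) and carries no CodedCharacterSet marker: `-charset iptc=...` tells ExifTool how to interpret the unmarked data, and `-codedcharacterset=utf8` marks the rewritten data as UTF-8. The function name is hypothetical, and the source code page is a guess, so check the ExifTool documentation and try this on copies first.

```python
import subprocess


def iptc_to_utf8_command(files, assumed_charset="Latin2"):
    """Build an ExifTool command that rewrites legacy IPTC data as UTF-8."""
    return [
        "exiftool",
        "-charset", f"iptc={assumed_charset}",  # how to read unmarked IPTC
        "-tagsfromfile", "@",                   # copy tags from the file itself
        "-iptc:all",                            # ... all IPTC tags
        "-codedcharacterset=utf8",              # mark the result as UTF-8
        *[str(f) for f in files],
    ]


def convert(files):
    """Run the conversion; ExifTool keeps *_original backups by default."""
    subprocess.run(iptc_to_utf8_command(files), check=True)
```

Without `-overwrite_original`, ExifTool leaves a backup copy next to each file, which makes it easy to compare the keywords before and after.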