Warning when conflict between embedded metadata and sidecar metadata

Started by KimAbel, June 24, 2014, 12:42:41 PM


KimAbel

Hello

I have mentioned this before in another post, but decided that this deserves a feature request.

This is related to the same problem others have when they are using the default Metadata 2 settings and at the same time have embedded metadata in raw files.

For instance:

https://www.photools.com/community/index.php?topic=2309.msg17242#msg17242

https://www.photools.com/community/index.php?topic=2240.msg14886#msg14886

I have also seen users in other posts having the same problem.

It would be very useful to have IM issuing a warning when a user is trying to write metadata in sidecar files (for raw files) and this metadata comes in conflict with embedded metadata. As it is now IM is dismissing the edits that the user is trying to write. The average user will not know why IM is dismissing the edits and will have to spend a lot of time finding the cause of the problem and a solution. This problem ends up in the user forum and you have to answer this many times.

The message could state that there is a conflict between the embedded and the sidecar metadata and point to a solution in the help file (for instance, delete the embedded metadata). Since not all files will have this problem, it would also be nice if IMatch could display the files with this problem so that the user could take the necessary steps on only the relevant files.

Kim Abel

ovrevid

+1

Following the discussions here on the forum I was able to figure things out, but I agree with Kim that this would be most welcome. Maybe even a "Find files with conflicting metadata" search-function.

Vidar

Mario

Having both embedded XMP and XMP in a sidecar for the same file is unusual and can be considered an error.

Adding such a check is possible but will complicate things even more for the already highly complicated write-back test.
And what to do if the write-back is in the background? Pop up a message for every file?
What to do if the write-back happens during the ingest phase?

I'm not sure how many users will ever be affected by multiple competing XMP records for the same file. If you use software or a camera which embeds XMP into the RAW file, configure IMatch to read/update the embedded XMP. Or configure IMatch to favor XMP data in the sidecar file. XMP data which may be embedded in the image file is then ignored.

Richard

Quote from: ovrevid on June 24, 2014, 02:39:16 PM
Maybe even a "Find files with conflicting metadata" search-function.

Hi Mario,

I have no idea how many IMatch users will have competing XMP in a file but I am sure that a majority will have "conflicting metadata" in old files. Thus I think that Vidar's suggestion is a good one if it is not that hard to achieve. For the most part this would be a utility that is used one time to find metadata messes, so they can be fixed, and for this reason I question it being a search-function but I would like the capability.

jch2103

Quote from: Richard on June 24, 2014, 05:25:05 PM
...
I have no idea how many IMatch users will have competing XMP in a file but I am sure that a majority will have "conflicting metadata" in old files. Thus I think that Vidar's suggestion is a good one if it is not that hard to achieve. For the most part this would be a utility that is used one time to find metadata messes, so they can be fixed, and for this reason I question it being a search-function but I would like the capability.

Having been in the situation of having problem metadata in old files, I support the suggestion. I suspect a lot of new users will be in the same boat, some with only a few issues, but some with a lot. I discovered a long time ago that you only discover how many problems are in your data when you have good database tools available to work with the data. And IMatch is a very good database with excellent tools.

Some of the cleanup could be done with a packaged ExifTool scan for errors; some of it might need to be done in a different, more elaborate way. I think the tough issue would be what to include in the scan.

A tool like this would identify files with various metadata problems and ideally also suggest how to fix them. It could significantly reduce future support demands and the number of potentially unhappy users.
John

Mario

I'm unfortunately in no position at this time to sit down and conceive, design and implement such a tool.

My work-plan is already set for the next couple of weeks. I'm totally occupied with providing support here and per email, rolling out IMatch, fixing bugs like hell. And I have still about 10 DUMP files to analyze, each of which can take days.

I think a script which runs ExifTool as needed to test if and what metadata a file contains would be the most dynamic solution. The script would have to run ExifTool, direct the output to a text, JSON or XML file, and then do whatever analysis is due. Maybe even offer repair options. This will not be fast, but the script needs to be run only once for a given set of files.
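As a rough illustration of the kind of script described above, here is a minimal Python sketch. It assumes ExifTool is installed and on the PATH. The function names (`run_exiftool`, `count_tags_per_group`) are made up for this example and are not part of IMatch or ExifTool. It only reports which metadata groups (EXIF, IPTC, XMP, ...) a file contains; any real analysis or repair would have to be built on top.

```python
import json
import subprocess
from collections import Counter


def run_exiftool(paths):
    """Run ExifTool on the given files and return one dict per file.

    Options used:
      -j  produce JSON output
      -G  prefix every tag name with its family-0 group (EXIF, IPTC, XMP, ...)
      -a  allow duplicate tags to be extracted
    """
    result = subprocess.run(
        ["exiftool", "-j", "-G", "-a", *paths],
        capture_output=True, text=True, check=False,
    )
    # ExifTool prints a JSON array with one object per file.
    return json.loads(result.stdout) if result.stdout.strip() else []


def count_tags_per_group(record):
    """Count how many tags each metadata group contributes to one file record."""
    counts = Counter()
    for key in record:
        group, sep, _name = key.partition(":")
        if sep:  # keys without a group prefix (e.g. "SourceFile") are skipped
            counts[group] += 1
    return dict(counts)
```

Calling `count_tags_per_group(rec)` for each `rec` in `run_exiftool(["image.cr2"])` would show at a glance whether a RAW file carries IPTC or XMP records in addition to its EXIF data.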

The potential things which can be wrong in metadata are endless: from IPTC data in the wrong character set to duplicate XMP records, damaged EXIF data, an unwanted rating=0 written by a camera, and invalid proprietary namespaces in XMP. That, in combination with the options a user chooses in IMatch to process metadata, creates a wide range of potential issues.

IMatch combines metadata from embedded XMP and sidecar files (unless you disable this) at the lowest level, right in ExifTool. IMatch only gets to see the outcome of the merge operation and all the other operations like running the args files. What IMatch loads in the end is an XML file which results from running all the import operations on the image file and the potential sidecar file. This is too late to perform any sort of "tests".

The reason for the "favor XMP sidecar" option available in IMatch, for example, is exactly to overcome problems caused by disagreeing XMP records embedded in the file and the sidecar file. A user can decide if he wants to merge, or favor the sidecar file or the embedded XMP. One of the many work-around options in IMatch.

Maybe somebody starts a thread in the scripting forum, collecting problems encountered so far, how to detect them, how to fix them. Or if they cannot be fixed, offering some advice. We may then be able to forge that into a script.

The ExifTool web site / FAQ / Forum has tons of threads on various metadata problems and how to repair them via cunning ExifTool magic...


Ger

I support the request. I have been struggling with legacy data in various metadata fields when starting to work with IM5.

Somewhere in the help file it states:
Quote:
The Metadata Mess
and that's very true.

Mario writes that there are many possibilities for reporting a conflict. True, and the more detailed the better, but looking back to my various clean-up actions, a first step could be a simple check on the existence of legacy metadata (e.g. https://www.photools.com/community/index.php?topic=2025.msg12832#msg12832) in a non-supported position, compared to the chosen IMatch settings.

Ger

ubacher

I too was struggling with the cleaning up of my metadata.

My suggestion would be a utility which fixes all files which are not MWG compliant, i.e. makes them the way they should be if the user leaves the default settings in Metadata 2.

I think that this would eliminate a lot of help requests.

My simplified, picture-oriented approach would be:
Assuming IMatch has read in all the metadata:
For each file (RAW files mostly?)
    Issue an ExifTool "List All Metadata" command
    Analyze the output
    For each entry which should not be there (according to the Metadata 2 default settings, i.e. MWG compliance), issue a delete request.
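The last step of the loop above could be sketched in Python like this. The list of unwanted tags is purely a placeholder: which entries actually "should not be there" under the Metadata 2 defaults is not settled in this thread, so it has to be decided by the user. The function names are invented for this example. The sketch only builds the ExifTool command (ExifTool deletes a tag when it is given `-TAG=` with an empty value) and does not run it, so it can be reviewed as a dry run first.

```python
# Placeholder list, an assumption for illustration only.
UNWANTED_TAGS = ["EXIF:ImageDescription", "EXIF:Copyright", "IPTC:Keywords"]


def build_delete_args(record, unwanted_tags=UNWANTED_TAGS):
    """Return one '-GROUP:Tag=' argument for every unwanted tag present in the record.

    `record` is one file's metadata as a dict keyed by 'GROUP:Tag' names,
    for example as parsed from `exiftool -j -G` output.
    """
    return [f"-{tag}=" for tag in unwanted_tags if tag in record]


def build_delete_command(record, unwanted_tags=UNWANTED_TAGS):
    """Return the full ExifTool command line as a list, or None if nothing to delete."""
    args = build_delete_args(record, unwanted_tags)
    if not args:
        return None
    return ["exiftool", *args, record["SourceFile"]]
```

Printing the result of `build_delete_command(record)` for each file gives a list of commands that can be checked by eye before anything is written to the images.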




Erik

I think scripting is the best way to go about this.  But there are a lot of possibilities.

First, I would imagine a user looking to identify files with XMP embedded and in sidecar files.  There won't be a conflict if there is only one or the other.  If a file has both types of XMP records (embedded and sidecar), I would think it would work best to just throw them into a collection (via scripting).  That could be relatively simple.  The ECP could then be used for the rest.

Then, I think the user should really just look at what XMP data is best for the user and probably utilize the ECP to remove the XMP from the image that is not "correct" or is a duplicate.  The ECP can also easily be used to copy XMP data from one spot to another or even remove it.
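Finding the files that have both kinds of XMP record could look roughly like the following Python sketch. It assumes that the sidecar sits next to the image with the same base name and an `.xmp` extension, and that `record` is the metadata of the image file alone (for example from `exiftool -j -G image.cr2`), with tag names prefixed by their group. All function names are hypothetical. Putting the matches into an IMatch collection is left out, since that depends on the IMatch scripting interface.

```python
from pathlib import Path


def sidecar_path(image_path):
    """Return the expected sidecar location: same folder and base name, .xmp extension."""
    return Path(image_path).with_suffix(".xmp")


def has_embedded_xmp(record):
    """True if the image's own metadata contains at least one XMP tag."""
    return any(key.startswith("XMP:") for key in record)


def has_competing_xmp(record, is_file=Path.is_file):
    """True if the image has embedded XMP *and* a sidecar file exists next to it.

    `is_file` can be swapped out so the check can be tried without real files.
    """
    sidecar = sidecar_path(record["SourceFile"])
    return has_embedded_xmp(record) and is_file(sidecar)
```

Files for which `has_competing_xmp` returns True are the only ones where embedded and sidecar XMP can disagree. The rest can be ignored for this particular problem.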

KimAbel

Just bringing this topic back for a quick question.

Solving conflicting metadata is the most time-consuming job for me at the moment. I obviously have many files with this problem, and I have found conflicting metadata both in my cr2 files and in the xmp sidecar files. An easy way to solve this in IMatch would be much appreciated.

In my case I have a set of metadata that I work with all the time (Description, City, State/Province, Country, ISO Country code, Credit, Copyright, IPTC core Creator fields and keywords).

Could one possibility be to have IMatch bring up a choice to delete all the content in the conflicting metadata fields when there is conflicting metadata (just in the fields that I am trying to edit)? As it is now, I have to manually find which fields have conflicting metadata in each file and then delete these fields. This is very time-consuming, and many times I don't see that IMatch reverts to the "old" metadata instead of the new metadata I am trying to write. It would be so much more convenient to have IMatch do this as a clickable choice after conflicting metadata has been found. I have deleted iptc and xmp from many of my cr2 files, but there are other metadata fields too which cause conflicting metadata (both in cr2 and xmp sidecar), so I don't see deleting some fields in advance as a solution.

If this is a possibility I can create a new post with this as a feature request, but for now I just thought this was an elaboration on this old topic.

Kim Abel

Mario

Conflicting metadata is always an exceptional problem.

It's not clear from your post where the conflicting metadata comes from. Do you mean conflicts because the legacy IPTC or EXIF data embedded in the CR2 has different values than the corresponding XMP tags in the sidecar file? Or do you have both an embedded XMP record in the RAW and an XMP sidecar file with values?

The easiest solution is usually to strip the legacy IPTC data from the image and use only XMP. There is an ECP preset for that.

KimAbel

Stripping xmp and iptc from my raw files is easy, but there are also several other tags that need to be stripped. The most common in the cr2 are exif:Copyright, exif:ImageDescription, exif:Label and exif:Rating.

Then there's the xmp sidecar, which sometimes also has conflicting metadata. I don't remember all of those, but keywords and Description are the most common causes (xmp dc:Subject, xmp tiff:Description, xmp lr:HierarchicalSubject and some more). The point is that it is so difficult to have a clear overview of which fields are causing this. So that's why it would be extremely nice to have IMatch solve these cases  :) For instance, an option to delete the tags with the conflicting metadata at the moment of writeback.

I know I can let IMatch write to the cr2 file, but I don't want IMatch to do that (mostly because I don't want to back up the raw each time I do an edit). I don't have any problem with deleting conflicting tags in the raw, since this eliminates future conflicts and only needs to be done once.

My jpgs are a lot more difficult to handle, but I have learned that it is easier to edit the metadata in my raws and then to read metadata back again in Lightroom and then export new jpgs.

It's only IMatch and Lightroom that touch my files.

Kim

Mario

I still don't understand how there can be conflicting metadata...
IMatch implements the mapping rules standardized by the MWG, so it will map certain IPTC and EXIF fields into XMP, and does the reverse when XMP is updated. There can be no conflict unless you deliberately let IPTC/EXIF and XMP get out of synch.

KimAbel

Part of the problem is an earlier setting on my database (by some mistake) where I had IMatch write to the cr2 file. This setting is now disabled. So each time IMatch finds conflicts in the cr2 I get into this trouble. My solution has been to delete iptc and xmp from the cr2, but there are still some fields that need to be deleted (see my last reply). I am currently deleting all the fields in the cr2 that I know cause conflicts, but this is time-consuming and hopefully only needed once.

Why the xmp sidecar has conflicting metadata I don't know! As I said, it's only IMatch and Lightroom that touch my files, and IMatch is the only one where I edit metadata, except labels and ratings. I can send you a sample file the next time I find a file with this problem.

My Metadata 2 settings and the metadata tags that I edit (plus keywords) are shown in the attachment.

Kim

[attachment deleted by admin]

Mario

I don't think that discussing your specific metadata problems belongs into a feature request.
And when you don't let IMatch write to CR2 files which contain IPTC/EXIF data, you deliberately prevent IMatch from synchronizing metadata, which creates conflicting data... but we have discussed that in numerous threads, there is a FAQ on this, it's in the help etc.

KimAbel

I posted it here because it has to do with a wish for a solution for conflicting metadata. My example in this last reply only shows what problems this causes and perhaps why.

The conflicting metadata in my cr2 is only one part of this, and it sure would be nice to have a warning about found conflicts and a way to fix them in the same step. As it is now, I have to carefully look into my files at a later time (a minute or two) because IMatch, in cases with conflicting metadata, deletes my edits and replaces them with the "old" metadata in the conflicting tags. This does not happen immediately, so it often goes unnoticed, which is not good. This surely is not something only I have trouble with.

I still have conflicting metadata in the xmp sidecars, and as far as I know I don't have any settings that prevent me from synchronizing this. That's also why a "simple" solution like deleting iptc and xmp from my cr2 doesn't help much.

I will not trouble you more with this if you are of that opinion that this is a minor trouble for most of the users, but if so I must disagree.

Kim Abel

Mario

For IMatch there are no conflicts.

On ingest, unless you change the options, it imports existing IPTC and EXIF data into XMP. On write-back, it re-generates mapped IPTC/EXIF tags from the XMP that is written. If you don't allow that, IMatch can only show a message like

"Since you don't allow IMatch to update the existing EXIF/IPTC data, the XMP data will be, entirely, or partially in conflict and on re-import newer XMP data will be replaced by older IPTC/EXIF data".

I think there is a message of that sort already, and it's made very clear in the help.

I don't think that me spending time implementing a feature which compares every XMP tag with every EXIF/IPTC/GPS/PDF tag etc. that may have different content is time well spent. First of all, ExifTool implements a lot of logic in the mapping division that I have no knowledge about. Second, the MWG specifications are full of inconsistencies and MUST, SHOULD and CAN conditions. And of course all this is only a problem if the user prevents IMatch from doing its job by disabling the update of existing IPTC/EXIF data, which produces mismatches between XMP and the other data. Only a few users do that, and they will probably regret it at some point in the future.

If you really need such a thing, it's easy to write a small script which compares the IPTC/XMP data in the database before the write back. This will tell you which IPTC tags should be updated, and which will cause problems if you don't let IMatch do this. You don't even need to write a script for this, you can just put the corresponding variables into a metadata panel or a custom HTML template...
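A comparison of this kind could be sketched as follows. Note the assumptions: this is plain Python working on a dict of `GROUP:Tag` values (as one might get from ExifTool's JSON output, with the legacy tags of the image and the XMP tags it should agree with combined into one record), not an IMatch script working on the database. The pair list covers only a handful of the MWG-mapped fields, and the names `MWG_PAIRS`, `normalize` and `find_mismatches` are invented for this example.

```python
# A few legacy/XMP tag pairs that the MWG guidelines map onto each other.
# Deliberately incomplete; extend it for the fields you actually edit.
MWG_PAIRS = [
    ("EXIF:ImageDescription", "XMP:Description"),
    ("IPTC:Caption-Abstract", "XMP:Description"),
    ("EXIF:Copyright", "XMP:Rights"),
    ("IPTC:CopyrightNotice", "XMP:Rights"),
    ("EXIF:Artist", "XMP:Creator"),
    ("IPTC:By-line", "XMP:Creator"),
    ("IPTC:Keywords", "XMP:Subject"),
    ("IPTC:City", "XMP:City"),
]


def normalize(value):
    """Turn a single value or a list of values into a tuple of trimmed strings."""
    items = value if isinstance(value, list) else [value]
    return tuple(str(item).strip() for item in items)


def find_mismatches(record, pairs=MWG_PAIRS):
    """Return the (legacy_tag, xmp_tag) pairs that are both present but differ."""
    mismatches = []
    for legacy, xmp in pairs:
        if legacy in record and xmp in record:
            if normalize(record[legacy]) != normalize(record[xmp]):
                mismatches.append((legacy, xmp))
    return mismatches
```

The comparison is naive on purpose. It may report false positives where the formats store multiple values differently (for example several creators in a single EXIF string versus a list in XMP), so the output is best treated as a list of candidates to look at, not as a verdict.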

KimAbel

Thanks Mario for pointing to a solution  :)

Quote:
If you really need such a thing, it's easy to write a small script which compares the IPTC/XMP data in the database before the write back. This will tell you which IPTC tags should be updated, and which will cause problems if you don't let IMatch do this. You don't even need to write a script for this, you can just put the corresponding variables into a metadata panel or a custom HTML template...

I am very happy with a solution where I can let IMatch identify which files need special care (by determining which files have conflicting metadata, and which tags need to be deleted).

Then the only problem is that I don't have any clue about scripting, so I can only hope that someone else can make such a script.

I did not understand how a metadata panel could do this. As I understand it, I would then have to know in advance which tags can potentially have conflicting metadata, and then compare them manually before writeback. This is also very time-consuming.

Quote:
it's easy to write a small script which compares the IPTC/XMP data in the database before the write back.

The cr2 files are not so much the problem, since deleting iptc and xmp from these files is easy (together with some exif fields that I know about, and I have already done that for many of my files). The most challenging job is conflicting metadata in the xmp sidecar. Is there any way to configure Metadata 2 so that editing embedded iptc/xmp/exif is not allowed for cr2, but is allowed for the xmp sidecar? If so, I think the problem should be solved  :)

Sorry for asking so many questions on this, but it is really important for me to find a good solution. This part is the one step that steals the most time from me in my daily workflow!

Kim

Mario

Doing what you ask to do would require IMatch to re-implement all the logic that's intrinsic in ExifTool plus the entire mapping section of the MWG standard document. And this only for the specific case where a user deliberately lets his metadata go out of synch. I doubt that I will find time for that.

KimAbel

Ok. That I understand can be difficult.

I guess that switching back to the default for cr2 files (allow "write IPTC", "write EXIF" and not "allow create IPTC/EXIF/GPS") after I have deleted the xmp/iptc and the troublesome exif data from my cr2 doesn't solve the problem either, because if IMatch finds an iptc or exif tag in the xmp sidecar that does not exist in the cr2, IM will encounter conflicts (because it has to create this tag in the cr2)?


So the only options to deal with this (since I want to keep my cr2 clean) would be:

1 - Hope that someone can write a script that checks for conflicts

2 - Disable mwg compliance (not good)

3 - Keep going as I have been doing, and delete every tag that I find with conflicts

4 - Allow to write to cr2 (not desired)


Thanks for your patience Mario  :)


Kim Abel

Mario

This is usually all pretty easy...

When IMatch imports a file, it imports existing IPTC/EXIF (from your CR2) into the XMP record. It also merges an existing XMP sidecar file into the mix. That's what goes into the database.

When you change XMP metadata in IMatch that needs to be mapped to IPTC/EXIF, IMatch writes the changes to the XMP sidecar, and then updates the IPTC/EXIF record in the CR2. No conflicts, data properly synchronized, no problem.

The important bit: If your files contain IPTC/EXIF data, let IMatch update it. Else you will have to live with mismatches, conflicting metadata etc. To avoid exactly that I have devoted months of my life to get all this mapping logic working. Just let IMatch do it.