Differentiate files with unwritten metadata by reason.

Started by Mike, December 31, 2020, 02:56:22 PM


Mike

There are probably at least three reasons why the metadata has not yet been written:

a) The procedure just didn't take place.
b) It did not work because the originals are offline.
c) Or, and that interests me particularly, it didn't work because e.g. the files are damaged.

Such damaged files require special treatment and it would be good to be able to identify them clearly. A dedicated collection for this purpose would be helpful. It could have a broken pen as an icon ;)

As soon as iMatch fails to edit metadata due to file damage (or something like that), such files could be put into the "failure" category. This would make the metadata situation more transparent and our work easier.

Thank you!

PS. Since I have only recently started using iMatch, I apologize if there is a similar feature that I have overlooked.

Mario

Problems writing metadata (damage, in your terms) are extremely rare.
In these rare cases, IMatch flags the files with a yellow warning icon in the File Window and the tooltip of this warning icon contains the error message.

If you really have more than a handful of files with problems of this kind, you can find them easily enough with a value filter in the Filter panel, filtering for files with data in the special Extra:Error tag.



Then apply this filter to the scope of your choice. Individual folders. Years. The entire database.

If you need this more than once, you could also create a data-driven category based on this tag. This creates a category which contains all files with problems.



Furthermore, IMatch 2021 will include a new major feature set which deals with this, among many other things.

Mike

Thanks for the useful tips! I will test them extensively.

The ability to combine functions in iMatch is great. I obviously still have a lot to learn ;-)

The term "damaged" that I used is of course almost symbolic and very general, because in reality there are many reasons why some files are problematic.

I have not yet experienced this with my own photos or design data, but I also work on extensive scientific research projects in which I analyze images from various sources. On average I come across about 200 "faulty" photos per 100,000 examples. This number comes up regularly, so it is very helpful if I establish an efficient recognition procedure. That makes the dataset reliable and saves me trouble later.

Best Regards

Mario

Very good.
IMatch is used in quite a number of 'research contexts', helping people to keep track of things.

If you often encounter problem files, the data-driven category is probably the route to go. It allows you to see all files with problems, organized by problem type (as returned by ExifTool) with a single click. Very handy.

jch2103

Quote from: Mario on January 02, 2021, 04:04:09 PM
If you often encounter problem files, the data-driven category is probably the route to go. It allows you to see all files with problems, organized by problem type (as returned by ExifTool) with a single click. Very handy.

Doing this was very helpful when I was working to clean up problems with a lot of old images that had been handled by a variety of old software programs.
John

Mario

In addition, the Metadata Analyst can help with checking files for metadata-related issues.

Mike

I'm still experimenting with the proposed solutions. I use e.g. a data-driven category with the tag "Extra \ Error \ Error \ 0" which helped me to discover a number of problem files (these also receive a warning sign).

However, other problem files remain undetected and therefore don't receive a warning sign. I found many problem files of that type, where I e.g. cannot remove the label or rating. More precisely: I can remove it in the user interface, but as soon as I press the write button, the label or rating comes back.

Such a problematic file that I got from the Internet produces e.g. the following message:

"Metadata Analyst Results. Version 2020.12.6. 1/3/2021 3:36:11 PM
File analyzed: E:\BILDERWELT\0 Problem File Samples\Problem Beispiele\Jansson, Mikael.jpg
Errors: 0
Warnings: 15

Warning: [Metadata] Warnings: 'Non-standard header for APP1 XMP segment'
Warning: [XMP] [ExifIFD]:UserComment not mapped to [XMP-dc]:Description (embedded).
Warning: [Detailed Validation] Non-standard header for APP1 XMP segment
Warning: [Detailed Validation] ExifIFD tag 0x9010 OffsetTime requires ExifVersion 0231 or higher
Warning: [Detailed Validation] ExifIFD tag 0x9011 OffsetTimeOriginal requires ExifVersion 0231 or higher
Warning: [Detailed Validation] Missing required JPEG ExifIFD tag 0x9101 ComponentsConfiguration
Warning: [Detailed Validation] Missing required JPEG ExifIFD tag 0xa000 FlashpixVersion
Warning: [Detailed Validation] [minor] IFD0 tag 0x0100 ImageWidth is not allowed in JPEG
Warning: [Detailed Validation] [minor] IFD0 tag 0x0101 ImageHeight is not allowed in JPEG
Warning: [Detailed Validation] [minor] IFD0 tag 0x0102 BitsPerSample is not allowed in JPEG
Warning: [Detailed Validation] [minor] IFD0 tag 0x0103 Compression is not allowed in JPEG
Warning: [Detailed Validation] [minor] IFD0 tag 0x0106 PhotometricInterpretation is not allowed in JPEG
Warning: [Detailed Validation] [minor] IFD0 tag 0x0115 SamplesPerPixel is not allowed in JPEG
Warning: [Detailed Validation] [minor] IFD0 tag 0x011c PlanarConfiguration is not allowed in JPEG
Warning: [Detailed Validation] [minor] Missing required JPEG IFD0 tag 0x0213 YCbCrPositioning"


There are several files with the same behavior (label and rating cannot be removed) but they have different warnings in the Analyst.

Is there a way to edit the data-driven category to also identify a file like today's example? I myself do not know exactly which attributes are responsible for the misbehavior of the file, but maybe it would be a step forward to even be able to list files that contain warnings.
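
As a rough illustration of listing files by their warnings outside of IMatch, here is a minimal Python sketch that tallies the entries of a saved Metadata Analyst report. The line layout (`Warning: [Section] [minor] message`) is inferred only from the single report quoted above, and the function name `summarize_report` is hypothetical; this is not an IMatch feature.

```python
import re
from collections import Counter

# Matches lines such as:
#   Warning: [Detailed Validation] [minor] IFD0 tag 0x0100 ImageWidth is not allowed in JPEG
# The layout is an assumption based on the one report quoted in this thread.
LINE_RE = re.compile(r"^(Warning|Error): \[([^\]]+)\] (\[minor\] )?(.*)$")


def summarize_report(report_text):
    """Count report entries by (level, section, is_minor)."""
    counts = Counter()
    for raw in report_text.splitlines():
        # Drop surrounding whitespace and the quote marks wrapping a pasted report.
        match = LINE_RE.match(raw.strip().strip('"'))
        if match:
            level, section, minor, _message = match.groups()
            counts[(level, section, bool(minor))] += 1
    return counts
```

Applied to the report above, this would count one [Metadata] warning, one [XMP] warning, five regular and eight minor [Detailed Validation] warnings, which adds up to the 15 warnings stated in the report header.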

Another interesting option would be to "register" all the problematic files we encounter over time. Ideally iMatch could learn somehow from them, and later be able to automatically identify such or similar problems on its own.

If we discover them all, then we can proceed to repair, replace, overwrite or whatever needed.

Mario

Warnings in image files are pretty common. IMatch does not keep track of them and they usually don't prevent ExifTool from reading or writing metadata (else it would return an error and that would be recorded).
Rating and label are stored as part of the XMP record (embedded or sidecar). Do you perhaps have both an XMP sidecar file and an embedded XMP record for the files where you see this problem? That would explain the behavior.

Mike

None of the affected files has a sidecar file. As far as I can remember the history of these files, I assume that they must have been problematic a long time ago when they were downloaded. Maybe they were "incomplete" if you are right about the sidecars.

In the future I will make sure that only healthy data reach the database. Even if this type of problem cannot be automatically detected, I can set up a test procedure myself before the files are allowed to be added to the database.
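
Such a test procedure could be sketched roughly as follows, assuming ExifTool is installed and on the PATH. The sketch runs ExifTool's `-validate` option on a file and sorts the reported lines into errors, warnings and minor warnings. The function names and the "clean" policy (tolerate minor warnings, reject everything else) are assumptions for illustration only. As noted earlier in this thread, warnings usually do not prevent reading or writing metadata, so a check like this may both over-report and miss files such as the ones described above.

```python
import subprocess


def parse_exiftool_findings(output):
    """Split 'Tag : value' lines into errors, warnings and minor warnings."""
    findings = {"error": [], "warning": [], "minor": []}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        name, value = name.strip(), value.strip()
        if name == "Error":
            findings["error"].append(value)
        elif name == "Warning":
            key = "minor" if value.startswith("[minor]") else "warning"
            findings[key].append(value)
    return findings


def check_file(path, exiftool="exiftool"):
    """Run ExifTool's validation on one file (requires ExifTool on PATH)."""
    result = subprocess.run(
        [exiftool, "-validate", "-warning", "-error", "-a", path],
        capture_output=True, text=True, check=False,
    )
    return parse_exiftool_findings(result.stdout)


def is_clean(findings):
    """Assumed policy: minor warnings are tolerated, anything else is not."""
    return not findings["error"] and not findings["warning"]
```

A call such as `is_clean(check_file(some_path))` could then gate each file before it is added to the database; which findings should actually block a file is a judgment call for the dataset at hand.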

Fortunately, in the meantime I have managed to tame all 500 troublemakers in one way or another.

By the way, iMatch is a lot of fun to work with!

Mario

Quote from: Mike on January 03, 2021, 09:58:52 PM
(...)
By the way, iMatch is a lot of fun to work with!

Happy to hear that. I guess that many IMatch users think the same  ;D
Since I don't have the marketing budget of Adobe (200 million+ per year), tell your friends about IMatch. Maybe they are in need of a useful DAM as well...