Metadata template and strange characters

Started by jmsantos, July 05, 2021, 07:23:02 PM

jmsantos

I have prepared a template with author data in IMatch. I write this data to TIF, JPG and RAW files (NEF and RAF). Everything is OK. However I have seen that there are differences in the presentation of those data in the file properties tab in Windows (attached screenshot). In the TIF and JPG images there is no problem, but in the RAW image some characters are not correct, such as vowels with accents and the Copyright © symbol.

Why does this difference occur if the template is the same in the three images?

Translated with www.DeepL.com/Translator (free version)

Mario

Which tags do you write?
Make sure to write XMP data only.
This looks like a character set issue - do these files contain legacy IPTC metadata perhaps?
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

jmsantos

I write XMP data only.
I had not previously written any metadata in the RAW image, only the ones that come from the camera.

Mario

For RAW files, IMatch writes XMP into the XMP sidecar file.
You did not answer my question about legacy IPTC, the most likely cause for this.

Run the Metadata Analyst on the RAW file to check it for problems.
The green button at the top allows you to copy/paste the problems found.

With issues like this, uploading the files in question to your cloud space and posting a link is very helpful.
All else is just guesswork.

jmsantos

Quote from: Mario on July 05, 2021, 07:45:10 PM
For RAW files, IMatch writes XMP into the XMP sidecar file.
You did not answer my question about legacy IPTC, the most likely cause for this.

Run the Metadata Analyst on the RAW file to check it for problems.
The green button at the top allows you to copy/paste the problems found.

With issues like this, uploading the files in question to your cloud space and posting a link is very helpful.
All else is just guesswork.

Sorry, no IPTC Legacy metadata.

Link to NEF image with XMP sidecar, both in ZIP file:
https://1drv.ms/u/s!AqOdfCDSkB4YsUYz7qSIWRUZDPJL?e=g0BWPO

And Metadata Analyst result.

Mario

Warning: [XMP] Embedded XMP record (NIKON D750 Ver.1.10     ) and XMP sidecar file (photools.com IMatch 20.14.0.2 (Windows)) found.
Warning: [XMP] [ExifIFD]:DateTimeOriginal not mapped to [XMP-exif]:DateTimeOriginal (embedded).
Warning: [XMP] [ExifIFD]:DateTimeOriginal not mapped to [XMP-photoshop]:DateCreated (embedded).


Your NEF file has an embedded XMP record and an XMP record in the sidecar file.
As per standard, IMatch updates only the sidecar file on write-back.

Remove the embedded XMP record from the NEF file.
Run the ExifTool Command Processor with the "Remove XMP metadata" command.
Then retry.
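
For readers who prefer to do this step outside IMatch, here is a minimal Python sketch that builds and runs a plain ExifTool call. It assumes ExifTool is installed and on the PATH, and it uses ExifTool's `-xmp:all=` idiom for deleting all XMP from a file. Whether this matches what the IMatch command does internally is an assumption; the helper names are hypothetical.

```python
import shutil
import subprocess


def build_remove_xmp_command(path, keep_backup=True):
    """Return the ExifTool argument list that deletes all XMP embedded in `path`.

    By default ExifTool keeps a copy of the input as `<name>_original`;
    pass keep_backup=False to overwrite the file in place.
    """
    args = ["exiftool", "-xmp:all="]
    if not keep_backup:
        args.append("-overwrite_original")
    args.append(path)
    return args


def remove_embedded_xmp(path):
    """Run ExifTool on `path`; raises if ExifTool is missing or reports an error."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool not found on PATH")
    return subprocess.run(
        build_remove_xmp_command(path),
        capture_output=True,
        text=True,
        check=True,
    )
```

The command only touches the file it is given, so an XMP sidecar next to the NEF is left alone. Try it on a copy of the RAW file first.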

sinus

Quote from: Mario on July 06, 2021, 08:35:16 AM
Warning: [XMP] Embedded XMP record (NIKON D750 Ver.1.10     ) and XMP sidecar file (photools.com IMatch 20.14.0.2 (Windows)) found.
Warning: [XMP] [ExifIFD]:DateTimeOriginal not mapped to [XMP-exif]:DateTimeOriginal (embedded).
Warning: [XMP] [ExifIFD]:DateTimeOriginal not mapped to [XMP-photoshop]:DateCreated (embedded).


Your NEF file has an embedded XMP record and an XMP record in the sidecar file.
As per standard, IMatch updates only the sidecar file on write-back.

Remove the embedded XMP record from the NEF file.
Run ExifTool Command Processor with the "Remove XMP metadata".
Then retry.

I wonder.
For ALL my files taken with the Nikon D750, as in this case, the Metadata Analyst displays the same warnings as above.
I have not cared until now. But now I wonder.

I get this message for thousands of files, and these files have only been imported into IMatch: RAW files (NEF) straight out of the camera.
No other program for sure, no Photoshop, nothing, only IMatch.

And only IMatch has created the XMP files.

Hence there must be something wrong with my preferences.
I do not want to hijack this thread; I only want to say that I have exactly the same errors.

This whole metadata business is really a big mess.
Best wishes from Switzerland! :-)
Markus

jmsantos

I do the same as Markus:

Quote from: sinus on July 06, 2021, 09:20:45 AM
...these files have only been imported into IMatch: RAW files (NEF) straight out of the camera.
No other program for sure, no Photoshop, nothing, only IMatch.

And only IMatch has created the XMP files.

The data embedded in the file must have been written by the camera. But none of that data is related to the Author or Copyright fields, which are the ones with the problem I have raised. Would removing the embedded XMP data solve the problem of strange characters in Author and Copyright in Windows properties?

On the other hand, it seems clear that Nikon embeds XMP data in the RAW images in violation of the standard. Ok, very badly done, Nikon. But, to make everything work right I wonder: do I have to worry about deleting that embedded data in a RAW file? What would be the workflow for that? You would have to check if the RAW files have XMP embedded in them, delete that data and write again to make everything correct. Is that right?


Mario

Some Nikon cameras write an XMP record, usually consisting of only one entry: rating=none
This is what causes the warning in the Metadata Analyst (MDA): the XMP record is not standard-compliant because it does not contain the EXIF data in the XMP namespace for EXIF.

IMatch imports XMP from RAW files, and merges it with XMP data from the sidecar file (if any).
Since the rating=0 problem is common, IMatch ignores it by default.
This is unrelated to any character set issue.

What happens if you don't use a MD template to set this info but enter it in the MD panel and write back?
Please understand that my "metadata mess" queue is full and it can take a week or two before I can spend time analyzing your RAW file and use case.

jmsantos

Quote from: Mario on July 06, 2021, 10:16:21 AM
What happens if you don't use a MD template to set this info but enter it in the MD panel and write back?
Please understand that my "metadata mess" queue is full and it can take a week or two before I can spend time analyzing your RAW file and use case.

Typing the Author and Copyright data directly in the MD panel and writing back does not change anything; the strange characters still appear in Windows. I have also deleted the XMP data with ECP and it is still the same.

Take as much time as you need.

Mario

The NEF file contains this data with your name:

[IFD0]          Artist                          : José Man...
[IFD0]          Copyright                       : © 2021 José Man...


(shortened for privacy reasons). The XMP file contains:

[XMP-dc]        Creator                         : José Man...
[XMP-dc]        Rights                          : © 2021 José Man...
[XMP-photoshop] Credit                          : José Man...
[XMP-tiff]      Artist                          : José Man...
[XMP-tiff]      Copyright                       : © 2021 José Man...
[XMP-xmpDM]     Artist                          : José Man...


which looks perfectly OK to me. All data has been properly written.

Affinity Photo, IMatch, Lightroom and Photoshop show the data correctly.

Windows Explorer does not use the XMP file but looks at the NEF file.
Your name, containing the é, appears mangled there, which is probably caused by Windows Explorer messing up the character encoding.
This is the risk with using non-ASCII characters in EXIF data - there is no real support for character sets and special characters may be interpreted differently by different applications.
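
To illustrate the kind of mix-up that can mangle such characters, here is a minimal Python sketch. It assumes, purely for illustration, that the text was stored as UTF-8 bytes and that the reading application interprets those bytes as Windows-1252. Which encodings are actually involved in this NEF file and in Windows Explorer is not known from this thread; the function name is hypothetical.

```python
def misread_utf8_as_cp1252(text: str) -> str:
    """Encode `text` as UTF-8, then decode the same bytes as Windows-1252.

    This mimics a reader that assumes the wrong character set.
    """
    return text.encode("utf-8").decode("cp1252")


if __name__ == "__main__":
    for original in ("José", "© 2021"):
        print(original, "->", misread_utf8_as_cp1252(original))
    # José -> JosÃ©
    # © 2021 -> Â© 2021
```

The bytes in the file are identical in both cases; only the interpretation differs, which is why one application can show the name correctly while another does not.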

I recommend you contact Microsoft support and ask them how to handle this, or to fix the bug in Windows Explorer, if it is one.

The RAW file contains a rudimentary XMP record with only two fields. Nikon didn't even bother to produce the most basic XMP data. Sad.

jmsantos

Thanks Mario.

I will try to check with Microsoft Support, it doesn't seem to be an easy task. I don't have much hope of getting an answer either.

In the meantime, I'll raise the question from my original post again: if it's a problem with File Explorer character encoding in Windows, why does it occur only in embedded data of RAW images and not in JPG or TIF?

The metadata for the three image types has been written with IMatch, which should have transferred the data to the same EXIF and XMP fields.

Mario

Quote
File Explorer character encoding in Windows, why does it occur only in embedded data of RAW images and not in JPG or TIF?

Since the data shows correctly in IMatch, Photoshop, Lightroom, Affinity etc. - I recommend you direct this question to your Microsoft support representative.
I did not write Windows Explorer, nor do I have any insight into how it works.
All I know is that it is usually terrible with anything regarding metadata.

sinus

Quote from: Mario on July 10, 2021, 10:22:25 AM

which is probably caused by Windows Explorer messing up the character encoding.
This is the risk with using non-ASCII characters in EXIF data - there is no real support for character sets and special characters may be interpreted differently by different applications.



True.
There is a reason why a lot of people (photographers included) stick to ASCII characters.
It is very sad, but if I use, for example, my name "Hässig", I very often end up with strange characters, problems and so on.
The same with emails ...

The problem is also that these special characters work in some programs or websites, hurrah, but then create a mess in the next one.

So I'd rather use "Haessig". Very sad, it is a kind of mutilation of words and names, but in the end it is what it is, unfortunately.



Mario

All these problems would be a thing of the past if camera vendors got rid of the ancient EXIF metadata.
Modern cameras often write a tiny rudimentary XMP record, usually only containing "rating = 0" and, sometimes, the camera maker.
Then they write all the rest of the data in the ancient EXIF format.

Why? Don't know.

Why don't they just write a real XMP record that also covers all the EXIF data, but in a modern XML-based format? It uses the UTF-8 encoding, which can represent every Unicode character.
Including José and Hässig. And アイマッチ大好きです

XMP has no character set issues, no length limits for text, and proper time-zone handling; everything would work better.
Why camera vendors still use the 30-year-old EXIF format, I have no idea.
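
As a small illustration of the point about UTF-8, the following Python sketch encodes a few strings to UTF-8 and decodes them again, checking that nothing is lost. The function name is hypothetical and the sample strings are only examples.

```python
def utf8_roundtrip(text: str) -> tuple:
    """Encode `text` as UTF-8, decode it again, and return (characters, bytes)."""
    data = text.encode("utf-8")
    # UTF-8 is lossless for any Unicode string, so this always holds.
    assert data.decode("utf-8") == text
    return len(text), len(data)


if __name__ == "__main__":
    for sample in ("José", "Hässig", "アイマッチ大好きです", "©"):
        chars, size = utf8_roundtrip(sample)
        print(f"{sample}: {chars} characters, {size} bytes in UTF-8")
```

Accented Latin letters take two bytes each and the Japanese characters three, but every string survives the round trip unchanged, with no character set to declare or guess.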

jmsantos

After waiting a reasonable time, I don't think I'm going to get any information from Microsoft on this issue.

As noted here, the problem lies in the character encoding allowed by the (obsolete) EXIF standard. When I write the metadata "Creator: José" in a RAW file with IMatch, it transfers that data to the appropriate fields, as recommended by the MWG. In RAW images some of that data goes into the XMP sidecar and some is embedded in the file itself. For example, "[IFD0] Artist: José" is embedded, and Windows properties or FastRawViewer display it with the "é" mangled.

As far as we know, the solution adopted by some programs (e.g. Adobe) is not to write this EXIF data in the RAW files and avoid the problem. It should be remembered that Adobe and Microsoft were part of the MWG that established the recommendations on metadata interoperability but do not follow their own guidelines, although that working group seems to be no longer operational (at least its website is abandoned).

Another MWG member was Canon, and I have found that their Digital Photo Professional (DPP) software resorts to a solution that I think is clever: when I type the data "Creator: Jose", the program transforms the "é" character only in the EXIF metadata:

[IFD0] Artist: Jose
[XMP-dc] Creator: José


It seems to me an elegant solution that perhaps IMatch could adopt, even if it does not strictly comply with MWG guidelines.
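
A minimal Python sketch of what such an ASCII fallback for the EXIF copy could look like. How DPP actually implements this is not known; the mapping table and the function name below are assumptions made for illustration only.

```python
import unicodedata

# Hypothetical replacements for characters that have no accent to strip.
ASCII_FALLBACKS = {"©": "(C)", "ß": "ss"}


def to_ascii(text: str) -> str:
    """Return an ASCII-only approximation of `text` for legacy EXIF fields."""
    out = []
    for ch in text:
        if ch in ASCII_FALLBACKS:
            out.append(ASCII_FALLBACKS[ch])
            continue
        # NFKD splits "é" into "e" plus a combining accent; the accent is
        # not ASCII, so encoding with errors="ignore" drops it.
        decomposed = unicodedata.normalize("NFKD", ch)
        out.append(decomposed.encode("ascii", "ignore").decode("ascii"))
    return "".join(out)


if __name__ == "__main__":
    creator = "José"
    print("[IFD0] Artist:", to_ascii(creator))   # Jose
    print("[XMP-dc] Creator:", creator)          # José
```

The XMP value stays untouched; only the legacy EXIF copy is simplified. Note that stripping accents turns "Hässig" into "Hassig" rather than "Haessig", and characters with no decomposition and no table entry are dropped, so language-specific spellings would need their own entries in the table.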


jmsantos

In the previous post I meant:
When I type the data "Creator: José"...

sinus

If this problem only affects a few names like José, maybe you could create a Metadata template in which you let IMatch write those names differently into the different fields?
That would be easy, I think (for example during the import into IMatch or simply later).

Of course, if you use a lot of different words, then this would not be a good way.

jmsantos

Quote from: sinus on July 30, 2021, 03:48:44 PM
If this problem only affects a few names like José, maybe you could create a Metadata template in which you let IMatch write those names differently into the different fields?
That would be easy, I think (for example during the import into IMatch or simply later).

Of course, if you use a lot of different words, then this would not be a good way.

I have tried, but IMatch does not allow editing the EXIF:Artist tag or using it in a Metadata template.
Maybe it could be done with ECP? I would have to learn ExifTool commands.

Mario

The MD template will set any tag, but during write-back, ExifTool maps from XMP to EXIF, and the Artist tag is one of the tags to map.

ExifTool (and IMatch via Edit > Preferences > Metadata) allows you to specify a character set for legacy IPTC and EXIF, separately for reading and writing!
IMatch gives you access to the character sets mentioned in the ExifTool FAQ: https://exiftool.org/faq.html#Q10

But character sets are messy, and whether they are handled at all, and handled correctly, depends on whatever application is reading the legacy IPTC and EXIF data.
It may take some experimenting on your side.
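
For experimenting outside IMatch, here is a minimal Python sketch that builds an ExifTool command line using the `-charset EXIF=...` option described in the FAQ linked above. It assumes ExifTool is installed; the helper name and the choice of `Latin` (Windows-1252) as a default are illustrative assumptions, not a recommendation for this particular case.

```python
def build_artist_command(path, artist, exif_charset="Latin"):
    """Return an ExifTool argument list that writes EXIF:Artist to `path`.

    `exif_charset` is passed to ExifTool's -charset option and controls how
    the EXIF string values are encoded in the file (e.g. "Latin" or "UTF8").
    """
    return [
        "exiftool",
        "-charset", f"EXIF={exif_charset}",
        f"-EXIF:Artist={artist}",
        path,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(build_artist_command("copy_of_image.nef", "José"))
```

The list can be handed to `subprocess.run` once it looks right. Work on a copy of the file, then check the result in the application that showed the problem. On Windows, passing non-ASCII characters on the command line can itself be subject to encoding issues, so results may differ from what IMatch writes.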

Legacy IPTC was declared legacy 20 years ago.
EXIF is 30 years old.
XMP (which has none of these problems due to its consistent use of the UTF-8 character set) has been available for, what, 15 or 20 years.

Despite all that, camera vendors still write EXIF data (and occasionally an XMP record with "rating=none", just to show that they know what XMP is but don't care).
And this means we all have to live with the limitations and ancient concepts like character set - forever.
Basically, after IMatch has written your file once, the XMP data contains all important EXIF data (except proprietary maker notes, which have no place in XMP for good reasons).
You could strip the EXIF data from the file and solve all problems.

The same can be said about legacy IPTC data.
Which is why I stripped it from all my files (~80,000 files affected, if I recall correctly) and never looked back.
XMP can do all legacy IPTC could - just better.
Same for EXIF.

Modern cameras are basically smart phones with better optics.
Why the camera vendors still use EXIF data instead of XMP is something only the Japanese developers fathom...

jmsantos

Quote from: Mario on July 31, 2021, 08:03:05 PM
The MD template will set any tag, but during write-back, ExifTool maps from XMP to EXIF, and the Artist tag is one of the tags to map.

Oh, I thought the warning (see image) meant it was not allowed.

Anyway, if ExifTool maps that data, there is little we can do. Unless, like Canon DPP does, ExifTool "translates" the non-ASCII characters and writes "e" where it says "é" only in the EXIF tags.

The other option is not to use non-ASCII characters, misspell my name "Jose" and instead of © write (C).

Thanks, Mario and Markus.

Mario

The problem is not what characters are written to the EXIF.
The problem is the character set the EXIF is marked with - and which tells the reading application how to interpret the characters.
See my links above.