How big is the XMP-Area of a file normally?

Started by voronwe, November 08, 2023, 12:19:02 PM


voronwe

Hi
I'm working on a plugin for Photoshop that also stores metadata in the file.
I ran into a bug with a file that carries a huge amount of metadata (around 600,000 lines in the <rdf:Bag> segment).

This is a known bug in older Photoshop versions, fixed in January 2019:
https://community.adobe.com/t5/photoshop-ecosystem-discussions/tif-files-are-saving-out-huge-a-lot-of-xml-is-attached-to-them/td-p/10520789
https://prepression.blogspot.com/2017/06/metadata-bloat-photoshopdocumentancestors.html
However, there are still files out in the wild with this problem.

The problem is that when I try to change the metadata in such a file, Photoshop hangs.

Because writing the metadata is only a convenience for the user (so that some settings are already there when the file is reopened), my simple solution would be to skip writing whenever the existing metadata is too big.
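A minimal sketch of such a guard, in Python for brevity (a real Photoshop plugin would more likely do this in C++). The names `MAX_XMP_BYTES`, `should_write_metadata` and `line_count` are hypothetical, and the 1 MiB limit is only a placeholder, since the right value is exactly the open question here:

```python
# Hypothetical limit (assumption): nothing in this thread establishes a "correct" value.
MAX_XMP_BYTES = 1 * 1024 * 1024  # 1 MiB placeholder


def should_write_metadata(xmp_packet: bytes, max_bytes: int = MAX_XMP_BYTES) -> bool:
    """Return True if the existing XMP packet is small enough to update."""
    return len(xmp_packet) <= max_bytes


def line_count(xmp_packet: bytes) -> int:
    """Rough line count; depends on how the XMP happened to be serialized."""
    if not xmp_packet:
        return 0
    return xmp_packet.count(b"\n") + 1
```

Checking the byte size is probably more robust than counting lines, because the same XMP can be serialized with or without line breaks.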

The only question is: "What is too big?"

@Mario: do you have any experience with what the normal size of metadata is (plus a safety buffer)? Lightroom comes to mind, for example, where the settings can also be stored in the metadata.

Thanks

Mario

There is no real limit, I believe.
If a user configures Lr / Ps to record every brush stroke in the Adobe XMP namespace, the XMP segment can become really, really huge. Sometimes the XMP data is larger than the actual image...!
Depending on the XML/XMP parser / toolkit involved, this can break things. 600,000 lines is a lot if the XMP parser has to load and hold the entire document in memory. Streaming (SAX) parsers can usually handle this; parsers that build the full tree in memory, as XPath-based ones typically do, often cannot.
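To illustrate the streaming idea, here is a rough Python sketch using the standard library's `xml.etree.ElementTree.iterparse`. It is not how the Adobe XMP toolkit works; the function name `count_li_items` and the `limit` parameter are made up for this example. It counts every `rdf:li` element (so entries in Seq and Alt containers too, not only Bag) and can stop early once a limit is exceeded:

```python
import io
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_LI = f"{{{RDF_NS}}}li"  # ElementTree spells namespaced tags as {uri}local


def count_li_items(stream, limit=None):
    """Count rdf:li elements incrementally; stop early once `limit` is exceeded."""
    count = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == RDF_LI:
            count += 1
            if limit is not None and count > limit:
                return count  # bail out without reading the rest
        # Drop text and children of finished elements to keep memory low.
        # The emptied elements stay attached to their parent, so memory
        # still grows, just far more slowly than with a full tree.
        elem.clear()
    return count
```

With a limit set, a call such as `count_li_items(io.BytesIO(xmp_bytes), limit=10000)` returns as soon as the limit is passed, so an oversized packet can be detected without parsing all of it.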

If you use the Adobe XMP toolkit or some functionality in the Photoshop API, you are bound by its limits. And if it fails to load XMP data, that's how it is. Maybe there is a function that tells you the size (line count?), so you could run some tests to see where it breaks and then display a "Too big" message before Photoshop crashes?

I haven't used the Adobe XMP toolkit for many years so I'm not able to give specific tips.
One could reason, though, that if Adobe software used the XMP toolkit to write the XMP data, it should also be able to read it again! Or at least not block Photoshop for an extended period of time.

If I recall correctly, there were settings / options that limited how deep the toolkit descends into nested structures. Maybe there are also settings that limit parsing of excessively large bag containers. No idea, sorry.

voronwe