Extract Embedded Metadata from Any File Type

Started by monochrome, March 23, 2015, 05:28:34 PM

monochrome

As it is now, IMatch will only extract embedded metadata from some file types. ExifTool, which is what IMatch uses to extract embedded metadata, can extract embedded metadata from many more file types. Thus, IMatch may in some cases not extract metadata from a file even though ExifTool can do so.

This proposal is to make it possible to add types to the list of file types that IMatch will attempt to extract embedded metadata from.

Mario

As I said in your original post, IMatch currently uses fixed "tag tables" which are extracted from ExifTool when a database is generated (or when a new ExifTool version is installed).

What your idea requires is that IMatch be modified to also handle the arbitrary on-the-fly tag names ExifTool produces in your example. ExifTool just "makes up" tag names from the names of the nodes found in the XML document. There can be obviously any number of such tag names, depending on which files you feed to ExifTool.

The syntax of these tag names is not as stringent as the regular tag names, which can cause all kinds of problems when IMatch tries to import these names into the database. And this will also create problems with the syntax used for tag names inside IMatch, e.g. in the Tag Manager, the MD panel, all other panels which work with metadata tags. And of course arbitrary tag names may break the variable system in IMatch, because IMatch needs proper production rules to convert tag names into variable names.
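To illustrate the kind of production rule involved, here is a minimal sketch in Python. It is not how IMatch does this: the names `to_variable_name` and `find_collisions`, and the rule itself (keep ASCII letters, digits, and underscores, collapse everything else into one underscore), are assumptions made up for the example. It only shows why arbitrary node names are awkward: two different names can end up as the same variable name.

```python
import re


def to_variable_name(tag_name: str) -> str:
    """Hypothetical rule: map an arbitrary tag name to a variable-safe name."""
    # Replace every run of characters outside [A-Za-z0-9_] with one underscore.
    name = re.sub(r"[^A-Za-z0-9_]+", "_", tag_name).strip("_")
    # Assumed rule: a variable name must not be empty or start with a digit.
    if not name or name[0].isdigit():
        name = "_" + name
    return name


def find_collisions(tag_names):
    """Return variable names that more than one distinct tag name maps to."""
    seen = {}
    for tag in tag_names:
        seen.setdefault(to_variable_name(tag), set()).add(tag)
    return {var: sorted(tags) for var, tags in seen.items() if len(tags) > 1}
```

Calling `find_collisions` on the tag names seen in a batch of files would list every case where the rule is ambiguous, which is the sort of check that fixed tag tables make unnecessary.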

Implementing this will probably be very, very expensive.

monochrome

Quote from: Mario on March 23, 2015, 05:37:04 PM
ExifTool just "makes up" tag names from the names of the nodes found in the XML document. There can be obviously any number of such tag names, depending on which files you feed to ExifTool.

I agree that if that were the case, this feature would be an impossibility. But as I just posted in our previous discussion, this is not the behavior I see from ExifTool. As far as I can see, ExifTool will only output XMP data from proper XMP files. In any other case, it outputs an error.

Mario

This is just because you copied down a valid XMP RDF record from Wikipedia and saved it to an XML file. My guess is that ExifTool recognizes the RDF namespace and then handles it like an XMP file. Try ExifTool on an arbitrary XML file which is not an XMP file in disguise.
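A rough way to run that check from a script, as a sketch only: it assumes the `exiftool` executable is on the PATH and uses its `-j` (JSON output) and `-G1` (prefix each tag with its group name) options. The helper names `exiftool_command`, `tags_by_group`, and `inspect_file` are made up for this example. Grouping the returned tag names makes it easy to see which groups ExifTool reports for a given file.

```python
import json
import subprocess


def exiftool_command(path):
    # -j: JSON output; -G1: prefix each tag name with its family 1 group.
    return ["exiftool", "-j", "-G1", path]


def tags_by_group(exiftool_json: str):
    """Group tag names from `exiftool -j -G1` output by their group prefix."""
    groups = {}
    for record in json.loads(exiftool_json):
        for key in record:
            # Keys look like "Group:Tag"; "SourceFile" has no group prefix.
            group, _, tag = key.rpartition(":")
            groups.setdefault(group or "(no group)", []).append(tag)
    return groups


def inspect_file(path):
    # Assumes the exiftool executable can be found on PATH.
    result = subprocess.run(
        exiftool_command(path), capture_output=True, text=True
    )
    return tags_by_group(result.stdout) if result.stdout.strip() else {}
```

Running `inspect_file` on the Wikipedia sample and on an unrelated XML file, then comparing the two results, would show whether the second file produces tag names at all and under which groups.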