Integrity check for DNG files

Started by lnh, October 09, 2014, 03:54:30 PM


lnh

Maybe not an immediate need but one for longer term development... DNG files have a feature where the core image data (i.e. the part that doesn't change) has a checksum. The feature request is for IMatch to implement a validation feature to check for corruption in selected DNG files. The concept of data integrity is core to a DAM solution.

jch2103

John

Mario

Why only DNG files? What about all your other files?

And as far as I know, the built-in test functions in LR etc. test only the DNG image data, but not the rest of the file?
And I have several DNG files in my test file collection which are refused by ACR and LR with the typical Adobe error message "Not the right kind of file".
The problem is that these files contain incomplete or damaged metadata... and can be fixed by ExifTool.

IMatch produces a 32-bit checksum for each file anyway, and keeps this checksum up to date every time a file is written and re-imported. Would it not make more sense to add a feature which recalculates that checksum and compares it with the checksum stored in the database? I think there is even a scripting method for that, and if not, I can easily add one and write a small script which compares checksums for all selected files and puts the files with mismatching checksums (potentially damaged) into a category or something.

pajaro

Quote from: Mario on October 10, 2014, 09:06:26 AM
Would it not make more sense to add a feature which recalculates that checksum and compares it with the checksum stored in the database? I think there is even a scripting method for that, and if not, I can easily add one and write a small script which compares checksums for all selected files and puts the files with mismatching checksums (potentially damaged) into a category or something.

+1

ChrisMatch

Quote from: Mario on October 10, 2014, 09:06:26 AM
Would it not make more sense to add a feature which recalculates that checksum and compares it with the checksum stored in the database? I think there is even a scripting method for that, and if not, I can easily add one and write a small script which compares checksums for all selected files and puts the files with mismatching checksums (potentially damaged) into a category or something.
+1

I would appreciate such a feature too.
If a file gets corrupted and I don't notice it
-> this bad file could make its way into all the backups I have :-(

herman

+1

I am not much of a feature-request guy as I think IMatch has already everything I would ever need.

I support this one strongly though.

I still remember when LR crashed on my machine and took a couple of files with it, damaging them while writing its housekeeping s**t into my files >:(
It took me ages to find out which files were damaged and to restore them from a backup.

Things like this happen, and having a tool to check file integrity can be priceless when disaster strikes.
Enjoy!

Herman.

lnh

Quote from: Mario on October 10, 2014, 09:06:26 AM
Why only DNG files? What about all your other files?

And as far as I know, the built-in test functions in LR etc. test only the DNG image data, but not the rest of the file?
And I have several DNG files in my test file collection which are refused by ACR and LR with the typical Adobe error message "Not the right kind of file".
The problem is that these files contain incomplete or damaged metadata... and can be fixed by ExifTool.

IMatch produces a 32-bit checksum for each file anyway, and keeps this checksum up to date every time a file is written and re-imported. Would it not make more sense to add a feature which recalculates that checksum and compares it with the checksum stored in the database? I think there is even a scripting method for that, and if not, I can easily add one and write a small script which compares checksums for all selected files and puts the files with mismatching checksums (potentially damaged) into a category or something.

Even better!

+1

sinus

Quote from: Mario on October 10, 2014, 09:06:26 AM
IMatch produces a 32-bit checksum for each file anyway, and keeps this checksum up to date every time a file is written and re-imported. Would it not make more sense to add a feature which recalculates that checksum and compares it with the checksum stored in the database? I think there is even a scripting method for that, and if not, I can easily add one and write a small script which compares checksums for all selected files and puts the files with mismatching checksums (potentially damaged) into a category or something.

Would be great!
Best wishes from Switzerland! :-)
Markus


ovrevid

+1 This should be greatly appreciated by everyone who has experienced damaged image files, me included.
-- Vidar

Mario

I have added methods to create MD5, SHA1 and 32-bit CRC checksums for files to the scripting language. The SHA1 checksum in particular is frequently used to verify the integrity of files and archives. I use SHA1 checksums for all files you can download from the customer web site, for example.
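
For readers who want to see what these three checksums look like outside IMatch, here is a minimal Python sketch. It does not use the IMatch scripting methods (their names are not given in this thread); the function name `file_checksums` and the `chunk_size` parameter are assumptions made for illustration, and only Python's standard `hashlib` and `zlib` modules are used.

```python
import hashlib
import zlib


def file_checksums(path, chunk_size=1 << 20):
    """Return MD5, SHA1 and CRC32 of a file as lowercase hex strings.

    Hypothetical helper for illustration; not part of IMatch.
    The file is read in chunks so large image files are not
    loaded into memory all at once.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            md5.update(chunk)
            sha1.update(chunk)
            crc = zlib.crc32(chunk, crc)  # running CRC over all chunks
    return {
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
        "crc32": f"{crc & 0xFFFFFFFF:08x}",
    }
```

Calling `file_checksums("photo.dng")` on any readable file returns a dictionary with the three values; comparing the `sha1` entry with a published SHA1 is the usual way to verify a download.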

The CRC method produces the same checksums that IMatch creates internally for every file. I've written a small script which calculates the checksum for each selected file and compares it to the checksum stored in the database for that file. If there is a mismatch, the file is added to a special category for later review. So we now have a File Verifier feature for IMatch.
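
The verification logic described above can be sketched roughly as follows. This is not the actual File Verifier script: it is plain Python, a dictionary stands in for the checksums stored in the IMatch database, the returned list stands in for the review category, and the names `crc32_of_file` and `find_mismatches` are made up for the example.

```python
import zlib


def crc32_of_file(path, chunk_size=1 << 20):
    """Compute the 32-bit CRC of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def find_mismatches(stored_checksums):
    """Return the paths whose current CRC differs from the stored one.

    stored_checksums: dict mapping file path -> CRC recorded earlier
    (a stand-in for the database). Files that cannot be read are also
    reported, since they need review as well.
    """
    needs_review = []
    for path, expected in stored_checksums.items():
        try:
            actual = crc32_of_file(path)
        except OSError:
            needs_review.append(path)
            continue
        if actual != expected:
            needs_review.append(path)
    return needs_review
```

A mismatch only says that the file content differs from what was recorded; it cannot tell deliberate edits made outside the database from corruption, which is why the flagged files go to review rather than being treated as damaged outright.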

Unfortunately, I discovered a bug in IMatch while writing the new script. Under some conditions, when writing back metadata, IMatch did not update the internal checksums for the updated files. This was caused by some internal optimizations, which apparently optimized too much.

I have corrected this for 5.2.8 and also written a script which scans all selected files (all files in your database when you select the Database node) and updates the checksums in the database from the current checksums of the files. Once this script has been run, you can run the File Verifier script later at any time to determine if the files in a folder or on a disk are still unchanged.
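
The "refresh the stored checksums first" step can be illustrated in the same spirit. Again this is only a sketch under assumptions: a JSON file plays the role of the database, and `refresh_baseline` and `crc32_of_file` are hypothetical names, not IMatch functions.

```python
import json
import zlib
from pathlib import Path


def crc32_of_file(path, chunk_size=1 << 20):
    """Compute the 32-bit CRC of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def refresh_baseline(paths, baseline_file):
    """Record the current CRC of every file as the new trusted baseline.

    Overwrites baseline_file (a JSON stand-in for the database), so it
    should only be run while the files are known to be in good shape.
    """
    baseline = {str(p): crc32_of_file(p) for p in paths}
    Path(baseline_file).write_text(json.dumps(baseline, indent=2))
    return baseline
```

The order matters: refreshing the baseline after a file has already been damaged would record the damaged state as valid, so a later verification run would not flag it.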


jch2103

John

ChrisMatch

Quote from: Mario on October 12, 2014, 07:48:23 AM
Once this script has been run, you can run the File Verifier script later at any time to determine if the files in a folder or on a disk are still unchanged.
Sounds great  :)

lenmerkel

Quote from: Mario on October 12, 2014, 07:48:23 AM
I have added methods to create MD5, SHA1 and 32-bit CRC checksums for files to the scripting language. The SHA1 checksum in particular is frequently used to verify the integrity of files and archives. I use SHA1 checksums for all files you can download from the customer web site, for example. . . .
Super! This will be very useful.  8)
Over the hill, and enjoying the glide.