Search For Duplicates Not Always Accurate

Started by Darius1968, June 22, 2021, 09:39:46 AM

Darius1968

According to the IMatch Help:
Quote: "IMatch considers files as duplicates when the image data is identical."
Most of the time, IMatch's duplicate search lives up to this claim. However, I've found some notable exceptions. So far, I mostly can't trust this search mode for audio/video files. And when I'm dealing with scans of documents (usually TIF files), the search almost always returns images of other scans in addition to the actual duplicate. Is there any way I can improve accuracy?

Mario

As documented, the visually identical search is based on a fingerprint of the image, produced by an algorithm. In 99.99% of all cases, this fingerprint will be identical for identical images.
But even a slight change to anything related to how the image is rendered will change the fingerprint. So, depending on how you process your files, your mileage can vary.
And it is of course useless for audio files, and for duplicate video files too, because re-encoding may produce a different fingerprint even though you cannot see a difference in the thumbnail.
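
To illustrate why a tiny rendering change breaks an exact match, here is a minimal Python sketch. It is not IMatch's algorithm, which is not described in this thread. It assumes a fingerprint that is simply a SHA-256 hash over the image dimensions and raw pixel values, and the three "scans" are made-up toy data.

```python
import hashlib

def pixel_fingerprint(width, height, pixels):
    """Hash the dimensions plus the raw pixel bytes; metadata is ignored.

    Assumption: a plain SHA-256 stands in for the real fingerprint algorithm.
    """
    h = hashlib.sha256()
    h.update(f"{width}x{height}".encode("ascii"))
    h.update(bytes(pixels))
    return h.hexdigest()

def fp(image):
    """Fingerprint a toy image record (hypothetical structure)."""
    width, height = image["size"]
    return pixel_fingerprint(width, height, image["pixels"])

# Same pixels, different metadata: still counts as a duplicate.
scan_a = {"meta": {"title": "Invoice"}, "size": (2, 2), "pixels": [0, 255, 255, 0]}
scan_b = {"meta": {"title": "Invoice (copy)"}, "size": (2, 2), "pixels": [0, 255, 255, 0]}

# Same picture after processing nudged one pixel by one brightness level.
scan_c = {"meta": {"title": "Invoice"}, "size": (2, 2), "pixels": [0, 254, 255, 0]}

same_ab = fp(scan_a) == fp(scan_b)
same_ac = fp(scan_a) == fp(scan_c)
print(same_ab, same_ac)
```

With an exact hash like this, scan_a and scan_b match while scan_a and scan_c do not, even though nobody could see the difference. This only shows the general behavior of exact fingerprints and says nothing about how IMatch computes its own.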

Maybe try to rebuild the visual query data in your database using Database menu > Tools > Rebuild... because these algorithms have changed over time, and the release notes at the time recommended rebuilding the visual index to improve accuracy. Many users never read the release notes...
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com