Large Photo Collection

Started by JorgenP, February 07, 2018, 08:33:14 AM

JorgenP

Hi Mario - I'm testing the 30-day trial version of the product. So far I like it...

One question: I have a large collection (about 450,000 photos and videos) spanning over 15 years and continuously growing (thousands of pictures per year).

Currently they are neatly organized in folders, by year and then activities within each. 

I'm looking at IMatch to help catalog and manage them by assigning keywords and easily creating virtual collections...

I've seen all your videos, so I know it can be done for smaller collections, but what do you recommend for a large collection / catalog like mine?

Many thanks!
Jorgen

Mario

I don't have any special recommendations. 500,000 files is quite a lot of files (many stock photo agencies get by with less) but IMatch can handle this.

It is paramount to store your database on your fastest disk (SSD if possible).
Adding 500,000 files in one "go" will cause a massive amount of stress on the system.
This also increases the risk that IMatch, ExifTool, or one of the many 3rd party image libraries or WIC codecs will show stress issues caused by tiny memory leaks or small glitches which don't show up until you process 50,000 files in one run.
In an image library that large there is also a good chance that a few (or even many) files are corrupted or have damaged metadata (depending on the software you used over the past 15 years).

Batch Process

The solution to all this is simple: Work in batches. Don't throw all your folders into IMatch at once. Add your files in batches of, say, 50,000 files at a time.
Then close IMatch, make a backup copy of your database, restart IMatch and add the next batch of 50,000 files.
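
As a rough illustration of how such batches could be planned outside of IMatch, here is a minimal Python sketch. It assumes the collection is organized in top-level folders (for example one per year) and groups them into batches of roughly 50,000 files. The function names and the example path are hypothetical, and IMatch itself provides nothing like this; the script only helps decide which folders to add together.

```python
import os
from typing import Dict, List


def count_files(folder: str) -> int:
    """Count all files below `folder`, including files in subfolders."""
    return sum(len(files) for _, _, files in os.walk(folder))


def plan_batches(counts: Dict[str, int], batch_size: int = 50_000) -> List[List[str]]:
    """Group folders (in the given order) into batches of at most `batch_size` files.

    A single folder that is larger than `batch_size` gets a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_total = 0
    for name, n in counts.items():
        # Start a new batch when adding this folder would exceed the limit.
        if current and current_total + n > batch_size:
            batches.append(current)
            current, current_total = [], 0
        current.append(name)
        current_total += n
    if current:
        batches.append(current)
    return batches


# Example usage (hypothetical path):
# root = r"D:\Photos"
# counts = {d: count_files(os.path.join(root, d)) for d in sorted(os.listdir(root))
#           if os.path.isdir(os.path.join(root, d))}
# for i, batch in enumerate(plan_batches(counts), start=1):
#     print(i, batch)
```

Each printed batch is then one round: add those folders, let IMatch finish, close it, back up the database, and continue with the next batch.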

Note: You can always close IMatch, even when it is currently processing files. IMatch will just continue where it stopped when you start it the next time.

Diagnosing and Handling Problems with your Files

In case IMatch crashes, your database will not be harmed. You can just restart it. But before you restart it, make a copy of the IMatch log file (IMATCH6_LOG.TXT in your TEMP folder). When you open the file in Notepad and scroll to the end, you will in most cases see the name(s) of the file(s) that caused the crash. Removing these files from the folder or re-saving them in the original application usually solves the problem. Then restart IMatch and let it continue.
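
The copy-and-inspect step can also be scripted. The following Python sketch is only an illustration: it assumes that Python's temp directory is the same TEMP folder IMatch writes to and that the log can be read as text. The helper names and the backup folder are hypothetical.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path
from typing import List

LOG_NAME = "IMATCH6_LOG.TXT"  # file name as given in the post above


def default_log_path() -> Path:
    # Assumption: Python's temp directory is the TEMP folder IMatch uses.
    return Path(tempfile.gettempdir()) / LOG_NAME


def backup_log(log_path: Path, dest_dir: Path) -> Path:
    """Copy the log to `dest_dir` under a timestamped name and return the copy's path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    dest = dest_dir / f"{log_path.stem}_{stamp}{log_path.suffix}"
    shutil.copy2(log_path, dest)
    return dest


def tail(log_path: Path, lines: int = 40) -> List[str]:
    """Return the last `lines` lines of the log file."""
    # The encoding is an assumption; undecodable bytes are replaced, not fatal.
    text = log_path.read_text(encoding="utf-8", errors="replace")
    return text.splitlines()[-lines:]


# Example usage (hypothetical backup folder), run BEFORE restarting IMatch:
# log = default_log_path()
# print("Saved copy:", backup_log(log, Path.home() / "imatch_logs"))
# print("\n".join(tail(log)))
```

The last lines printed are where the name of the offending file usually appears.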

More info about the log file is available in the IMatch help. Search the help index for the term log file.

This is the best method to initially process such a large library. Depending on the size of your files, the file type and the computer speed, expect this process to take one to several days.

Metadata Write-back

When IMatch ingests your files it produces a high-quality, complete and standard-compliant XMP metadata record from existing metadata in your files. This record is stored in the IMatch database and used everywhere in IMatch. To make this metadata available to other applications, it has to be written back to the original image file (or XMP sidecar file for RAW files, videos, ...). This process can take a long time (estimate about 1 second per file). IMatch allows you to do this in batches, at your convenience.
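
To put the 1-second-per-file estimate into perspective, here is a small back-of-the-envelope calculation in Python. The function name is hypothetical, and the rate is only the rough figure from above, so real times will vary with file type and hardware.

```python
def writeback_hours(file_count: int, seconds_per_file: float = 1.0) -> float:
    """Estimated write-back time in hours for `file_count` files."""
    return file_count * seconds_per_file / 3600.0


# One batch of 50,000 files versus the full collection of about 450,000 files.
for files in (50_000, 450_000):
    print(f"{files:>7,} files: about {writeback_hours(files):.1f} hours")
```

At that rate, the full 450,000 files come to 125 hours (a little over five days of continuous writing), which is why doing the write-back in batches makes sense.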

JorgenP

Thank you so very much for the detailed response!

It took about 90 minutes to import and process the first 25,000 photos.

Currently the files reside on a dedicated USB 3.0 1TB drive. I definitely need to switch to an SSD if I am going to be doing this.

Again thank you for the response and help!

Cheers!

Mario

Quote: Currently the files reside on a dedicated USB 3.0 1TB drive. I definitely need to switch to an SSD if I am going to be doing this.

90 minutes is not bad for 25,000 files.
The data storage used for the images is not that important. But using an SSD for the IMatch database gives a massive performance boost.
Also make sure that your virus-checker has an exclusion for the entire folder containing the IMatch database.