Slow import times - Why are 2 imports done on the addition of one directory?

Started by ChrisPlak, May 24, 2016, 08:53:38 AM


ChrisPlak

I am running on a fast system: 4790K CPU, 32 GB RAM, 500 GB Evo 950 SSD, no antivirus software, W10H.  All software, pictures, and the database are on the SSD.  Nothing else is running on the system.

My IMatch config is as follows:  My thread values for file and metadata are set to 0.  I am not assigning any categories or templates.  I do not have any buddy or version file rules.  I am not writing metadata back.  My existing, pre-import database has approximately 13,000 photos in it.  It is basically a new database that has had one previous import done on it.


I am importing 14,000 photos, a mixture of JPEG and DNG files.  From what I can see, IMatch seems to go through 3 phases.
(Please note: the phases are my own arbitrary labels for what I am seeing.  There are very likely other actions going on that I'm missing, but for my questions/issues that is irrelevant.)
Phase 1 - Identify the folders and pictures on the hard drive.  This takes approximately 45 seconds.  Very impressive.
Phase 2 - The "Importing Metadata" window pops up, and the import takes approximately 20-25 minutes to finish completely.  Not too bad.
Phase 3 - The "Importing Metadata" window runs a 2nd time.  This 2nd run of "Importing Metadata" takes a minimum of 5 hours, usually up to 7 or 8 hours.


My questions are
1) Why does phase 3 happen?  I.e., why does the 2nd "Importing Metadata" run occur?
2) Is there any way to speed up phase 3?
3) What is going on in phase 3?
4) In the attached photo of phase 3, why are so many of the "bars" at a lower level, whereas phase 2 shows much higher "Processed per units" values?


I've attached two photos showing the progress. 
The photo of Phase 2 shows how quickly files are being processed.  This screenshot was taken after running for only about 10 minutes.
The photo of Phase 3 was captured after 45 minutes or so of running.








Mario

IMatch first identifies the files to add/update.
It then runs a first pass, importing metadata via ExifTool, because the metadata may affect pass 2.
Pass 2 is reading the actual image files, producing thumbnails, cache files, visual query data, file checksums etc.

14,000 images is not much.
Should take less than an hour, assuming a typical 50 to 200 files per minute.

You did not add a log file which would tell us exactly what takes how long.

Please force-rescan a folder and attach a log file.
Also include information about which DNG codec you have installed etc.
This seems to be the culprit. Do you create cache files on-demand (default) or in advance?
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com
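
As a quick sanity check on the numbers in this thread, the "50 to 200 files per minute" figure can be turned into a time estimate for 14,000 files and compared with the 5 to 8 hours reported for the second run. This is only arithmetic on figures quoted above; the helper names are made up for illustration.

```python
def minutes_needed(file_count: int, files_per_minute: float) -> float:
    """Time in minutes to process file_count files at a constant rate."""
    return file_count / files_per_minute


def rate_achieved(file_count: int, minutes: float) -> float:
    """Average files per minute if file_count files took the given minutes."""
    return file_count / minutes


FILES = 14_000  # size of the import described in the first post

# Time implied by the quoted typical rates.
for rate in (50, 200):
    print(f"{rate} files/min -> {minutes_needed(FILES, rate):.0f} minutes")

# Rate implied by the observed 5 to 8 hour second run.
for hours in (5, 8):
    print(f"{hours} hours -> {rate_achieved(FILES, hours * 60):.1f} files/min")
```

At the quoted rates the import would take between 70 and 280 minutes, so finishing in under an hour would need more than about 233 files per minute. The observed 5 to 8 hours works out to roughly 29 to 47 files per minute, which is at or below the low end of the typical range.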

Carlo Didier

Quote from: ChrisPlak on May 24, 2016, 08:53:38 AM
... no anti virus software, W10H.
I hope you have at least Windows Defender running (even if it's still far behind other, even free, AV tools)!

ChrisPlak

Quote from: Mario on May 24, 2016, 09:44:15 AM
Please force-rescan a folder and attach a log file.
I'm running the rescan now and will upload the log when I get home. 

Quote from: Mario on May 24, 2016, 09:44:15 AM
Do you create cache files on-demand (default) or in advance?
I disabled the offline cache completely.  This IMatch database is used only for renaming files and making a first pass through looking for rejects.


Quote from: Carlo Didier on May 24, 2016, 10:25:36 AM
I hope you have at least Windows Defender running (even if it's still far behind other, even free, AV tools)!
I normally always do, and I put certain directories, such as my pictures and IMatch directories, on the exclusion list.  But since I appear to be having a performance problem, I wanted to rule out antivirus completely, and thus I disabled it.

ChrisPlak

I've been trying to track this down, and I believe the culprits are my DNG codecs.  It appears something is incorrect.

(Note, I have an email off to FPV support.  I do have a direct question about IM though).

Which of the 3 do you use for the offline cache?  The Help file leads me to believe it is the "Preview" entry, but I'm hoping you use the "Full Resolution" entry.

   Thumbnail: Codec 'Adobe DNG Decoder'
      () 256x171 pixel in 289 ms.
   Preview: Codec 'Adobe DNG Decoder'
      () 256x171 pixel in 289 ms.
   Full resolution: Codec 'Adobe DNG Decoder'
      () 4368x2912 pixel in 289 ms.




I am using DNGS with the FPV Codec Pack.  ACR 9.51.  Outputting at 7.1.  Full size previews.  Not using Fast Load Data.

With the FPC codec I see the following:
Testing file 'C:\TEMP_RENAMED\testdng\24_fullsize_nofld-2.DNG'
   Thumbnail: Codec 'Adobe DNG Decoder (FastPictureViewer Codec Pack)'
      () 256x171 pixel in 0 ms.
   Preview: Codec 'Adobe DNG Decoder (FastPictureViewer Codec Pack)'
      (GetPreview failed (88982F81 The operation is unsupported.).) 0x0 pixel in 0 ms.
   Full resolution: Codec 'Adobe DNG Decoder (FastPictureViewer Codec Pack)'
      () 4368x2912 pixel in 0 ms.

RESULT: A codec for this file format is installed and it looks like it fully supports the format. The file does not contain an embedded preview.


With the Adobe DNG 2.0 codec I see the following:
Testing file 'C:\TEMP_RENAMED\testdng\24_fullsize_nofld-2.DNG'
   Thumbnail: Codec 'Adobe DNG Decoder'
      () 256x171 pixel in 289 ms.
   Preview: Codec 'Adobe DNG Decoder'
      () 256x171 pixel in 289 ms.
   Full resolution: Codec 'Adobe DNG Decoder'
      () 4368x2912 pixel in 289 ms.

RESULT: A codec for this file format is installed and it looks like it fully supports the format. The file does not contain an embedded preview.
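
The two test runs above are easier to compare once the lines are parsed into records. The following is a minimal Python sketch that assumes the output always has the two-line shape seen in these excerpts (a "stage: Codec 'name'" line followed by a "(message) WxH pixel in N ms." line). That format is inferred from this thread only and is not a documented interface; the function name is hypothetical.

```python
import re

# Excerpt copied from the FastPictureViewer codec test above.
SAMPLE = """\
   Thumbnail: Codec 'Adobe DNG Decoder (FastPictureViewer Codec Pack)'
      () 256x171 pixel in 0 ms.
   Preview: Codec 'Adobe DNG Decoder (FastPictureViewer Codec Pack)'
      (GetPreview failed (88982F81 The operation is unsupported.).) 0x0 pixel in 0 ms.
   Full resolution: Codec 'Adobe DNG Decoder (FastPictureViewer Codec Pack)'
      () 4368x2912 pixel in 0 ms.
"""

HEADER = re.compile(r"^\s*(Thumbnail|Preview|Full resolution): Codec '(.+)'\s*$")
RESULT = re.compile(r"^\s*\((.*)\) (\d+)x(\d+) pixel in (\d+) ms\.\s*$")


def parse_codec_test(text):
    """Return one dict per stage (thumbnail, preview, full resolution)."""
    rows = []
    stage = codec = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            stage, codec = header.groups()
            continue
        result = RESULT.match(line)
        if result and stage is not None:
            message, width, height, ms = result.groups()
            rows.append({
                "stage": stage,
                "codec": codec,
                "width": int(width),
                "height": int(height),
                "ms": int(ms),
                "error": message or None,  # empty parentheses mean no error text
            })
            stage = None
    return rows


for row in parse_codec_test(SAMPLE):
    status = row["error"] or "ok"
    print(f'{row["stage"]:<16} {row["width"]}x{row["height"]} in {row["ms"]} ms ({status})')
```

Run against both excerpts, this makes the difference visible at a glance: the FPV codec reports a failed GetPreview with 0x0 pixels, while the Adobe codec returns 256x171 for both thumbnail and preview.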



Now, the question I have out to FPV is why the image shows as "256x171".  If anyone here knows, that would be great.

And fwiw, I did the following

exiftool -a -b -W %d%f_%t%-c.%s -preview:all 24_fullsize_nofld-2.DNG
and I got 2 files.
05/28/2016  04:16 PM         1,495,071 24_fullsize_nofld023_71_fs_nofld_JpgFromRaw.jpg   size is 4362x2912
05/28/2016  04:16 PM            67,931 24_fullsize_nofld023_71_fs_nofld_PreviewImage.jpg  pic size is 1024x683

Does anyone know where the 1024x683 size comes from?  It doesn't seem like a thumbnail or any other size that makes sense.  (Answering my own question: I figured out where they come from.  If I only embed a medium preview, then when I extract all I get one single file with a resolution of 1024x683.  If I embed a "full preview", then I get a 1024x683 file (_PreviewImage) and a full-size "JpgFromRaw".)
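
For anyone who wants to repeat the extraction on several files, the exiftool call above can be wrapped in a few lines of Python. This is a sketch only: it passes exactly the options from the command line shown in this post, assumes exiftool is on the PATH, and uses hypothetical function names.

```python
import shutil
import subprocess


def build_preview_command(dng_path):
    """Build the same exiftool call as above: write every preview tag to its own file."""
    return [
        "exiftool",
        "-a",                   # allow duplicate tags
        "-b",                   # binary output
        "-W", "%d%f_%t%-c.%s",  # output name: directory, file name, tag name, copy number, extension
        "-preview:all",         # all tags in the Preview group
        dng_path,
    ]


def extract_previews(dng_path):
    """Run exiftool on one file and return the completed process."""
    if shutil.which("exiftool") is None:
        raise FileNotFoundError("exiftool was not found on PATH")
    return subprocess.run(
        build_preview_command(dng_path),
        capture_output=True,
        text=True,
        check=False,  # inspect returncode and stderr yourself
    )


print(" ".join(build_preview_command("24_fullsize_nofld-2.DNG")))
```

Because the arguments are passed as a list and no shell is involved, the percent signs in the -W format string do not need any escaping.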


Mario

Quote: Which of the 3 do you use for the offline cache?
IMatch uses the embedded preview image if it is as large as or larger than the configured minimum size under Edit > Preferences > Cache.
I suggest you only install one codec for each file format. Having multiple DNG codecs installed may lead to unwanted effects.
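
The size rule described above can be illustrated with a small Python sketch. This is not IMatch code. How IMatch actually compares sizes is not stated in this thread, so the sketch assumes the longer edge is compared against the minimum, and the minimum of 2000 pixels is a made-up value for illustration only.

```python
def use_embedded_preview(preview_size, min_long_edge):
    """Return True if the embedded preview is at least as large as the minimum.

    Assumption: "as large as" is judged by the longer edge in pixels.
    """
    width, height = preview_size
    return max(width, height) >= min_long_edge


HYPOTHETICAL_MINIMUM = 2000  # illustrative only, not an IMatch default

# The two preview sizes that exiftool extracted earlier in this thread.
for name, size in (("PreviewImage", (1024, 683)), ("JpgFromRaw", (4362, 2912))):
    decision = "use embedded preview" if use_embedded_preview(size, HYPOTHETICAL_MINIMUM) else "too small"
    print(f"{name} {size[0]}x{size[1]}: {decision}")
```

Under these assumptions the 1024x683 medium preview would be rejected and the full-size JpgFromRaw accepted, which is one way the choice of embedded preview size could affect processing.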

There are dozens of DNG format variants around these days. Adobe changes the DNG format whenever they need something new for one of their applications, and the formats of the DNG files produced by the various cameras and RAW processors also vary greatly. If you have a DNG file that you think has an embedded preview but neither the Adobe codec nor the FPC codec can extract it, contact Adobe and the FPV team and send them a sample.

-- Mario