200'000 images in IM5

Started by sinus, April 17, 2015, 03:37:02 PM

sinus

Hi
Not important, just to let you know:

I have at the moment 200'348 files, managed by IMatch.

There are mostly nefs and jpg, but also files with the format:

tif
psd
gif
bmp
doc
indd
mov
txt
mp3
sla
stv

and so on...

And IMatch works very well, no problems!
When I have over 300'000, I will let you know again!  ;D
Best wishes from Switzerland! :-)
Markus

Mario

Sounds great. Thanks for the feedback  :)

stonecherub

What would you do with 200'000 images if you didn't have IMatch? Just asking.

jeknepley

255,451 here, but I plan to reduce that number by getting rid of files that are duplicates, unneeded versions, etc.

IM5 handles them really well (once Mario got further along in the releases). No complaints - just kudos.

JohnZeman

Quote from: stonecherub on April 17, 2015, 08:36:49 PM
What would you do with 200'000 images if you didn't have IMatch? Just asking.

Excellent question.  I have asked a slightly different version of it in other photography forums I belong to, forums that consist mostly of amateurs, and their answers always amuse me.  For the most part their answers range from some variation of "I don't know" to "I just hope to get lucky."

When I hear that I always say using IMatch I can find any image or combination of images I want from my 50,000+ image database in about 5 seconds or less.  One would think that would prompt at least some of the others to ask how I can do that, but there is always very little response.  I think that's because most of them are as intimidated by their computer as they are with their camera gear.

unterwasserfoto_at

@SINUS
Respect!

Many years ago you listed your workflow on your site.

Can you tell us how your workflow works now? That would be interesting with 200.000 images.

Best Tom
Professional photographer in Austria.
HW: Nikon D5, Nikon Coolscan 5000, Subal underwater housing, DJI Phantom
SW Photoshop CS4, Adobe Bridge, IMatch6

Carlo Didier

And I thought I already had a lot with 82000+ ...
And I still haven't scanned all my old slides and practically none of my wife's negatives!

Mario

Quote
And I thought I already had a lot with 82000+ ...

I think the largest IMatch 5 DB now has about 360,000 files. There was one user who had 500,000 files in IMatch 3 and who complained about fuzzy issues with IMatch 5. But he never replied to my questions or provided a log file, so I don't know if he succeeded in adding 500,000 files to IMatch 5.

500,000 files is an enterprise-sized image database. The competing vendors like Canto, FotoWare, Wise etc. build systems managing this amount of files using dedicated servers, server licenses, consulting, maintenance contracts, or just pull everything into their cloud infrastructures. Needless to say, you won't get this for the 100 bucks IMatch costs. You don't even get a single hour of consulting for only 100 bucks... ;)

sinus

Quote from: stonecherub on April 17, 2015, 08:36:49 PM
What would you do with 200'000 images if you didn't have IMatch? Just asking.

Hmmm, I do not know, really. Long ago I had Portfolio; it was good, but not very good, and IMatch is simply outstandingly good!
Best wishes from Switzerland! :-)
Markus

sinus

Quote from: jeknepley on April 18, 2015, 12:21:55 AM
255,451 here, but I plan to reduce that number by getting rid of files that are duplicates, unneeded versions, etc.

IM5 handles them really well (once Mario got further along in the releases). No complaints - just kudos.

I did the same: duplicates I have eliminated.

What is really a good question: unneeded files.

If I have a nef (raw) and a jpg, does it make sense to keep both of them?
And further: if I take 5 pictures of the same thing (say a group of people), and I have found the best one ... does it still make sense to keep the other 4 (not needed) images?

Really a simple question, but one without an easy answer, I am afraid.
Best wishes from Switzerland! :-)
Markus

zematima

Hi:
What is the size of your database?
I only have about 24000 and the size is 1 779 244KB.(1,8Gb)
Thanks.

Carlo Didier

83132 images, 3.8GB database with 300px thumbnails

jeknepley

My 255,451 files weigh in at 9.05GB (Windows Explorer, but 8.63GB in the IM5 Info & Activity panel)

Included are 3,258 MP3 and 653 PDF files

Mees Dekker

Just over 46.000 images and a database of 1.755.524 kB

JohnZeman

Quote from: sinus on April 19, 2015, 06:59:43 PM

What is really a good question: unneeded files.

If I have a nef (raw) and a jpg, does it make sense to keep both of them?
And further: if I take 5 pictures of the same thing (say a group of people), and I have found the best one ... does it still make sense to keep the other 4 (not needed) images?

Really a simple question, but one without an easy answer, I am afraid.

About 5 or 6 years ago after thinking about this same thing, I made the decision to split my originals and final images between Lightroom and IMatch and I've never regretted it.  For me the originals in Lightroom only have one purpose, that is to be reprocessed again should I ever need to do so and that happens quite often.  However other than that the important images to me are in IMatch.

So each year as I add about 5000 new images, 5000 that soon become 10,000 or more as they are processed, keeping them separate in two different programs is actually far simpler for me than trying to do it in IMatch alone.

Ger

Quote from: stonecherub on April 17, 2015, 08:36:49 PM
What would you do with 200'000 images if you didn't have IMatch? Just asking.

Hmmm... maybe better to turn the question around? Why do you need photography if you have a database with 200.000 images and IMatch to play with?  ;)

sinus

Quote from: Ger on April 21, 2015, 07:58:47 AM
Quote from: stonecherub on April 17, 2015, 08:36:49 PM
What would you do with 200'000 images if you didn't have IMatch? Just asking.

Hmmm... maybe better to turn the question around? Why do you need photography if you have a database with 200.000 images and IMatch to play with?  ;)

If I also take my slides and black-and-white prints into account, then I would surely have over 7 million images. I had and still have a photo agency, hence there are a lot of images from years and years.  ;D

But unfortunately most of them are not scanned, so I guess I will not have the chance to test IMatch with 7 million images.  ;D ;D ;D
Best wishes from Switzerland! :-)
Markus

sinus

John, your solution is fine also, of course. But I love to see masters and versions neatly side-by-side or in a version stack (except the fact, that the master is on top in a version stack, but this is another thing).

So if you do not regret your decision, then it was a good decision.
Best wishes from Switzerland! :-)
Markus

Ferdinand

John's solution might work if you only use LR as a converter, but I use at least four.

I've currently got 95,000 images, only a little over half of them are originals (I aim for quality over quantity, and sometimes I even succeed!).  (I use the term "originals" rather than "masters", as there are many without versions, and so IMatch doesn't regard them as masters.)   I did a cull a while back of my rejected originals, to save space and reduce the DB size. 

My question is: how many versions of an image do I keep?  I often create proofs for a range of purposes.  At some stage I'm going to have to do a cull of these, since many of them are no longer needed, or could be regenerated quickly if required. 

Markus - rest assured I'm not chasing you!

sinus

Quote from: Ferdinand on April 21, 2015, 09:06:03 AM
John's solution might work if you only use LR as a converter, but I use at least four.
Phew, I see. I wonder how you manage that; I could not. For me it would be like having trouble using 2 or even more different camera systems (Nikon, Canon ...).

Quote from: Ferdinand on April 21, 2015, 09:06:03 AM
I've currently got 95,000 images, only a little over half of them are originals (I aim for quality over quantity, and sometimes I even succeed!).  (I use the term "originals" rather than "masters", as there are many without versions, and so IMatch doesn't regard them as masters.)   

I see all files, once they are in IMatch, as originals. If you send me a copy of an image of yours, for me this is an original.
When IMatch detects a version, well, then I have an original (master) AND a version or several versions.

And about quality over quantity: yep, if you do not have clients who want quantity, and that is very often the case here.

Quote from: Ferdinand on April 21, 2015, 09:06:03 AM
My question is: how many versions of an image do I keep? 

If I think I will never use an image again, then I tend to delete either the master or the versions. That way I am not lost, just in case, because I would still have the master or the copy.
If I have a version with a lot of editing done, I tend not to delete this image.

Quote from: Ferdinand on April 21, 2015, 09:06:03 AM
Markus - rest assured I'm not chasing you!

;D  :)
Best wishes from Switzerland! :-)
Markus

sinus

Quote from: zematima on April 20, 2015, 05:22:54 PM
Hi:
What is the size of your database?
I only have about 24000 and the size is 1 779 244KB.(1,8Gb)
Thanks.

My 200'000 images have a db-size of 10 GB. Once it is loaded, I have no problems.
I think I will not split it before I have troubles.

But as it looks now, I can have without troubles 300'000 and more. Not to forget, hardware will be also quicker and quicker, space cheaper and cheaper.
Best wishes from Switzerland! :-)
Markus

Mario

Can you send me a log file (debug mode) from a normal start / close cycle? I'm interested in some of the stats.
10 GB is quite a lot. The database file size on disk does not impact performance though. 70% to 80% of the database size are thumbnails/metadata anyway. Even 300 pixel thumbnails eat up a lot of space when there are 200,000 of them.
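
As a rough illustration of how much the database size per file varies, here is a small Python sketch that uses only the figures posted in this thread so far. The conversion of the GB figures with 1 GB = 1,000,000 KB is an assumption (the posts do not say which unit definition was used), the file counts are the approximate ones given by each poster, and the helper name `kb_per_file` is made up for this example.

```python
# (user, approximate file count, database size in KB) as reported in this thread.
# Assumption: sizes given in GB are converted with 1 GB = 1,000,000 KB.
reports = [
    ("zematima", 24_000, 1_779_244),
    ("Carlo Didier", 83_132, 3_800_000),
    ("jeknepley", 255_451, 9_050_000),
    ("Mees Dekker", 46_000, 1_755_524),
    ("sinus", 200_000, 10_000_000),
]


def kb_per_file(file_count, size_kb):
    """Average database size per managed file, in KB."""
    return size_kb / file_count


for user, file_count, size_kb in reports:
    print(f"{user:<14}{file_count:>9} files {kb_per_file(file_count, size_kb):>7.1f} KB per file")
```

The spread in the result is plausible given the point made above: thumbnail size and the amount of stored metadata differ from user to user, so the file count alone does not determine the database size.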

sinus

Quote from: Mario on April 21, 2015, 11:19:24 AM
Can you send me a log file (debug mode) from a normal start / close cycle? I'm interested in some of the stats.
10 GB is quite a lot. The database file size on disk does not impact performance though. 70% to 80% of the database size are thumbnails/metadata anyway. Even 300 pixel thumbnails eat up a lot of space when there are 200,000 of them.

Of course, Mario.
But first I have to work with IMatch; until 3 pm I must edit product shots, and for this I use IMatch and Photoshop!
What would I do without IMatch? Phew, I am really glad that around 2001 I "set on the right horse", on Mario!  ;D ;D ;D (I do not know what the correct phrase is for this, and have no time to look it up ... in German I mean "auf das richtige Pferd setzen")
Best wishes from Switzerland! :-)
Markus

Ferdinand

I'm glad that I "backed the right horse"?  ("back" as in to gamble on / to place a bet on)

sinus

Quote from: Mario on April 21, 2015, 11:19:24 AM
Can you send me a log file (debug mode) from a normal start / close cycle? I'm interested in some of the stats.
10 GB is quite a lot. The database file size on disk does not impact performance though. 70% to 80% of the database size are thumbnails/metadata anyway. Even 300 pixel thumbnails eat up a lot of space when there are 200,000 of them.

Here is the log that you wished to see.
I hope it helps.

BTW: I am not sure, but in the diagnostics or pack'n'go you show some info, something like:

"the last time this step took 55 minutes".

Such information, I think, is very helpful. When I had troubles with my collections (now NO more), I created a kind of script for myself, simply to check some stuff, and this gave me a feeling of safety.
With this information I can see very quickly if all my collections and files are still ok. It helped me really a lot  :D
(see also the attachment).

If I have time, I will try to write an app that shows me the most important information from the last session and the current session. This will give me quite a lot of security. I mean something like

                        last session    current session
Version IM5             5.4.4           5.4.4
time to load            16 sec.         17 sec.
size of DB              8.4 GB          8.5 GB
files in the db         14'987          15'112
number of folders       ...
number of files
number of categories
number of red pins
...
last time with IM

This would give me a quick overview of my current IMatch db and could prevent some unforeseen troubles.
Of course the info and other scripts give quite a lot of information, but the interesting point would be a comparison between the last time and the current time.

Maybe it would also be interesting because users could compare some information (like here; although it is only the number of files and the size of the database, this can be interesting for users).

But I am afraid that to create such an app, I must first learn a lot (JavaScript and so on) ...  :D (your links some posts above about JavaScript are great!)
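
A minimal sketch of the session comparison described above, written in Python rather than as an IMatch app. It does not read anything from IMatch; the `SessionStats` fields, the function names and the way the values are obtained are all assumptions for illustration, and the numbers are the sample values from the table above.

```python
from dataclasses import dataclass, asdict


@dataclass
class SessionStats:
    """Hypothetical snapshot of one session; field names are illustrative only."""
    version: str
    load_seconds: float
    db_size_gb: float
    file_count: int


def compare_sessions(last, current):
    """Return (item, last value, current value, change) rows, one per field."""
    rows = []
    current_values = asdict(current)
    for name, old in asdict(last).items():
        new = current_values[name]
        if isinstance(old, (int, float)):
            change = round(new - old, 2)  # numeric items: show the difference
        else:
            change = "same" if new == old else "changed"  # text items: flag a change
        rows.append((name, old, new, change))
    return rows


def format_report(rows):
    """Render the rows as a plain-text table."""
    lines = [f"{'item':<14}{'last':>10}{'current':>10}{'change':>10}"]
    for name, old, new, change in rows:
        lines.append(f"{name:<14}{old!s:>10}{new!s:>10}{change!s:>10}")
    return "\n".join(lines)


# Sample values taken from the table in this post.
last = SessionStats("5.4.4", 16, 8.4, 14987)
current = SessionStats("5.4.4", 17, 8.5, 15112)
rows = compare_sessions(last, current)
print(format_report(rows))
```

Keeping the snapshot in one small record makes it easy to save it at the end of a session and load it again at the next start; how the values would actually be collected from IMatch is left open here.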

[attachment deleted by admin]
Best wishes from Switzerland! :-)
Markus

sinus

Quote from: Ferdinand on April 21, 2015, 02:05:13 PM
I'm glad that I "backed the right horse"?  ("back" as in to gamble on / to place a bet on)

Thanks, Ferdinand, this is a good one.
Best wishes from Switzerland! :-)
Markus

Mario

The log looks good.
The only really slow operation is loading the database (slow meaning: takes longer than 5 seconds), which is normal.
I don't know if you use a SSD yet.

A log file from a normal IMatch session (where you actually work for some hours) would also be useful, because it contains statistics for things like category and collection updates, file window performance etc. You can send it to me if you like.

sinus

Quote from: Mario on April 21, 2015, 04:49:41 PM
The log looks good.
The only really slow operation is loading the database (slow meaning: takes longer than 5 seconds), which is normal.
I don't know if you use a SSD yet.
Thanks, that reassures me.
No, I do not use an SSD, and I think my computer is not a very quick one. If I used an SSD, it would be quicker.
But I am really happy with how IMatch works now.

Quote from: Mario on April 21, 2015, 04:49:41 PM
A log file from a normal IMatch session (where you actually work for some hours) would also be useful, because it contains statistics for things like category and collection updates, file window performance etc. You can send it to me if you like.

Yep, I will do, thanks!
Best wishes from Switzerland! :-)
Markus

sinus

Quote from: Mario on April 21, 2015, 04:49:41 PM
A log file from a normal IMatch session (where you actually work for some hours) would also be useful, because it contains statistics for things like category and collection updates, file window performance etc. You can send it to me if you like.

Hey Mario,
here is another log, this time IMatch was running several hours. Hope, this helps you.


[attachment deleted by admin]
Best wishes from Switzerland! :-)
Markus

Mario

Looks good. Smooth run.
2 GB RAM used in peak (Viewer), also normal.
I've taken some notes for potential improvements for future versions.

sinus

Quote from: Mario on April 23, 2015, 08:24:14 PM
Looks good. Smooth run.
2 GB RAM used in peak (Viewer), also normal.
I've taken some notes for potential improvements for future versions.

Thanks, Mario,

your comment gives me a good feeling!  :)

As I said before, IMatch now runs really well for me: no crashes, quick enough, simply the best.
Best wishes from Switzerland! :-)
Markus

Hobbyfotograf

Hi Mario

I have 261'000 files in the database, mostly because I always shoot RAW + JPEG and don't like to delete pictures... Works without any troubles now. But I found out a few important things to run it smoothly:

- Always compact and optimize after adding pictures
- Never substantially more than 2000 images in one folder

I have my database on a very fast SSD.

File size: 12.23 GB
Start up time: around 15 seconds
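
The folder-size rule of thumb above can be checked with a small script. This is only a sketch under assumptions: it counts every file on disk with Python's standard `os.walk`, not only the files IMatch actually manages, and the function name `folders_over_limit` is made up for this example. The default limit of 2000 is the figure from the list above.

```python
import os


def folders_over_limit(root, limit=2000):
    """Return (folder, file_count) pairs for folders holding more than `limit` files.

    Counts all files found on disk below `root`; largest folders come first.
    """
    oversized = []
    for folder, _subdirs, files in os.walk(root):
        if len(files) > limit:
            oversized.append((folder, len(files)))
    return sorted(oversized, key=lambda item: item[1], reverse=True)
```

Calling `folders_over_limit(r"D:\Photos")` (a hypothetical path) returns the folders that exceed the limit, which shows where splitting a folder might be worth considering.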

Mario

Adding a thousand files can easily add a million records to the database, depending on how much metadata your files contain and how you configure the Tag Manager (keeping out unneeded tags greatly reduces the database size and keeps up performance).

Quote
Never substantially more than 2000 images in one folder

I test with folders up to 10,000 files. Windows becomes slower when you add more than a few hundred files in a folder, especially when you run the Windows search index and other gotchas. And of course it makes a difference whether IMatch has to check 500 files every time Windows reports a change in the folder, or 5000 files.

And of course features like file relations, thesaurus lookups to match keywords of imported files, metadata templates and other sophisticated features in IMatch may reduce performance when adding files, or cause millions of database and file system lookup operations.

Hobbyfotograf

Well, my biggest folder also has around 10'000 files. It does work, but everything is a lot slower. The main problem with big folders isn't IMatch anyway, but GeoSetter.
I've got a lot of file relations, all JPEG and RAW are related, all JPEG and HDR are related. That means, of the 26000 files, 130'000 are masters.
But I'm not complaining at all! IMatch works very fast. I just optimize it as often as possible.

I will have a closer look at the tag manager, as soon as I find time.

sinus

Quote from: Hobbyfotograf on April 30, 2015, 10:12:40 AM
That means, of the 26000 files, 130'000 are masters.

You have such a lot of images, that you forgot a 0, I guess, you mean 260'000 files.  ;D

I have roughly about 2'500 files in a folder, the biggest are about 6'000 files (I work with monthly folders).
Best wishes from Switzerland! :-)
Markus

Tallpics

I know this thread wasn't started so that we can all boast how 'mine is bigger than yours' so I am only posting this for yours and Mario's information :-)

I am a pro sports and gig photographer meaning I take more pics than many photogs. You may not believe that when you see the numbers below!!

I actually edit quite ruthlessly. However as I save an edited 'finished' JPG version of most NEF files this nearly doubles the number of files in my database. That's my excuse :-)

Firstly I would like to say that recent versions of IMatch5 have proved very stable and the speed is very good indeed. The interface and the depth of information that can be searched and sorted is truly amazing. Well done Mario :-)

So to the approximate numbers for my database:

Database size: 27GB
Number of files (mostly NEF, JPG, PSD and TIFF): 340,000!
Number of Categories: 5,000
Number of folders: 200
Cache size: 340GB

Here are a few more details:

I have IMatch5 set to make a full size cache version of every image (including JPGs). This allows me to take a 'complete' version of the database including the cache to clients on a laptop, allowing them to view pics full-screen and even zoom to 100%. This obviously explains the huge cache size.

I also push the boundaries by setting IMatch5 to render 600 x 600 pixel thumbnails. The resulting 'thumbnail view' in the database is superb and still allows me to scroll through folders with very little lag. I found the default size (200 pixel I think) too small on a large screen. I had also been spoilt by Lightroom's larger thumbnails.

My desktop computer is several years old - Intel i7 CPU 975 @ 3.33 GHz with 12GB RAM.

I have upgraded the desktop system disk to a Samsung 500GB Pro SSD. The IMatch5 database and Cache are also stored on a separate Samsung 1TB Pro SSD.

The laptop is about the same age and has been upgraded with two SSD's.

Mario has rightly suggested that IMatch5 was never designed to handle such big numbers. This sort of work should be handed over to 'specialist' programs costing many, many times more. But I have 'grown-up' with IMatch5 and love the latest version and trust Mario to respond to any problems.

I do have a plan... should it be needed when my database finally grows beyond what IMatch5 can handle. I will split the Sports and Gigs into two new databases - so halving their size. Files are currently stored in separate Sports or Gigs folders. I've already checked with Mario if this is possible and there is a procedure that will enable this :-)

I hope this comment enlightens other users and gives you confidence that, in this digital age of taking so many pics, IMatch5 should see you well into the future.

I read earlier in this thread that Mario likes to see an occasional LOG file from larger databases so I have attached a recent one after a session adding files and compacting the database. I hope it is informative.

Thanks Mario for your great work on IMatch5

[attachment deleted by admin]

Mario

Thanks for the valuable feedback.
340,000 files - gosh.
I can almost hear the guys at the big competitors blushing... I wonder what Widen, Canto et al. would suggest (hardware, software, licenses, training, consulting) for an archive of that size. I consider it an easy bet that it would cost somewhat more than 100 US$.

Unless you have already, please consider writing a short review at Capterra (https://www.photools.com/community/index.php?topic=4512.0) about IMatch. Telling potential users about such a massive data volume processed in a professional environment should give them some important insight.

sinus

Hi Tallpics

wow, cool, your review here is really very valuable for me! (It would also be interesting for potential users on Capterra; it would be nice of you if you could do a review there.)

I do it like you: I trust Mario!  ;D
And I thought the same: when my DB gets too slow, I will simply split the db into two. But I think this will not happen so quickly ... when I see your 340'000 images, phew!

Thanks for sharing your system!
Best wishes from Switzerland! :-)
Markus

Photon

While running the PC overnight I successfully imported 210000 JPG photos with 15000 MPG/WMV videos and about 25000 various other office/multimedia files.
The database file *.IMD5 has a size of 6750 MByte. Up to now everything works fine!

I do not know how to check how long the import really took. In the morning there was a pop-up window on the PC with the message:
"WIC codec required... Without the WIC codec, IMatch may be unable to create thumbnails... Name of the file: ...drive\path\*.flv"
In the window there was a check box "Don't show again" and the "Ok" button.
Some hours ago, there was another IMatch pop-up of my ffdshow application. This well-known pop-up offers options for applications and video files, which are not yet white- or blacklisted in ffdshow.

I assume that the pop-ups did not block the import and the import still continues in the background. Is this the case?

Regards, Martin


| IMatch v5.5.8 + Win7proN64bit | Lumix, Pentax |
| ExifTool, ImageMagick, GeoSetter | JPhotoTagger, MusicBee | CaptureOne, LightRoom | jAlbum, WingsPlatinum, Mobjects |

herman

Quote from: Photon on May 08, 2015, 02:54:14 PM
The database file *.IMD5 has a size of 6750 GByte. Up to now everything works fine!

Over 6 terabytes?
Really?  :o
Enjoy!

Herman.

Mario

When IMatch ingests a file for which it requires by default a WIC codec, you see this message. This message does not pause the import.

If a WIC codec or ffdshow (which is apparently used by Windows when IMatch requests a thumbnail for a video file on your system) displays a message box, this may halt the import process. I don't know anything about ffdshow (I use Kodi and VLC), but maybe you can turn this message off. But chances are that you won't see this anymore because IMatch has imported all files.

Photon

Quote from: herman on May 08, 2015, 03:06:58 PM
Quote from: Photon on May 08, 2015, 02:54:14 PM
The database file *.IMD5 has a size of 6750 GByte. Up to now everything works fine!
Over 6 terabytes? Really?  :o

Oops! You are right Herman, was my typo and is now corrected. TByte is only nearby and might be reached during the next weeks. :-)
The corresponding database file is really 6,720,868 kByte which gives 6.409519196 GByte.
"," is the thousand separator and "." the decimal point. In Germany it is defined in the opposite way.
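
The conversion above can be reproduced with a few lines of Python. This sketch assumes binary units (1 GByte = 1024 MByte = 1024 × 1024 kByte), because that is the definition that yields the 6.409519196 GByte quoted above; the function name `kbyte_to_gbyte` is made up for this example.

```python
# Assumption: binary units, i.e. 1 GByte = 1024 * 1024 kByte.
KBYTE_PER_GBYTE = 1024 * 1024


def kbyte_to_gbyte(kbyte):
    """Convert a size in kByte to GByte using binary units."""
    return kbyte / KBYTE_PER_GBYTE


size_gbyte = kbyte_to_gbyte(6_720_868)
print(f"{size_gbyte:.9f} GByte")
```

With decimal units (1 GByte = 1,000,000 kByte) the same file would come out as 6.720868 GByte, which is one reason why different tools can report different sizes for the same database.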

Regards
| IMatch v5.5.8 + Win7proN64bit | Lumix, Pentax |
| ExifTool, ImageMagick, GeoSetter | JPhotoTagger, MusicBee | CaptureOne, LightRoom | jAlbum, WingsPlatinum, Mobjects |