Pre-purchase questions for empowering two iMatch users

Started by MrPete, August 23, 2023, 06:14:28 PM

MrPete

NOTE: This is intended to be all about one overall topic: empowering two iMatch users. If it's too much for one post/discussion, I can break it up....

Background
* I'm not yet an iMatch user. Looking for a solution for our context.
* It's just me (Pete) and my sweetie (Leslie)... eventually yes I'd love to give extended family/friends/etc access to portions of our media library. But that's not crucial.

She: semi-pro nature photographer.
* Current workflow: Nikon (fave: D500+200-500 zoom) RAW+JPG -> DXO -> (some pro printing, some online posting)
* Keeps her photos (~350,000) in a scientific-order tree (Class-Order-Family-Genus-Species). Yep, she actually thinks that way :) ... but also needs date and location sometimes. (Location for Birds is usually by County in the USA.)
* She also does some people/event photos.

Me: mostly tech support, plus I'm the designated extended-family archivist, converting LOTS of slides, photos, film, video, etc to digital. I'm an ex-SiValley guy, with a nonprofit serving nonprofits worldwide for 30 years.
* I have a nice (Samsung Note 20 Ultra) phone cam when I need to take pix (why compete with sweetie ;) )
* My workflow involves various capture methods, AVIdemux scripts, NeatImage/Video, Picasa (hoping iMatch will take over the facial recognition), and some Adobe. Under 200k files.
* I mostly do stuff with people and events, plus a little other stuff. My library is organized by date; I also sometimes need location and category.

Hardware: we are pretty well set.
* She needs frequent upgrades; currently shooting 5 x 25MB (RAW+JPG) per second stills ;)
* We each have many TB of storage. Backup is RAID6 NAS, 5x10TB.
* A very nice friend recently set me up with a pair of VERY nice (ProxMox) VM hosts. (His perspective: I'm "only" at 1Gb networking, with ability to go to 10Gb. He runs 100Gb Mellanox at home ;) )

Relevant Purpose

Looking at IM and IMA, I think these are the relevant "What we do and how it relates to IM/IMA" items:
* AFAIK, IM is essentially single user with a single-user database. IMA allows multi-user metadata editing but no add/removal of actual media files.
* Neither one of us processes media full time, although our bursts of effort can certainly overlap. We both definitely add/remove media files.
* Clearly, we have two different methods for organizing our actual media. (Having discussed this, she cares a LOT about her on-disk organization. I don't care that much as long as the software lets me find what I need.)

Bottom line: we would very much like a (set of) tools that help us better manage our media library, keep it effectively organized, searchable, stackable, and shareable. And not cost an arm and a leg.

QUESTIONS

A. Related to the challenge of two people both adding media files.

1. Does IM support (now or planned or dreamed ;) ) any form of database export/import for its OWN needs (eg effective backup/restore, db cleaning, etc)?
- If so, presumably I could write code to transfer/merge media collections
- (I am not talking about pack-and-go ;) )

2. Is there any guidance already on best/workable ways to manage IM/IMA plus two media stores?
- (I assume below that IMA is used when/where appropriate)
- Below are a few variants I can imagine...

a) Remote IM DB (I've seen discussion here that IM/ExifTool is painful with a remote media library, so maybe this is bad?)
- User/Computer L with local 5TB of media
- User/Computer P with local 5TB of media
- (Possibly) VM on LAN with local iMatch database, pointing to computers L and P

b) Pick a local IM DB
- User/Computer L with local 5TB of media and iMatch
- User/Computer P with local 5TB of media, referenced by L computer's iMatch

c) Two IM DB's, occasional xfer/merge (if possible)
- User/Computer L with local 5TB of media and iMatch
- User/Computer P with local 5TB of media and iMatch
- (Perhaps: we just cooperate. non-nature media get added to P computer; nature stuff on L computer; keep separate)

d) One hosted Windows VM for IM
- Move the whole shebang to VM on capable host (with nVidia graphics).
- Access iMatch/IMA over the LAN
- Experiment with even running workflow on VM or downloading locally for edits.
- Either take turns using IM, or have two IM's on the VM host.

2a) Related: I've seen a discussion where Mario talks about firing up IM on remote (Azure) VM's for development. Are there happy users with significant media libraries all running on a remote VM host, perhaps on-LAN? (I use ProxMox. OwnCloud and other software emulates internet cloud services...)

B. Related to iMatch / IMA capabilities

3. Is there a comparison of metadata editing capabilities of IMA vs IM? Or are they essentially identical?

4. Is there any issue with iMatch managing two media libraries, with completely different physical folder organization? (ie scientific vs date)

5. What other questions SHOULD I be asking? :)


These are probably beyond the scope of this topic... but I'll put them here for now:

6. Anyone tried auto-tagging check pix we take before depositing at the bank? ie recognize one or more rectangular bits of paper ;) 

7. I know iMatch can import Picasa (XMP) face tags. Trickier challenge: Picasa has some bugs that cause face tag locations to degrade.
  Question: Is iMatch able to re-recognize faces, and take advantage of the imported naming? (I'm thinking: as long as the iMatch rectangle is ~~ same as Picasa rectangle, then accept it as a good face/name. Then in another photo, if NOT a good match, it can still find the face and give a good name ;) )
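
A minimal sketch of the "roughly the same rectangle" test from question 7, using intersection-over-union (IoU), a common overlap measure. This is an assumption for illustration only, not something iMatch is known to do. `Rect`, `iou`, `same_face` and the 0.5 threshold are hypothetical, and the rectangles here use a top-left corner plus width and height, so real XMP face regions would need converting to that form first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle: (x, y) is the top-left corner, any consistent unit."""
    x: float
    y: float
    w: float
    h: float


def iou(a: Rect, b: Rect) -> float:
    """Intersection-over-union of two rectangles, between 0.0 and 1.0."""
    # Width and height of the overlapping area (zero if the rectangles are apart).
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    # Zero-size regions have no area, so they can never count as overlapping.
    return inter / union if union > 0 else 0.0


def same_face(imported: Rect, detected: Rect, threshold: float = 0.5) -> bool:
    """Hypothetical rule: accept the imported name if the regions overlap enough."""
    return iou(imported, detected) >= threshold


if __name__ == "__main__":
    picasa = Rect(0.40, 0.20, 0.10, 0.15)    # imported (possibly degraded) region
    detected = Rect(0.41, 0.21, 0.10, 0.15)  # freshly detected region
    print(same_face(picasa, detected))
```

With a rule like this, a degraded Picasa region that has drifted away from the real face would fall below the threshold and could be flagged for re-detection rather than trusted.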

Mario

1. No.
IMatch allows you to import and export categories, the thesaurus and Attributes.
Virtually all database contents can be exported to text and are accessible via the built-in app ecosystem in IMatch.

2. What do you mean by media store?
IMatch can handle virtually any number of drives and UNC shares per database.

2a. It's not painful.
Just a lot slower than working with files on local storage (local meaning: disks on the same computer IMatch is running on). IMatch can read images and videos 1000 times faster from a local SSD than over a network. Physics.
The same, even more so, is true for storing metadata in files. When writing, ExifTool produces a copy of the original file, with the modified metadata. Only when this works out 100% correctly does it delete the original and rename the copy. The copy is created in the same folder as the original image. And if all that happens over a network, the penalty of slow network traffic is doubled.
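
The copy-then-swap write pattern described above can be sketched in Python. This is a rough illustration under stated assumptions, not ExifTool's actual code: `rewrite_safely` and its `modify` callback are hypothetical names, and `os.replace` swaps the copy in with one call where the description above has a delete followed by a rename. The point is that every write means one full read plus one full write of the file in the same folder, which is why a network share is hit twice.

```python
import os
import shutil
import tempfile
from typing import Callable


def rewrite_safely(path: str, modify: Callable[[str], None]) -> None:
    """Modify a copy of `path`, then swap it in only if everything succeeded."""
    folder = os.path.dirname(os.path.abspath(path))
    # The temporary copy lives next to the original, so on a NAS both the
    # read and the write travel over the network.
    fd, tmp = tempfile.mkstemp(dir=folder, suffix=".tmp")
    os.close(fd)
    try:
        shutil.copy2(path, tmp)  # full read + full write of the file
        modify(tmp)              # e.g. update metadata in the copy
        os.replace(tmp, path)    # the original is replaced only now
    except BaseException:
        # On any failure the original stays untouched; drop the partial copy.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    demo = os.path.join(demo_dir, "photo.txt")
    with open(demo, "w") as f:
        f.write("pixels")

    def add_caption(copy_path: str) -> None:
        with open(copy_path, "a") as f:
            f.write(" +caption")

    rewrite_safely(demo, add_caption)
    with open(demo) as f:
        print(f.read())
```

The design choice is safety over speed: a crash or a write error halfway through leaves the original file intact, at the cost of moving the whole file twice.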

Also, if you don't use pro-grade storage, but affordable NAS boxes: they run some sort of Linux which runs some sort of Samba which simulates a Windows server file system. All these extra layers can have bugs or issues, which often manifest only under heavy load, which IMatch is able to produce when processing 5, 10 or 20 files at the same time....
See IMatch and NAS Systems (Network-Attached Storage) for more info.

General guidance: Database on the local SSD, images on the local disk until finalized (all metadata set). Then move images to network storage when desired.

3. IMatch allows for a lot more metadata editing. IMatch Anywhere WebViewer is designed for users who don't have DAM experience (commercial customers, mostly) and the File Lens allows easy access to, and editing of, common metadata tags, keywords and categories - with detailed access control by the IMA Admin.
Tip: Install the free IMatch Anywhere Trial version and see for yourself.

4. IMatch can manage all files and folders accessible for Windows.
I still don't know what you mean by media library (see 2).
In almost all cases, it does not make sense to work with multiple IMatch databases.

6. The IMatch AI model has not been trained on checks or other documents.

7. Picasa produced crappy XMP metadata in different flavors over time. But it was free, so people did not complain. Or when they did, Google just ignored them.

IMatch can only trust existing XMP face regions. It loads them and performs face recognition within the region.
There is a special case for Apple's zero size XMP face regions (many Apple devices write/wrote face regions with a width and height of zero).

See Working with XMP Face Regions for more related information.


-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

MrPete

THANK YOU! Very helpful.
A few quick bits of further discussion:

Quote from: Mario on August 24, 2023, 10:39:53 AM
1. No.
IMatch allows you to import and export categories, the thesaurus and Attributes.
Virtually all database contents can be exported to text and are accessible via the built-in app ecosystem in IMatch.
So, sounds like full export is not a problem ("virtually all database contents"), just full import. It is what it is  :)

Quote:
2. What do you mean by media store?
I'm talking about two media collections. And... thinking about the fact that our two collections are organized very differently. Sounds like that's simply not an issue for iMatch.

Quote:
2a. It's not painful... General guidance: Database on the local SSD, images on the local disk until finalized (all metadata set). Then move images to network storage when desired.
That sounds VERY interesting. I've now read up on file/folder moves in iMatch. Is the following an existing feature I missed?
  • Have a preference for Local High Speed Working Folder [ie a local folder with plenty of space and performance. I have a RAID 0 SSD pair for this purpose. Some people have NVMe on motherboard, but not normally enough space on that to hold their whole library.]
  • In folder management, have the ability to mark a folder (tree) as "Active" or some such. This causes that folder to temporarily be copied to the LHSW folder, and all activity takes place there. Could allow "Apply" (copy current state back to slower storage) and "Deactivate" (done. copy back and delete from LHSW.)

I vehemently agree: it's 100% true that local high speed working storage is incredibly better in almost any situation. Even with SSD's... my RAID0 SSD pair is twice as fast as a normal SSD. And "infinitely" (hah) better than cross-LAN storage.

Quote:
6. The IMatch AI model has not been trained on checks or other documents.
I may be able to provide some help (long term) on improving and/or adding flexibility to the training. See offline email coming your way soon. ;)


Quote:
7. Picasa ... IMatch can only trust existing XMP face regions... There is a special case for Apple... Apple devices write/wrote face region with a width and height of zero.
So, if I modify the sidecar to set regions to width/height zero, it just might do the recalculation needed? ;)

Always happy to abuse software to do what I want.  ;D

Mario

Quote:
So, if I modify the sidecar to set regions to width/height zero, it just might do the recalculation needed?
I don't understand. Why would you meddle with XMP regions manually?
If Picasa has created invalid face annotations for some of your files, just delete and re-create them in IMatch for affected images. IMatch re-creates the XMP regions correctly during write-back.


Quote:
In folder management, have the ability to mark a folder (tree) as "Active" or some such. This causes that folder to temporarily be copied to the LHSW folder, and all activity takes place there.
This sounds way too specialized and niche.

For more than a decade, IMatch users have easily been able to do that without special features or complex words like LHSW.
It's a simple workflow: you keep your images and videos on your local system until you are finished editing and processing them. Then you move them to long-term archival storage.
Problem solved.
The performance penalty of NAS only affects the final copy operation, when you archive the files.

IMatch keeps cache files locally and the metadata is cached in the database. Except for file system operations like "Does this folder exist" or "Does this file exist" which IMatch performs to indicate off-line files (and caches in memory for some time), the performance of the long-term storage system does not matter.

If you have to re-process a large chunk of archived files (which should be really rare) and you find that your NAS is too slow to work efficiently, copy the files from the NAS to a local folder, edit them, reprocess metadata etc. and finally move them back to the NAS when the work is done. This is easy to do with IMatch in the Media & Folders View.

If you follow the advice given in IMatch and NAS Systems (Network-Attached Storage) and reduce the number of parallel processing threads, performance for processing files stored on NAS systems is usually tolerable.
Way slower than processing files from a local disk, but tolerable.

Keeping the IMatch database on the fastest disk/SSD on the computer running IMatch is very important, though.

Tveloso

Quote from: Mario on August 28, 2023, 10:46:44 AM
It's a simple workflow: you keep your images and videos on your local system until you are finished editing and processing them. Then you move them to long-term archival storage.
Problem solved.
You should have a look at The Renamer Help Topic:

https://www.photools.com/help/imatch/ren_basics.htm#

...which can be an integral part of the workflow Mario describes here. I regularly use two Renamer Presets ("Store Photos" and "Store Videos") that handle moving the files to the archival storage (and naming them per my naming convention)...and a few others to "clean" some Filenames in the local Ingest Folders (to trigger certain Buddy Relations).
--Tony

MrPete

@Mario, the workflow you're describing is great as long as the user is processing new files.

Consider, however, working through the cleanup and processing of a hundred thousand existing files.

Is it really that rare for someone to need to work through a big pile of already-existing digital data?

(Oh, and you're 100% correct about the face recognition: just redo as needed. It's been sooo long since I had a reliable system for that :) )

Jingo

Quote from: MrPete on August 28, 2023, 01:36:22 PM
@Mario, the workflow you're describing is great as long as the user is processing new files.

Consider however, working through the cleanup and processing of a hundred thousand existing files.

Is it really that rare for someone to need to work through a big pile of already-existing digital data?

Not rare at all... what I did in this case: I loaded the NAS drive locally, processed the files, then moved the drive back to the NAS.  The speed gains are 100% worth having the data locally attached rather than going across a network.  

For smaller chunks of files, I follow Mario's advice and just copy the files back and forth during "off-hours" using a fast copy tool like FastCopy.

MrPete

Quote from: Jingo on August 28, 2023, 02:01:38 PM
I loaded the NAS drive locally, processed the files, then moved the drive back to the NAS.
A little tricky with RAID 6... but I totally agree on the performance of local.

I'll look into some of the newer ways to improve direct local access.

My other thought is to operate in a network-accessible VM. Haven't heard any response to that part of my question above. Sounds like I get to do some research across the board. Such fun!  8)