"show all images with Peter and Susan but not Frank"

Started by DSR, May 26, 2023, 07:25:50 PM


DSR

Does IMatch have the ability to show all images with 2 people exclusively?

IOW, something like "show all images with Peter and Susan only"

I am working on a picture store littered with a timeline of picture folders, but also a ton of folders with various cross-sections of people.  For example:

Bob & Sally
Sally & Bill
Peter and Susan
Peter and Bob & Susan
Bob & Jack
Sally & Bill & Jack
Bob & Jack & Ron & Tom & Don

every combination you can imagine.  And in each folder are copies of pics from the timeline. The picture store is ~10x what it should be if organized correctly.

Facial recognition is the least of my problems.  I need something that will allow me to do exact boolean searches (and that would include something like an 'only' modifier). 

Right now, I have a hacky way of making it happen, but I'm hoping for something less complicated.

thanks

Mario

Hm, that's indeed a bit challenging.
Assuming you have a category "Peter" with all images containing Peter.
And a category "Susan" containing images of Susan.
Both categories may also contain files showing Bob and Sally (together with either Peter or Susan or both).

If I understand you correctly, you need to find images showing

"Peter" AND "Susan" (in the same image) BUT NOT other persons

As you anticipated, this is quite tricky to do, even with IMatch's powerful categories and formulas.

The "Susan" AND "Peter" part is trivial.
This can be done in the People View by simply clicking both and using the "AND" option.

It's the "but no other persons" that is tricky.
Susan, Peter, Tom, Bob and probably 100 other people etc. all share the property that they are persons. This makes them hard to "grab" and filter. They will end up in the same data-driven category hierarchy, the same formula results etc.

A first idea to solve this would be to create a data-driven category which counts the number of persons in an image:

[Screenshot: Image2.jpg]

This produces a category like this:
PersonsInImage
  |-- 1
  |-- 2
  |-- 3
  |-- 4
  ...

Now you can use a category formula like this:

"@Collection[Annotations|Region|People|Peter]" AND "@Collection[Annotations|Region|People|Susan]" AND "PEOPLE|PersonsInImage|2"

which returns images that contain both Peter and Susan and exactly 2 persons in total.

This is based on the assumption that you use IMatch's face recognition and people feature.
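
To illustrate the logic of that formula outside IMatch, here is a minimal Python sketch. It is not IMatch's formula language, and the `people_by_file` mapping, the file names and the `only_these` helper are made up for the example. It only shows that "contains Peter AND contains Susan AND person count is 2" selects the files whose person set is exactly {Peter, Susan}.

```python
# Hypothetical data: which tagged persons appear in each file.
people_by_file = {
    "a.jpg": {"Peter", "Susan"},
    "b.jpg": {"Peter", "Susan", "Frank"},
    "c.jpg": {"Peter"},
    "d.jpg": {"Susan", "Peter"},
}


def only_these(people_by_file, wanted):
    """Return files that show every person in `wanted` and nobody else."""
    wanted = set(wanted)
    return sorted(
        name
        for name, people in people_by_file.items()
        # all wanted persons are present AND the person count matches
        if wanted <= people and len(people) == len(wanted)
    )


print(only_these(people_by_file, ["Peter", "Susan"]))  # ['a.jpg', 'd.jpg']
```

The count does the work of the "only" modifier: b.jpg contains both Peter and Susan but is rejected because it has 3 persons.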

Alternative 2

If you don't want to use face recognition and people, you need to assign files to "Person" categories manually. IMatch needs to know which persons are in an image. For example, use a hierarchy like

PERSON
  |-- Susan
  |-- Peter

and combine that with a data-driven category based on a variable that counts in how many PERSON|* categories a file is.

Alternative 3

If you can program (Python, JavaScript, PowerShell, ...) you can write a script that retrieves the data from the IMatch database and then figures out what you want to know. IMatch offers a rich programming interface.

Maybe I'm overthinking this and other users have better solutions. Let's give it a day or so.
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

DSR

Your thought process is similar to mine...

The problem isn't easy if you want two people only (if and only if it means no other faces in the pic).

It's less difficult if you want two people only (if and only if it means no other TAGGED faces in the pic).

A count would solve the problem for the most part (at least with respect to no other tagged faces).

If there's no count, then searching for the 2 people and then counting the number of face tags on each result would work. That assumes all face tags are prefixed by a unique character so you can pick out the people; alternatively, the faces could be stored separately from other tags.
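
A rough Python sketch of that prefix idea, just to make it concrete. The "@" prefix, the tag list and both helper names are assumptions for illustration, not anything IMatch prescribes.

```python
PERSON_PREFIX = "@"  # assumption: every person tag starts with this character


def person_tags(tags, prefix=PERSON_PREFIX):
    """Return only the tags that name a person, with the prefix stripped."""
    return {t[len(prefix):] for t in tags if t.startswith(prefix)}


def shows_only(tags, wanted, prefix=PERSON_PREFIX):
    """True if the person tags are exactly the wanted names."""
    return person_tags(tags, prefix) == set(wanted)


tags = ["@Jack", "@Jill", "beach", "2012"]
print(len(person_tags(tags)))              # number of tagged persons
print(shows_only(tags, ["Jack", "Jill"]))  # other tags are ignored
```

This only covers tagged faces; an untagged third face in the picture would still slip through.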

One other way this could be achieved (for this particular user) would be to keep their existing melange (mess) of folders and convert all the copied pics to shortcuts (I can code something up to do this pretty quickly).  This would cut the size of the picture store down greatly and would 'probably' work, but they'd need to continue to operate in this wacky manner (copying shortcuts rather than the actual pic) going forward.  It also might be more prone to break if/when they change drives or machines, unless they're careful about it.

Now that said - do many users (who manage their pics locally and not on an online platform) use shortcuts to create albums?  i have no idea, but i kind of doubt it.

thanks

sinus

Sorry, I do not understand fully.

Do I understand correctly that you are not talking about one event (or user), but that you have this situation several times?
And roughly how many images are there, and are there 15 or 100 named persons?

I guess I cannot help, because you have found one way, the same as Mario's. I thought about searching, but it depends.
Best wishes from Switzerland! :-)
Markus

DSR

I'm trying to help a relative out.  She has 3 TB of photos.  Without dupes, it's about 750 GB. She'll need to assign tags to maybe 200 people, but the main focus is closer to 50.  Finding identical pictures (many with different filenames attached to them) is easy.  Renaming each group of identical pics is a bit more challenging, but doable.  The final piece is the ability to do boolean searches on various combinations of people.  Not just finding Jack and Jill, but only Jack and Jill. A count probably solves that.  Now, that said, tag searching via exif info is time-consuming vs searching on file names, but I don't want to laden file names with tags.  Tags stored in a DB is probably a better if not more fragile solution.  But to get going, if searches aren't prohibitively slow, it's a small price to pay for a much more organized system.
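
For reference, the "finding identical pictures" step can be sketched in Python by grouping files on a hash of their content, so the file name does not matter. This is only one possible approach, not necessarily what the dupe tools do; the function names are made up, and it only catches byte-identical copies (not resized or re-saved versions).

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's bytes, read in chunks to keep memory use low."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Map digest -> list of paths, keeping only digests seen more than once."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {d: paths for d, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicates` on the top folder of the picture store would return one group per set of identical files, whatever they are named.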

sinus

Hm, thanks.
750 GB tells not a lot about the number of photos, because you know, a picture can be e.g. 100 KB or 20 MB.
But let's say each pic is 2 MB, then you would end up with 375,000 images, if I did not make a mistake in my calculation. 8)
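
The estimate as a quick Python check, assuming decimal units and the guessed average of 2 MB per picture:

```python
total_bytes = 750 * 10**9      # 750 GB, using decimal units
avg_bytes = 2 * 10**6          # assumed average of 2 MB per picture
estimated_files = total_bytes // avg_bytes
print(f"{estimated_files:,}")  # 375,000
```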

That is a lot, especially for an amateur, which a relative usually is.

If folders like
Sally & Bill & Jack
Bob & Jack & Ron & Tom & Don

really contain only images of the people in their titles (e.g. Sally & Bill & Jack), then I would simply select all images in a folder and add those names to metadata tags. Then I would also store the number of people in a metadata tag, here 3 (or maybe three).

If you have that many different folder names, then I guess they are quite accurate, so that a folder named "Sally & Bill & Jack" really contains only photos of these 3 people.

Then later you can simply search for "only Sally and Bill", if necessary with the number of persons, and create a better folder structure or whatever.
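
Getting the names and the person count out of such folder titles could be sketched like this in Python. It assumes the titles only use "&" or the word "and" as separators and that no name contains either; `people_from_folder` is a hypothetical helper, not an IMatch feature.

```python
import re

# Split on "&" or the standalone word "and" (assumption: names contain neither).
_SEPARATOR = re.compile(r"\s*&\s*|\s+and\s+", re.IGNORECASE)


def people_from_folder(folder_name):
    """Return the person names encoded in a folder title."""
    return [p.strip() for p in _SEPARATOR.split(folder_name) if p.strip()]


for folder in ["Sally & Bill & Jack", "Peter and Bob & Susan"]:
    names = people_from_folder(folder)
    print(folder, "->", names, "count:", len(names))
```

The resulting names and count are what would go into the metadata tags for every file in that folder.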

And such a search is very fast (except the very first one), regardless of whether you search for file names or tags.

But your situation seems to be so difficult, that I guess, this is also not a good option for you.
As you pointed out, the proposals from Mario are maybe good ways.

 
Best wishes from Switzerland! :-)
Markus

Mario

Quote from: DSR on May 26, 2023, 09:29:22 PM
"Now that said - do many users (who manage their pics locally and not on an online platform) use shortcuts to create albums?  i have no idea, but i kind of doubt it."
What do you mean by album and shortcut?

Quote: "A count probably solves that."
It will. The solution I provided above works fine.
It assumes that information about which persons are in an image is available. This is automatic when you use IMatch's face recognition features.


Quote: "Now, that said, tag searching via exif info is time-consuming vs searching on file names"

Have you experience with a DAM like IMatch?

Because IMatch caches all metadata tags, file names and a lot more in the database. It does not need to access the actual file to perform searches.
There is no difference in search speed between file names and metadata tags.
The File Window search bar in IMatch searches 50,000 files in less than a second, for example.

Quote: "Tags stored in a DB is probably a better if not more fragile solution."
You are misinformed. A DAM like IMatch stores tag values like keywords, titles, descriptions etc. in the database for speed. And also in the file itself, in a standardized metadata format like XMP, GPS, EXIF etc.

See Metadata for Beginners in the IMatch Help System for an overview and introduction.

Quote: "But to get going, if searches aren't prohibitively slow, it's a small price to pay for a much more organized system."
IMatch avoids many reasons for searching by organizing your files automatically by criteria like location, persons shown, events, technical metadata like lens, make, model. Also by workflow tasks like "missing keywords", "no title" etc.

When you have to search, the typical search performance for 100,000 files is < 2 seconds.
That's for the File Window Search Bar.
You can also filter using the Filter Panel, which provides a large number of specialized filter options, from file format to metadata, persons, events and more.
-- Mario

Tveloso

Quote from: DSR on May 26, 2023, 07:25:50 PM
"And in each folder are copies of pics from the timeline. The Picture store is ~10x what it should be if organized correctly."
Does this mean that there are a lot of duplicates in that folder structure, intentionally created based upon the different combinations of the people the photos show?

If so, then the first order of business might be to clean out the duplicates.  IMatch will automatically identify duplicates as the files are being indexed (there are several settings that control how that is done), but then you should maybe not immediately remove the duplicates, since it sounds like they may provide some person identification for the files that you will be keeping, that those files will initially lack.

For example, is it true that some of the files in folder Peter and Susan have been copied into folder Peter and Bob & Susan, because they also contain Bob? It will be trivial to assign the persons Peter and Susan to all of the files in the first folder (whether via Person Links or Categories), but then if some have copies in the second folder, then that subset also needs Bob assigned.

Perhaps there's a way to use the folder filter to help with the task of collecting all the people in each photo...
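
Outside IMatch, the collecting step could look roughly like this Python sketch: every copy of the same picture contributes the names from its folder, and the union is what the kept file needs. The `content_id` values and names are invented; in practice the id could be a content hash of the file.

```python
from collections import defaultdict


def collect_people(copies):
    """copies: iterable of (content_id, people) pairs, one per file copy.

    Returns content_id -> set of every person named for any copy.
    """
    people_by_content = defaultdict(set)
    for content_id, people in copies:
        people_by_content[content_id].update(people)
    return dict(people_by_content)


copies = [
    ("img-001", ["Peter", "Susan"]),         # copy in "Peter and Susan"
    ("img-001", ["Peter", "Bob", "Susan"]),  # same picture in "Peter and Bob & Susan"
    ("img-002", ["Peter", "Susan"]),
]
result = collect_people(copies)
print(sorted(result["img-001"]))
print(sorted(result["img-002"]))
```

Here the first picture ends up with Bob as well, while the second stays a Peter and Susan only picture.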

Once that task is complete, using option 1 that Mario described will work perfectly.

Another option might be to abandon the information implicit in the folder structure, and use IMatch's Face Recognition to get the people information for the files.

Either way, this sounds like it could be a bit of work...(but IMatch provides many tools to get it done!)
--Tony

DSR

let me clear up a few points raised in the past few responses:

1. "What do you mean by album and shortcut?" 

i meant rather than having copies of files all over the place, replace those copies with shortcuts. Not ideal, but would certainly use much less disk space (important if you're using online backup with slow upload speeds).  Replacing dupes with shortcuts to originals in a timeline might end up being part of this project - for collections of say inanimate objects, certain themes, etc.  Probably not, but i was just wondering if it's common.

---------------------------------------------------------------------------

2. "Because IMatch caches all metadata tags, file names and a lot more in the database. It does not need to access the actual file to perform searches."

I haven't played with IMatch yet, so wasn't sure if it was a DB based system.

3. "Does this mean that there are a lot of duplicates in that folder structure, intentionally created based upon the different combinations of the people the photos show?"

yes...it's a mess.  Using a set of dupe and folder comparison tools, I've shrunk the picture store by 60% and created timelines (e.g. 20121031_Halloween).  While some pics have date taken info, many don't (typically they're screenshots).  And many were taken with cameras where date/time wasn't correct.  And within any one folder in the timeline, there could be hundreds of pics with completely disparate names.

----------------------

Bottom line is that I've cleaned out much of what was there.  I've flattened most of the nested folders, incorporated what I could into timelines, and renamed swathes of files either as date_time.* or date_###.*

Some of the folders, i'll ultimately leave alone (for now, if not forever) because there's just too much to do.

I'll let face recognition do most of the heavy lifting and hope that going forward the user will learn this new religion and organize new stuff correctly.

The only major issue I'm grappling with right now is the boolean search point I first raised. If I have a good feeling that we can do stuff like "bob" & "sally" ONLY, I can take apart and file away all those pics into the timeline (many of those pics are already dupes, so that means deleting them).

I have a few more weeks of cleaning up and then backing up, and then I will play with IMatch.  I had been playing with a different face recognition system, but the search capability isn't robust enough.

Mario

Quote: "i meant rather than having copies of files all over the place, replace those copies with shortcuts."
If you only ever use Windows file system tools, replacing copies of images with links is an option. But a horrible one. I would not even consider such a thing. Doing all this in the file system is the wrong approach.
If this is about cost, check out one of the free DAM systems available.

DAM systems like IMatch are designed to deal with all that.

Quote: "I haven't played with Imatch yet, so wasn't sure if it was a DB based system."

Hm. Use the free 30-day trial version to learn about what IMatch does. This will immediately change how you think about your project. Watch some of the free IMatch tutorial videos: https://www.photools.com/imatch-learning-center/

-- Mario

DSR

plan to once I get the initial cleanup done and backed up.  thanks

axel.hennig

I would go with Alternative 2 from Mario's first post (this describes how I do it).

There is an older forum post dealing with that problem.

It is also described in the help several times: here, here and here.

sinus

Quote from: Mario on May 27, 2023, 04:28:11 PM
"Hm. Use the free 30-day trial version to learn about what IMatch does. This will immediately change how you think about your project. Watch some of the free IMatch tutorial videos: https://www.photools.com/imatch-learning-center/"

I agree completely. I also had such relatives (well, at least two) who had things like what DSR described.
But it was quite easy to solve with IMatch. 
Best wishes from Switzerland! :-)
Markus