Should I write my category data to images in IMatch 5 or IMatch 3?

Started by lanerellis, January 12, 2014, 07:24:44 PM


lanerellis

 Hello fellow IMatch users,

I've been using IMatch since 2007. I have over 80,000 images in my database, and 25,000 categories.

My primary goal for using IMatch has been rather simple — to use categories to record the "who, what, when, where, and why" for each image.

The bulk of my images are genealogy-related. After six years' work I've finally categorized each image within IMatch.

My goal now is to copy the internal-to-IMatch category data into the image files themselves, to have it safely stored there for long after I'm gone in the hope that it will be readable by a variety of software and online services in the future.

Once my image files physically contain the IMatch categories I've assigned them, I plan to upload them to a private area on SmugMug.

My first step, which I wanted to ask about here, is to decide whether to go ahead and copy my image category data into the image file structure using IMatch 3, or wait until IMatch 5 is out of Beta and do it then within IMatch 5?

On the one hand I suppose that I could just use one of the IMatch 3 convert-categories-to-keywords scripts, upload my images to SmugMug, and patiently wait for the release version of IMatch 5 before moving to it.

On the other hand I see that my IMatch 3 database might be an easy one to eventually convert to IMatch 5, and wonder if there could be benefits to copying my image category data into the files' structure all within IMatch 5, rather than doing it in IMatch 3 and then later converting to IMatch 5?

I am trying to be extremely cautious and have been moving very slowly with this process.

Although I haven't tested IMatch 5 on any of my machines yet, I have avidly followed the discussions here on the new forums since it began.

If you were in my situation, what might you do?

I thought it best to ask here in the IMatch 5 forums, as my question involves whether to do what I need to do in IMatch 5 or IMatch 3. My apologies if I've chosen the wrong forum to ask this advice.

Thanks for any input and suggestions!  :)

Best Regards,
Lane R. Ellis

Ferdinand

Something else to think about.  Once IMatch 5 is out of beta, will you want to manage your categories as IMatch Categories or @Keywords?  I assume you understand the difference if you've been following this forum.

If you plan on using @Keywords, then the question is how to migrate your categories there.  I've written a script to do this, and there is a version for both IMatch 3.6 & 5, so it could be done in either version, although personally I think doing it now in IMatch 3.6 makes the most sense.  But test extensively first using duplicates.  Once the migration is done you can remove the categories and manage the information directly through @Keywords.

I think the migration can also be done in IMatch 5 using a metadata template, although I haven't tried this myself.  The same comments about testing with duplicates first still apply, as this can be complex to configure.

In your case it seems to me that @Keywords would make the most sense.  Just be aware that restructuring the hierarchy is more difficult, as it means editing the metadata in the files directly.

If you want to keep managing your categories as IMatch Categories in V5 and write them to the image, as many people did in V3.6, then you'll still need to use either a script or a metadata template.
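For readers wondering what such a script boils down to, here is a minimal sketch, not Ferdinand's script and not an IMatch API. It assumes the category paths have already been exported as plain strings with a `|` separator (an assumption), and it only builds ExifTool arguments using the standard tag names `XMP-lr:HierarchicalSubject`, `XMP-dc:Subject` and `IPTC:Keywords`. The function names and the sample categories are hypothetical.

```python
from typing import Iterable, List, Tuple


def category_to_keywords(category_path: str, separator: str = "|") -> Tuple[str, List[str]]:
    """Split a category path into a hierarchical keyword and its levels.

    The separator is an assumption; use whatever your category export uses.
    """
    levels = [part.strip() for part in category_path.split(separator) if part.strip()]
    return "|".join(levels), levels


def exiftool_keyword_args(category_paths: Iterable[str], separator: str = "|") -> List[str]:
    """Build ExifTool arguments that add keywords for each category."""
    args: List[str] = []
    seen_leaves = set()
    for path in category_paths:
        hierarchical, levels = category_to_keywords(path, separator)
        if not levels:
            continue  # skip empty category names
        # The full path goes into the hierarchical keyword tag.
        args.append(f"-XMP-lr:HierarchicalSubject+={hierarchical}")
        leaf = levels[-1]
        if leaf not in seen_leaves:
            seen_leaves.add(leaf)
            # The leaf name goes into the flat XMP and IPTC keyword lists.
            args.append(f"-XMP-dc:Subject+={leaf}")
            args.append(f"-IPTC:Keywords+={leaf}")
    return args


if __name__ == "__main__":
    # Hypothetical categories for one file; run against a duplicate first.
    cats = ["Who|Example family|Jane Example", "Where|Country|City"]
    print(["exiftool", *exiftool_keyword_args(cats), "duplicate_of_photo.jpg"])
```

The sketch only prints the command, so it can be inspected before anything touches a file, which fits the advice above to test on duplicates first.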

Richard

Hi Lane,

Using IMatch 3.6 I kept everything I knew about a family member in IPTC. I have now decided that was not a good move. It makes personal information easy for others to harvest. When I make the final move to IMatch 5, all that information will be put in Attributes and Categories where it will be easier to keep private. Dates, names, places, and more should not, IMO, be stored with the image file if that image file could be obtained by non-family members.

sinus

Quote from: Richard on January 13, 2014, 02:57:37 AM
Hi Lane,

Using IMatch 3.6 I kept everything I knew about a family member in IPTC. I have now decided that was not a good move. It makes personal information easy for others to harvest. When I make the final move to IMatch 5, all that information will be put in Attributes and Categories where it will easier to keep private. Dates, names, places, and more should not, IMO, be stored with the image file if that image file could be obtained by  non-family members.

Hi Richard
But it was quite a safe move. Having data in IPTC means that the information is in a quite safe place.
If you go with categories or properties, everything is in the IMatch database.

Of course, for people like you who know what they are doing, it is not really an important difference.
Because, BEFORE you deliver images to other people, you can delete the IPTC information from the image, or you can transfer the information from categories and properties into the image, if you want to do that.

It depends on where you send or give your images, and whether the recipients should be able to read what is in the image (persons and so on).

If you plan to offer the information to other people, like Lane does ...

My goal now is to copy the internal-to-IMatch category data into the image files themselves, to have it safely stored there for long after I'm gone in the hope that it will be readable by a variety of software and online services in the future.

Once my image files physically contain the IMatch categories I've assigned them, I plan to upload them to a private area on SmugMug.


... then it is a good thing to have the information inside the image.

But of course you are right: if there is "sensitive" information, then we must be very careful that it does not flow into the image.

This is one reason why I will also use Attributes and Categories in IM5.

Best wishes from Switzerland! :-)
Markus

Richard

Hi Markus,

I can imagine sending an image file to a relative, with that file containing the subject's name, etc. That is fine unless that relative uploads the file to, say, Facebook. I do not want to take that risk, so I will remove all data from my files. It is easier to do that than it is to try to ensure that everyone understands what can happen if they send my file to someone else or post it somewhere.

sinus

Quote from: Richard on January 13, 2014, 12:11:46 PM
Hi Markus,

I can imagine sending an image file to a relative and if that file contains the subject's name, etc.  That is fine unless that relative uploads the file to say Facebook. I do not want to take that risk so I will remove all data from my files. It is easier to do that then it is to try to ensure that everyone understands what can happen if they send my file to someone else or post it somewhere.

That is of course true.
People do not always understand what can happen once a file is on the net.

I think Lane wants to upload images with information to a private section of SmugMug, and should be aware that even private sections are not really safe.

But in your case, I wonder: if you send an image with, say, 5 people in it to your relatives, where do you want to add the names of the persons who are in the image?

If you put them inside the image, then the danger is, as you wrote, that this image ends up on the net.
If you write them in a separate file, the danger is not as high, but of course a relative could also add the names to the file and upload it.

Hence, I guess, every time you let an image out of your hands there is a danger; sometimes it is higher, sometimes it is only a small risk.

Maybe you could add a big warning (with some explanations) to each image? Of course, if you give your images "only" to your relatives, then the risk is lower in the end, because they usually know you and you can give them a phone call or so.

sinus

Quote from: lanerellis on January 12, 2014, 07:24:44 PM
My goal now is to copy the internal-to-IMatch category data into the image files themselves, to have it safely stored there for long after I'm gone in the hope that it will be readable by a variety of software and online services in the future.

Once my image files physically contain the IMatch categories I've assigned them, I plan to upload them to a private area on SmugMug.

My first step, which I wanted to ask about here, is to decide whether to go ahead and copy my image category data into the image file structure using IMatch 3, or wait until IMatch 5 is out of Beta and do it then within IMatch 5?

On the one hand I suppose that I could just use one of the IMatch 3 convert-categories-to-keywords scripts, upload my images to SmugMug, and patiently wait for the release version of IMatch 5 before moving to it.

On the other hand I see that my IMatch 3 database might be an easy one to eventually convert to IMatch 5, and wonder if there could be benefits to copying my image category data into the files' structure all within IMatch 5, rather than doing it in IMatch 3 and then later converting to IMatch 5?

I am trying to be extremely cautious and have been moving very slowly with this process.
Best Regards,
Lane R. Ellis

I am not sure either.
To be honest, asking here is a good step.  :)
And if I were you, I would wait for IM5.
In the meantime, if you have time, you could do some tests.
For example, use a script in IM3 to write your cats into, say, 10 files.
Then add the same 10 files to IM5, assign them the same cats as in IM3, and look at the difference.

I think IM5 has a much stronger focus on XMP than IM3 does.

At least, if you stay as cautious as you have been, you will be on the safe side.

I have an IM3 db with about 180'000 files. To be honest, I do not know exactly what I will do.
I will wait for IM5. After it is out for the public  :)
I will carefully do some tests with converting, with adding the images directly into IM5 (my most important information is in the IPTC), and maybe with other stuff.

At the moment I guess I will finally convert the db into IM5, and then do some quite big "fine-tuning" there.

Mario

Just to mention that for the search engine  ;)

The Batch Processor allows you to selectively copy metadata into the output files - or none at all.

This supports the typical scenarios where a user

- is sending files via email
- prepares files for uploading to public web sites (or harvester sites like Facebook, Picasa etc.) or
- hands over files to clients, friends or other parties

You can produce clean files without any data, or only selected data like some EXIF fields or the common "copyright info only" tags.

Metadata Templates also allow you to selectively remove or replace data in files.

And the ExifTool Command Processor can be used to strip all metadata from a file safely via ExifTool.
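Outside IMatch, the same idea can be sketched directly against the ExifTool command line. This is a minimal illustration, not what the Batch Processor or the ExifTool Command Processor runs internally. It relies on ExifTool's documented `-all=`, `-tagsFromFile @` and `-overwrite_original` options; the function names and file paths are hypothetical, and it assumes you run it only on exported copies, never on originals.

```python
import subprocess
from typing import List, Sequence


def strip_command(copy_path: str, keep_tags: Sequence[str] = ()) -> List[str]:
    """Build an ExifTool command that removes metadata from a copy.

    keep_tags lists tag names (e.g. "Copyright") to copy back after the wipe.
    """
    cmd = ["exiftool", "-all="]            # delete all writable metadata
    if keep_tags:
        cmd += ["-tagsFromFile", "@"]      # re-read tags from the same file
        cmd += [f"-{tag}" for tag in keep_tags]
    cmd += ["-overwrite_original", copy_path]  # no "_original" backup file
    return cmd


def strip_metadata(copy_path: str, keep_tags: Sequence[str] = ()) -> None:
    """Run the command; requires exiftool on the PATH."""
    subprocess.run(strip_command(copy_path, keep_tags), check=True)


if __name__ == "__main__":
    # Hypothetical path to an exported copy, keeping only the copyright tag.
    print(strip_command("export/photo.jpg", keep_tags=["Copyright"]))
```

Printing the command first makes it easy to check what will be removed and what will be kept before running it on a batch.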

Privacy is definitely an issue these days, not only because of the NSA affairs but also because it is so profitable to know everything about everyone and to collect and sell this data. The social networks don't exist for our pleasure!
Data austerity is a valuable principle.
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

BenAW

Quote from: lanerellis on January 12, 2014, 07:24:44 PM
Hello fellow IMatch users,

I've been using IMatch since 2007. I have over 80,000 images in my database, and 25,000 categories.

My primary goal for using IMatch has been rather simple — to use categories to record the "who, what, when, where, and why" for each image.
Just curious about the number of categories. I have the same basic categories as you do, and I have well below a thousand categories. You have about 1 category for every 3 files!

Regarding the writing of categories into the files: I would only consider this when I'm absolutely sure the category structure doesn't change anymore.
I have country code, country, city and location in IPTC, and create a four-level data-driven category from this.
Also data-driven categories for camera and lenses.
The rest is in normal categories. I actually removed the keywords I had in IM3 because my category setup had changed quite a bit.
Maybe later I'll write some selected data into the keywords again.
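As an illustration of the four-level location category described above, here is a minimal sketch of the grouping idea. It is not how IMatch builds data-driven categories; the field names, the `|` separator and the sample values are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List

# Assumed field names, in hierarchy order (country code > country > city > location).
LEVELS = ("country_code", "country", "city", "location")


def location_category(meta: Dict[str, str], unknown: str = "(unknown)") -> str:
    """Build a four-level category path from one file's location fields."""
    return "|".join((meta.get(key) or unknown) for key in LEVELS)


def build_categories(files: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Group file names under the category path derived from their metadata."""
    categories: Dict[str, List[str]] = defaultdict(list)
    for name, meta in files.items():
        categories[location_category(meta)].append(name)
    return dict(categories)


if __name__ == "__main__":
    sample = {
        "a.jpg": {"country_code": "CH", "country": "Switzerland",
                  "city": "Bern", "location": "Old Town"},
        "b.jpg": {"country_code": "CH", "country": "Switzerland", "city": "Bern"},
    }
    for path, names in build_categories(sample).items():
        print(path, names)
```

Because the category is derived from the metadata, changing a field in a file moves it to a different branch without any manual re-assignment.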

Richard

Quote from: sinus on January 13, 2014, 12:28:47 PM
I think, Lane want to upload images with informations to a private section of SmugMug and should be aware, that even private sections are not really safe.
Hi Markus,

It is not so much SmugMug that concerns me. It is what others do with files I have given them access to. If someone adds information to the file later and then makes that file accessible to the public, that is not my fault. I spent years and thousands of dollars gathering family information and assembled it into five binders which I gave to five people who had helped by providing some of the information. Even though it was copyrighted a cousin took it upon herself to make a copy and give it to my brother who had not provided one bit of data on his own family. The point being that it is darn hard to control data once it leaves my control so I will do my part by limiting what goes into a file.

sinus

Quote from: Richard on January 13, 2014, 02:52:42 PM
I spent years and thousands of dollars gathering family information and assembled it into five binders which I gave to five people who had helped by providing some of the information. Even though it was copyrighted a cousin took it upon herself to make a copy and give it to my brother who had not provided one bit of data on his own family. The point being that it is darn hard to control data once it leaves my control so I will do my part by limiting what goes into a file.

That is very sad. Sometimes they simply do not know that information (images or words) can be private and should not be given to other people.
Or they do not want to know, or simply ignore it. Oh, and of course there are people out there who know exactly what they are doing, but want to harm someone.

To avoid such things, I used to write in BIG, very big letters that an image or piece of information is PRIVATE ONLY and must not be given to other people, or even shown to them.

No chance: there are always some, sorry, stupid people who are not interested in such warnings.
Hence what you wrote is wise: limit the material we give away, and limit the people we give information to.

lanerellis

Thanks to you all for the helpful ideas and insight on whether I should convert my categories into the image file structure using IMatch 3 or wait and do it within the eventual release version of IMatch 5.

This is my primary issue. As a one-time professional genealogist I've already dealt with and implemented my own systems for dealing with data privacy both online and off. I'll address each issue separately here.

CATEGORIES TO FILE DATA ISSUE

One choice may be to stay with IMatch 3, as for my purposes it does what I need; however, I recognize that doing so would mean working with essentially an outdated and abandoned product, so I'm planning to buy IMatch 5 and make the switch. I love IMatch and want to support Mario and the community.

Perhaps I'll try to keep my workflow in IMatch 5 as similar as possible to what I do in IMatch 3: Assigning every image categories for the who, what, when, where, why, and so forth.

I know that IMatch 5 has an astounding array of options when it comes to recording information about images, so I'll continue to study and weigh the advantages of the various methods. Mario, thanks for mentioning the variety of IMatch 5 batch processing options and metadata templates.

Since I've been using IMatch categories for seven years, I have a strong preference for sticking to this method and using the power of IMatch 5 to write that information into my images' file structure.

I will have to study whether a move from the familiar-to-me category system to a keywords system would be worth it, or perhaps be unnecessary for my needs. It sounds, however, as though IMatch 5 offers strong incentives for managing image data through the new @Keywords and XMP system.

Ferdinand, thanks for your input on this matter. Yes, perhaps I'll need to use either a script or a metadata template to write my categories to my images in IMatch 5, assuming I don't decide to do it within IMatch 3.

Sinus, thanks for your good advice and thoughts on the matter too. I will probably take your advice and wait for the release version of IMatch 5 before migrating and finally writing my category data to my files, and I like your idea -- and Mario's longstanding recommendation -- to thoroughly test various options using a small subset of images. My hope is that IMatch 5 will make it relatively easy for folks like me to still use categories and at the same time have the program take care of XMP compatibility issues.

GENEALOGY & DATA PRIVACY ISSUES

As I mentioned, I already have my own systems in place that I've used since I began my genealogy database in 1994.

Richard, thanks for your thoughts. You may remember that 12 years ago I switched my genealogy database program to The Master Genealogist, and it was through you that I discovered IMatch in 2007. I've pretty much given up on any hope that TMG will ever incorporate cutting-edge image information management standards, so anything TMG programmer and CEO Bob Velke decides to add in this area will all be a bonus to me. I still love TMG and plan to use it for the rest of my life, however, just like IMatch. :-)

I work in social media for a living, as the lead editor for one of the world's top technology conferences, and my concern for genealogy data has for the past decade or so been more about how to perpetuate and preserve the research I've done once I am gone. On the digital image side of things, I cringe at the thought of someone's lifetime work of identifying and cataloging photos and scanned documents being locked away into an IMatch or any other image database system, with no family member, estate personal representative, or friend able to know how to access or pull out the data, which is why my efforts are towards writing all I can into image file structures. Keeping those images safe is another matter.

I've had a stripped-down version of my 20,000-plus person genealogy database publicly online since 2002 (using John Cardinal's Second Site software, RootsWeb, Ancestry.com, FamilySearch, and others), and of course it's a double-edged sword -- my lovingly researched data has been scraped countless times and re-shared as the work of others, but I've also made an astounding number of important new family connections from having my research online.

For my images, I don't plan on making them public -- although I must say that this is a growing trend among younger people and that the crowd-sourcing aspects of doing so are at times enticing -- and instead plan to go with private password-protected SmugMug cloud storage. I've been managing online communities since I first opened a computer bulletin board system 30 years ago in 1984, having worked in the technology industry since then, including many years as a Web engineer managing Web servers, so I'm aware of the limited security of services such as SmugMug, but I think the advantages to me outweigh not putting a copy of my images online in private galleries. Mainly these galleries will be for me a form of cloud backup, but I would also like to have the data I've recorded about each image available to share with those interested family members to whom I give access.

Over the past several years I've studied how social media websites handle IPTC, EXIF, and XMP data -- an ever-changing situation, unfortunately. IPTC.org's March 2013 study offers a glimpse (see http://www.embeddedmetadata.org/social-media-test-results.php ), although somewhat outdated by now. You may be interested to know that Facebook has traditionally stripped out metadata. The main battle for some of us has been to get social media sites to preserve our metadata, or ideally let us decide what to include.

I've shared several thousand family photos publicly online over the years -- ones I've tracked down and scanned personally from aged relatives across the country -- and although initially I have to sigh a bit when others share them uncredited, this practice has brought about some amazing additions to my photo archive, as folks have seen what I have and offered up their own treasured photos to me.

Richard, I was a bit apprehensive initially choosing to record my image information in IMatch using categories, because you were using IPTC and I thought that maybe I should be following your method at the time.

I'm glad that IMatch -- especially IMatch 5 -- makes it relatively simple to decide how much image information, if any, to include with images we wish to share with others.

BenAW, you asked about my 25,000 categories for 80,000 images. When I began using IMatch I believe that I used a script or other import routine to pull in the names of all of the 20,000 or so people in my genealogy database, so only 5,000 or so of my categories are not the names of my relatives. It has been helpful to rarely have to manually enter the name of a relative in a photo in my collection, but it may have been overkill, as I don't have photos for perhaps 80 percent of my 20,000 or so relatives, at least yet. :-)

Thanks again for the discussion, and I welcome any further input and advice.

Cheers,
Lane R. Ellis
Lead Editor, Pubcon
@lanerellis Twitter

Mario

A matter to consider when deciding between using in-file keywords and categories is performance.

IMatch can assign 1,000 files to a regular category in about one second. IMatch can move 1,000 files from one regular category to another in about the same time. Moving a regular category from one parent to another parent takes only a few seconds. All these are highly optimized in-database operations.

Assigning 1,000 files to a @Keyword category not only assigns the files to the category, but also has to add the name of the category as a keyword to 1,000 files!

This means loading the hierarchical keywords for 1,000 files from the database. For each file, check if the new keyword already exists, and if not, add it. Copy the new keyword also to the flat dc:Subject keywords and IPTC keywords (depending on the file format and metadata settings). While doing this, apply the flattening rules set by the user under Edit > Preferences > Metadata. Finally, write the keyword data for 1,000 files back into the database. Invalidate all potentially affected data-driven categories. Invalidate several collections which may be affected. Update the file history data for 1,000 files.

And of course, the updated metadata has to be written back to the image file or an XMP sidecar file afterwards. And this is a rather slow process, depending on the file format, the location of the files, etc. Updating IPTC and XMP data. Re-calculating the digest data, setting timestamps etc. to produce high-quality metadata. Re-load the resulting changes in the file back into the database...
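To make the sequence of steps more concrete, here is a toy sketch of the per-file work described above: check whether the hierarchical keyword exists, add it, derive the flat keywords under a flattening rule, and flag the file for write-back. It is purely illustrative and not IMatch's implementation; the record layout, the `leaf_only` rule and all names are assumptions.

```python
from typing import Dict, List


def flatten(hierarchical: str, leaf_only: bool = True, sep: str = "|") -> List[str]:
    """Derive flat keywords from a hierarchical keyword.

    leaf_only=True keeps only the last level; False keeps every level.
    """
    levels = hierarchical.split(sep)
    return [levels[-1]] if leaf_only else levels


def assign_keyword(record: Dict, keyword: str, leaf_only: bool = True) -> bool:
    """Add a hierarchical keyword to one file record.

    Returns True if the record changed and needs a metadata write-back.
    """
    if keyword in record["hierarchical"]:
        return False  # already present, nothing to do
    record["hierarchical"].append(keyword)
    for flat in flatten(keyword, leaf_only):
        if flat not in record["flat"]:
            record["flat"].append(flat)
    record["pending_write_back"] = True
    return True


if __name__ == "__main__":
    # 1,000 hypothetical file records, each starting without keywords.
    records = [{"hierarchical": [], "flat": [], "pending_write_back": False}
               for _ in range(1000)]
    changed = sum(assign_keyword(r, "Who|Example family|Jane Example") for r in records)
    print(changed, "records now wait for write-back")
```

Even in this stripped-down form, every file is touched individually and ends up flagged for a file write, whereas assigning a regular category is a single in-database relation.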

Although IMatch is fast doing all these operations, using @Keywords is much, much slower than normal categories.

IMatch 5 users have the choice. I cannot say if using regular categories is sufficient, or if keywords are needed. Or if only some data should go into keywords, and other data is better left to regular categories. As a rule of thumb: If you plan to change the data often, regular IMatch categories are the better choice. You can copy/paste the files to a category under @Keywords later if needed.

IMatch has large user groups who never use in-file metadata in their normal workflow. They use only IMatch categories and Attributes (or properties in IMatch 3). This keeps the data safe inside the database and is the fastest possible workflow. These user groups welcomed the new Attribute system in IMatch 5 with a very big cheer  :)

If they really need metadata in a file for some reason, they run a script which copies exactly the data they want to a copy of the file. In IMatch 5, they don't need the script anymore. They just run a Metadata Template on the files produced by the Batch Processor.

Needless to say that IMatch has many ways to export all the data in your database, even to 'standard' formats like XML or CSV. Or via a script into every format you can imagine or need.
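As a rough picture of what a CSV export of category assignments might look like, here is a small sketch. It does not use any IMatch export function; it assumes the file-to-category assignments are already available as a plain dictionary, and the column names and sample values are hypothetical.

```python
import csv
import io
from typing import Dict, Iterable


def export_csv(assignments: Dict[str, Iterable[str]]) -> str:
    """Write one 'file,category' row per assignment and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["file", "category"])
    for file_name in sorted(assignments):
        for category in sorted(assignments[file_name]):
            writer.writerow([file_name, category])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = {
        "scan_0001.jpg": ["Who|Example family|Jane Example", "When|1920s"],
        "scan_0002.jpg": ["Where|Country|City"],
    }
    print(export_csv(sample), end="")
```

A flat file like this can be opened by a spreadsheet or any future tool, which is the point of exporting to a 'standard' format.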

sinus

Quote from: lanerellis on January 14, 2014, 07:00:47 AM
For my images, I don't plan on making them public -- although I must say that this is a growing trend among younger people and that the crowd-sourcing aspects of doing so are at times enticing -- and instead plan to go with private password-protected SmugMug cloud storage.
Thanks again for the discussion, and I welcome any further input and advice.
Cheers,
Lane R. Ellis
Lead Editor, Pubcon
@lanerellis Twitter

Hi Lane
Thanks a lot for your interesting post. Making images and text public is a trend, I guess, yes. In the case of genealogy, I think that could really be an interesting step to take, because you get more information.
My brother does a kind of genealogy, but only privately, and I see how difficult it is to get some information. The net is great, with big advantages, but of course also big dangers.

sinus

Quote from: Mario on January 14, 2014, 08:37:08 AM
A matter to consider when deciding between using in-file keywords and categories is performance.


Ah, yes. That's a point!
Finally, if I got it right, Lane could handle IM5 in almost the same way as she/he (?) does IM3. So your performance points would really point towards going with categories only.
If necessary, we can use the power of IM5 (including scripts) to simply add these cats, or parts of them, to the files.

BenAW

Quote from: lanerellis on January 14, 2014, 07:00:47 AM
BenAW, you asked about my 25,000 categories for 80,000 images. When I began using IMatch I believe that I used a script or other import routine to pull in the names of all of the 20,000 or so people in my genealogy database, so only 5,000 or so of my categories are not the names of my relatives.
That explains it  ;D
I'm looking at a way to have the names of the people in an image in the metadata, in such a way that I can create a custom data-driven category from that info.
The XMP field "Person in image" seems like the right place to do that. article
This makes it possible to create a hierarchy under Who that you can set up to your own liking, without cluttering the @keywords hierarchy.

Richard

Quote from: lanerellis
You may remember that 12 years ago I switched my genealogy database program to The Master Genealogist, and it was through you that I discovered IMatch in 2007.
Hi Lane,

I wish that I could say that I remember but I have trouble remembering when I did things. Something I have hoped for is that someone with programming skills would develop a means of integrating IMatch and TMG so that a TMG user would have the power of IMatch for handling family related files. With IMatch 3 it would have been just a script but IMatch 5 opens the door to even more possibilities.

lanerellis

 Hello again IMatch folks,

I'm still hemming and hawing as to whether I'll stick with using categories or switch to using @Keywords.

The way I've always worked in IMatch 3 is to assign new images to categories, after which I never seem to make any changes. I organize my files this way and then for me they are ostensibly done.

I've read the guides, help files, and forum posts that compare working using categories with using @Keywords, and I still haven't decided which will best suit me.

My main concern over which method to use is about performance.

In 2012 I built a very fast computer system geared to video and graphics work, so I'm curious whether I'd even notice much of a performance difference in my workflow. My system consists of:

Intel Core i7-2600K Sandy Bridge 3.4GHz LGA 1155 95W Quad-Core CPU
ASUS MAXIMUS IV EXTREME-Z USB 3.0 motherboard
EVGA NVIDIA GeForce 1.2GB 320-bit GTX570 (Fermi) video card
16GB G.SKILL RIPJAWSZ (4x4GB) DDR3 1600 PC3-12800 RAM
Crucial M4 2.5" 128GB SATA III SSD boot drive
2TB Western Digital Caviar Black 7200 RPM 64MB Cache SATA 6.0Gb/s hard drive
2nd 2TB Western Digital Caviar Black 7200 RPM 64MB Cache SATA 6.0Gb/s hard drive
Pioneer BDR-206DBKS SATA Blu-ray Burner
COOLER MASTER HAF 932 Advanced Blue Edition full tower case w/USB 3.0
SILVERSTONE 750W PLUS SILVER Certified Modular Power Supply
Koutech Multi-in-1 USB 3.0 Multifunction Front Panel Card Reader w/eSATA/HD audio
Windows 7 Professional 64-bit

Given that I tend to set the who, what, when, where, and why of an image just once and then leave it alone, maybe working with @Keywords would be best for me. If I were to switch to using @Keywords, I imagine that initially it would take a solid overnight batch run to write my existing category information into IPTC fields for all of my 90,000 or so images; however, after that I'd only be adding between 1 and 1,000 new images at a time, and usually around an average of 60 a day.

Writing @Keywords information to my image files for only 60 or so files shouldn't cause much of a performance issue to worry about on my fast system, and I'm a rather patient person anyhow, all of which makes me wonder whether @Keywords is the way to go for me.
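A back-of-the-envelope estimate can put numbers on this reasoning. The sketch below is plain arithmetic; the per-file write time of 0.5 seconds is an invented placeholder, not a measured IMatch or ExifTool figure, and should be replaced with a value timed on a small test batch.

```python
def write_back_hours(file_count: int, seconds_per_file: float) -> float:
    """Estimate the total write-back time in hours for a batch of files."""
    return file_count * seconds_per_file / 3600.0


if __name__ == "__main__":
    ASSUMED_SECONDS_PER_FILE = 0.5  # placeholder; measure on your own system

    initial = write_back_hours(90_000, ASSUMED_SECONDS_PER_FILE)
    daily = write_back_hours(60, ASSUMED_SECONDS_PER_FILE) * 3600.0

    print(f"Initial run of 90,000 files: about {initial:.1f} hours")
    print(f"Typical day of 60 files: about {daily:.0f} seconds")
```

Under that placeholder the one-time run is an overnight job and the daily batch is negligible; the conclusion only changes if the real per-file time turns out to be much larger.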

Mario has noted in this thread that, "As a rule of thumb: If you plan to change the data often, regular IMatch categories are the better choice." I don't plan to change the data often, but perhaps regular IMatch categories would still be the best option for me, since I'm used to using them in IMatch 3.6, and usually prefer using the fastest performance methods possible, which points to using categories.

As I've mentioned in this thread, however, I do want most of my image information written to IPTC in my files for purposes of leaving a digital legacy in which future family history researchers will be able to have access to what I know about each image without having to use any proprietary software -- also why I'll be uploading my files to password-protected directories on SmugMug, in addition to private cloud-based backups as well as multiple local backups.

I need to decide whether to use @Keywords and take a performance hit, or stick with categories and only write IPTC data to my images when needed, such as once for all my existing images to be uploaded to SmugMug, and then with every batch of new images I acquire. Maybe that @Keywords performance hit wouldn't be great enough to worry about on my system, but I don't know yet.

I'm curious whether others of you with databases over 90,000 images have decided to use IMatch 5 @Keywords or categories. Has there been a poll yet asking whether IMatch 5 users are working with categories, @Keywords, or a combination of both?

I suppose I should do some performance testing on a sample database, but from what I've read I'm leaning towards going with a @Keywords-based workflow.

It's a very exciting time to be an IMatch user, and I'm so thankful to Mario for this stellar product, and to all the helpful people who share information on the forums.

Thanks again for the discussion, and I welcome any further input and advice.

Cheers,
Lane R. Ellis
Lead Editor, Pubcon
@lanerellis Twitter

Mario

Quote from: lanerellis
I'm still hemming and hawing as to whether I'll stick with using categories or switch to using @Keywords.

There are some simple things to consider:

1. If you want to make your "categories" available in other applications, web services, photo upload sites, use "real" keywords and let IMatch mirror the keywords you add to your files in @Keywords automatically.

2. If you want to keep information private, use categories outside the special @Keywords hierarchy, or just don't add keywords to your files.

3. Use a mix of both. IMatch is flexible, so you can be, too.

Performance:

Native IMatch categories are unbeatably fast. Fact.
If you add keywords to your files in the Keyword Panel (the primary feature for working with keywords), IMatch writes this data into the database and then adds/removes matching child categories under @Keywords. This causes a small overhead, but not much.

The really slow part when using keywords instead of native categories is that IMatch has to write the keywords back to the file at some point. This can take some time, especially if your files are large or on a slow medium. ExifTool uses a "better safe than sorry" approach when writing data, which is a tad slower but much safer. If you edit metadata anyway, e.g. adding captions or descriptions, there will be no performance decrease from working with real keywords. Whether IMatch has to write 5 tags or 50 makes no difference during the write-back.

Since IMatch allows you to control when you write back metadata, you can write back a large number of files when you have something else to do. A large number here means hundreds of thousands of files.
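
To illustrate why the number of tags matters so little, here is a minimal Python sketch that gathers all pending changes for one file into a single ExifTool command line. It is only an illustration and not a description of how IMatch drives ExifTool internally; the helper name build_exiftool_args and the sample file name are hypothetical. The tag names XMP-dc:Subject and XMP-dc:Description and the "+=" list syntax are ordinary ExifTool usage.

```python
from typing import List


def build_exiftool_args(path: str, keywords: List[str], caption: str = "") -> List[str]:
    """Collect every pending change for one file into a single ExifTool call.

    Hypothetical helper for illustration; it only builds the argument list
    and does not run ExifTool.
    """
    args = ["exiftool"]
    for keyword in keywords:
        # "+=" adds an item to a list-type tag instead of replacing the list.
        args.append(f"-XMP-dc:Subject+={keyword}")
    if caption:
        args.append(f"-XMP-dc:Description={caption}")
    # The file is named once, so it is rewritten once however many tags precede it.
    args.append(path)
    return args


# Hypothetical example: two keywords and a caption for one scanned image.
example_args = build_exiftool_args("scan_0001.jpg", ["Fargo", "North Dakota"], "Family reunion")
print(" ".join(example_args))
```

The point of the sketch is that the file appears once in the command, so the cost is dominated by rewriting the file, not by how many tags are set.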

I personally use a mix of keywords and categories. I maintain my hierarchical keywords in the IMatch thesaurus and use these in the Keyword Panel. Things like projects, clients, family relations for personal photos etc. are managed using a set of 11 or 12 native categories, plus child categories. This allows me to control which information is present in my image files and what can be seen from the outside.

The beauty with IMatch 5 is that it does not force you to use one or the other method. You can mix them, and change in mid-race if you want.
I suggest you create a test database, add copies of some of your folders and try things out. Works pretty well.

One of my local testers (the ones I can talk to in person) has just cleaned up the keyword mess he accumulated over the past decade and multiple RAW processors and image editors. He decided that IMatch 5 is good for production and that's that.

He added all live files to IMatch and used the Keyword Panel and a Thesaurus to add/remove/swap keywords in his files as needed. He also used a couple of metadata templates to clean up other data (a very powerful aspect of metadata templates which is often overlooked). The entire "project" was done in less than four hours, including learning more about the Keyword Panel and metadata templates. 4 hours, about 20,000 files cleaned up. Add a couple of hours for writing back all files and problem's solved. Another IMatch 5 license sold, I'm confident  ;)
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com

sinus

I cannot beat Mario's detailed answer. Since performance is not THAT important for me (though my computer is not half as good as yours), I go the @Keywords route.

In your case, from what you wrote, I would go this way: "...or stick with categories and only write IPTC data to my images when needed"

But it is up to you. I am sure you will decide on the right solution for YOU.
Best wishes from Switzerland! :-)
Markus

JohnZeman

Quote from: lanerellis on April 10, 2014, 03:30:33 AM
I'm curious whether others of you with databases over 90,000 images have decided to use IMatch 5 @Keywords or categories. Has there been a poll yet asking whether IMatch 5 users are working with categories, @Keywords, or a combination of both?

Not 90,000+ images, but I'm just about at 50,000 images now and I've been successfully using @keywords instead of standard categories for months.  Like you, I don't need to change the metadata often, so it's not a problem for me to write the new changes to file occasionally. I tend to do that at least once a day just out of habit.

Besides @keywords, I have a lot of other data-driven and formula categories and IMatch 5 has no performance problem managing them all.  The only standard categories I have are temporary ones I use for testing during the beta period.

sinus

I am not working with my "real database" in IM5 yet (only testing), but I have decided to work with a mix: a lot of @Keywords, plus a kind of step-by-step guideline built from "normal categories".
Best wishes from Switzerland! :-)
Markus

lanerellis


Hi Folks,

Thanks so much Mario, sinus, and JohnZeman for your helpful advice.

Mulling over what has been written about using the @Keywords feature, especially some of what Mario has shared, I decided I'd like to use it or at least try it and see how I like it.

Last night I posted a bit about the issue of taking my IMatch 3.6 categories and letting @Keywords do the work of writing to files for me, in the "Re: Categories to Hierarchical Keywords Script" thread ( https://www.photools.com/community/index.php?topic=312.0 ), but I'm moving my reply here since I think it is less related to Ferdinand's script and more a general IMatch discussion topic.

I've been impressed with Mario's high regard for the @Keywords feature, including in the thread I mentioned where he wrote, "The @Keywords category has been created to combine XMP hierarchical keywords and IMatch categories. This is the way to go if you want to use categories like keywords and you want to exchange that data seamlessly with other XMP applications which handle hierarchical keywords. Use it. Don't work against it," and "[...] use @Keywords. It's automatic and easy."

Yesterday I'd decided that my computer is fast enough to make up for the performance hit using @Keywords creates compared to just internal categories, and since one of my main goals is to take the 90,000+ images I've put into 25,000+ IMatch categories over the past seven years and upload them to password-protected galleries on SmugMug, I thought I'd go ahead and convert my 3.6 database to 5 to start using @Keywords.

From discussions in the other thread it seems that my hopes of @Keywords being able to make use of my existing IMatch 3.6 categories easily were premature, and it looks like the only way a 3.6 user who has used the excellent categories features can now move to using the powerful IMatch 5 @Keywords feature is through extremely complicated third-party scripts or perhaps through the program's metadata templates.

Ferdinand noted that "metadata templates are a more elegant way to write categories to @keywords," but also that with this approach it can be "hard to avoid multiple rounds of writebacks."

Having spent seven years getting all my images categorized in IMatch 3 and having recently subscribed to SmugMug, my hope has been to simply write every category I've assigned to each of my images to the files via the IPTC keywords field, and I thought such a task would surely be easy in IMatch 5. As I've mentioned, I wasn't sure whether it would be best to do this in 3.6 or in 5, but with all of the excitement about the @Keywords feature of 5, I thought I'd move to 5 and try it, only to discover that there's apparently no easy way to do so.

I'm a bit frustrated that writing the categories I've assigned my files to into the IPTC keywords field isn't something I can do with IMatch 5, unless I first use either a script like Ferdinand's or IMatch 5's metadata templates to write my categories to my files before the @Keywords feature will see my already-existing categories and work for me.

I suppose I'll now have to study Ferdinand's script more carefully (I've followed it over the years in IMatch 3) and debate whether it or IMatch 5's metadata templates would be the better choice in my situation.

I think my needs are rather straightforward: for every image in my database write every category that I've assigned the image to into the IPTC keywords field.

I know there are nuanced concerns about such data moves, however, based on the different ways we all use and assign categories in IMatch.

My category structure uses the following main category trees:

What
When
Where
Who
Why / Occasion
Actions
Documents Scanned
Emotions
Photographer
Rating
Source
To Sort

Is there an easy brute-force method to just write every image's categories to an IPTC field? For my purposes of uploading to private SmugMug galleries as an extra form of backup and in some cases to share with my close family members, I don't mind if there may be more category information than might be ideal written to the IPTC keywords field -- at this point I'd simply like to get my files uploaded to SmugMug and then, later on, work on a pristine and perfect mapping of just the categories or portions of them I'd ideally want.
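
As a rough picture of what such a brute-force pass amounts to, here is a minimal Python sketch. It assumes the category assignments have already been exported from the database as plain data (image name mapped to category paths); it does not use any IMatch scripting interface. The function name categories_to_keywords, the "|" separator, and the sample data are all hypothetical.

```python
from typing import Dict, Iterable, List


def categories_to_keywords(
    assignments: Dict[str, Iterable[str]],
    skip_trees: Iterable[str] = (),
    separator: str = "|",
) -> Dict[str, List[str]]:
    """Map each image to a sorted, de-duplicated list of its category paths.

    `assignments` is hypothetical export data: image file name -> category
    paths. Paths whose top-level tree is listed in `skip_trees` are left out.
    """
    skipped = set(skip_trees)
    keywords_by_image: Dict[str, List[str]] = {}
    for image, paths in assignments.items():
        kept = set()
        for path in paths:
            cleaned = path.strip()
            if not cleaned:
                continue
            top_level = cleaned.split(separator)[0].strip()
            if top_level not in skipped:
                kept.add(cleaned)
        keywords_by_image[image] = sorted(kept)
    return keywords_by_image


# Hypothetical sample: one image with a duplicate path and a "To Sort" entry.
sample = {
    "scan_0001.jpg": [
        "Who|Example Family",
        "Where|USA|North Dakota",
        "To Sort",
        "Where|USA|North Dakota",
    ],
}
print(categories_to_keywords(sample, skip_trees=["To Sort"]))
```

Leaving skip_trees empty gives the "write everything" behaviour; naming a tree such as "To Sort" is one way a later, more selective mapping could start.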

Perhaps I'll reconsider the use of @Keywords and just stick to my tried-and-true internal category workflow, and come up with a way to use Ferdinand's script or IMatch 5's metadata templates feature to every so often write my categories to IPTC fields. I sure like the idea of @Keywords, but had hoped it could be easily populated with my existing IMatch 3.6 categories.

Thanks again to all of you for your excellent input and advice.  :)

Cheers,
Lane

JohnZeman

Quote from: lanerellis on April 13, 2014, 02:10:04 AM
From discussions in the other thread it seems that my hopes of @Keywords being able to make use of my existing IMatch 3.6 categories easily were premature, and it looks like the only way a 3.6 user who has used the excellent categories features can now move to using the powerful IMatch 5 @Keywords feature is through extremely complicated third-party scripts or perhaps through the program's metadata templates.

There is a much simpler way.

1.  Left click the parent category of your regular IMatch categories, then right click > clipboard > copy category
2.  Left click @keywords to select it then create a subcategory under it.
3.  Left click that newly created @keywords subcategory then right click and choose clipboard > paste category.

Clean up by dragging and dropping as necessary, then once you're satisfied, delete your regular categories.

Richard

Hi Lane,

In addition to the Keyword field there are about two dozen other fields in IPTC and I make use of all of them. An entry in keywords like: Location|USA|North Dakota|Fargo should make sense to a reader. But the Caption and Special Instructions fields allow 2,000 characters each so I can't picture the information in those fields in Keywords. All of the IPTC fields are in XMP but I don't believe that I want all of it in XMP. I intend to use Categories and Attributes to store data and I hope that, as needed, I will be able to add information to XMP from Categories and Attributes.

Something else that I will do is bring all of my documentation into IMatch so that I can assign say PDF or TXT files to a person's category. If something can be digitized it will likely end up being managed in IMatch. I know that I can associate things many ways using Categories but I don't yet know that @Keyword categories can do all that I want. I fear that @Keyword categories would limit me in some areas but really help me in others. Not all XMP applications handle hierarchical keywords so I need more than Keyword categories. It is going to take me awhile to decide what I can do, what I want to do, and how to make it all happen the best way in IMatch 5. Once I have everything the way I want it in IMatch, only then will I again concern myself with how to share my work with others. I already know that to get the most out of my genealogy files one should buy an IMatch license.
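
For applications that do not read hierarchical keywords, an entry such as Location|USA|North Dakota|Fargo could be broken into its individual levels and written as flat keywords. A minimal Python sketch of that split follows; the function name split_hierarchical_keyword is hypothetical, and the "|" separator is taken from the example entry above.

```python
from typing import List


def split_hierarchical_keyword(keyword: str, separator: str = "|") -> List[str]:
    """Return the individual levels of a hierarchical keyword, top level first."""
    return [level.strip() for level in keyword.split(separator) if level.strip()]


levels = split_hierarchical_keyword("Location|USA|North Dakota|Fargo")
leaf = levels[-1]  # the most specific level
print(levels, leaf)
```

Whether to write every level or only the leaf as a flat keyword is a matter of taste; writing every level makes broader searches (for example on the state name) work in software that ignores the hierarchy.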

Richard



Quote from: JohnZeman
There is a much simpler way.

Hi John,

I can't picture myself doing your steps for 25,000+ categories.  :'(

JohnZeman

Howdy Richard. 
I did it for over 45,000 images in about 2500 categories, no problem.  As I recall it only took a few seconds if even that.  What did take quite a while was writing all those changes to the individual files.

Richard

Hi John,

I was thinking one category at a time.  :-[
I can see how it could be easy if a user has just one or a few parent categories.

DigPeter

Hi Lane,
It is really not difficult to use Ferdinand's script to write categories in IM3 to XMP hierarchical keywords.  I have done it and I am computer illiterate.  You just have to do your homework first, which you are probably going to have to do anyway when setting up the IM5 database.  Decide which of your existing categories you want as @Keywords, then invoke the script.  The conversion process will take time, but with your mighty machine, not too long.  But I would urge anyone who goes this route to combine it with the thesaurus.  If you do all this before setting up in IM5, I am sure it will be less trouble than doing it within IM5 by any of the alternative methods.