RFC: Quality Management...

Started by Mario, July 15, 2015, 10:49:04 AM


Mario

Do you perform some level of quality management for the digital assets you manage in IMatch?

For example, do you check for missing/incomplete metadata? Or missing/incomplete/insufficient keywords? Minimum resolution? Availability of versions? Unnecessary duplicates (e.g. versions of a master in too many resolutions)?

Such requirements are not uncommon in professional DAM scenarios and can be (and have been) implemented as an App or a script in IMatch easily enough. But I wonder whether 'home' users also ensure a minimum quality of their assets and, if so, which tools or features in IMatch you use.



jelvers

Actually no. But looking at your list, I really wonder whether I should do that once in a while. It seems to be more than "nice to have".

Regards, Juergen

herman

I don't use keywords, only categories.

Your 'File verifier' script is an essential tool for me.

I use some 'Formula categories' to show files with incomplete categorization as well as orphaned versions.
Currently I am looking into a way to show files which do have a raw processor sidecar but no version (meaning that I prepared something in the raw converter but did not yet export the version from the raw converter).


Enjoy!

Herman.

Mario

Quote: Currently I am looking into a way to show files which do have a raw processor sidecar but no version
Can be done using a script, e.g. via the Relation* classes. These give you access to versions, buddy files etc (including the XMP buddy file) and you can thus easily find out which files have no XMP file yet.

It sounds a bit error-prone to me when you have a workflow which requires you to manually export metadata from some other application. This may cause data loss when you also add / edit (probably outdated) metadata in IMatch and then cannot synch it with the other application properly. If you work with multiple applications which manipulate and rely on XMP data, it is paramount to synch changes to the XMP file on disk automatically. I'm sure you have reasons for your workflow, though.

JohnZeman

I use formula categories to ensure I have the basic categories assigned to each image. Since I use the basic 5 W system (Who, what, where, when, and why), every photo in my database must have at least a when and a where assigned, because every photo was taken in some location at some specific time. Screenshot below.


[attachment deleted by admin]
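
The rule described above (every file needs at least a "When" and a "Where") can be sketched outside IMatch as a simple set check. This is only an illustration in Python, not an IMatch formula category: the `assignments` mapping, the file names and the `missing_required` helper are all hypothetical, and a real check would read the category assignments from the database.

```python
# Assumption: top-level category names follow the 5 W scheme from the post.
REQUIRED = {"When", "Where"}


def missing_required(assignments, required=REQUIRED):
    """Return {file name: sorted list of required categories it lacks}."""
    report = {}
    for name, categories in assignments.items():
        missing = set(required) - set(categories)
        if missing:
            report[name] = sorted(missing)
    return report


# Hypothetical sample data: file name -> top-level categories assigned to it.
assignments = {
    "IMG_0001.jpg": {"Who", "When", "Where"},
    "IMG_0002.jpg": {"What", "When"},
    "IMG_0003.jpg": set(),
}

report = missing_required(assignments)
for name, missing in report.items():
    print(f"{name}: missing {', '.join(missing)}")
```

Files that do not appear in the report satisfy the rule; everything else is a candidate for the "incomplete" bucket.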

herman

Quote from: Mario on July 15, 2015, 02:15:14 PM
it sounds a bit error-prone to me when you have a workflow which requires you to manually export metadata from some other application.
That is not what I am doing.
Some raw converters (e.g. DxO, ASP) store their 'processing parameters' in a sidecar file.
I just want to check the existence of such a sidecar for the out-of-camera-original and, if the sidecar exists, there should be a version of this out-of-camera-original image.
If the version does not exist it is an indication that my post-processing is not finished.
Enjoy!

Herman.
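
The check Herman describes (a sidecar exists for the original, so a version should exist too) can be approximated with a plain file-name comparison. The following Python sketch does not use the IMatch scripting API or its Relation* classes. It rests on three assumptions that are not from the thread: the sidecar is named like the original plus a `.dop` extension, versions share the original's base name followed by `_`, and the listed raw and version extensions are the ones in use. Adjust all three to your own naming rules.

```python
from pathlib import PurePath

# Assumptions (hypothetical, adapt to your setup):
RAW_EXTS = {".cr2", ".nef", ".arw"}   # out-of-camera originals
SIDECAR_EXT = ".dop"                  # sidecar is "<original name>.dop"
VERSION_EXTS = {".jpg", ".tif"}       # exported versions


def unfinished_originals(names):
    """Return originals that have a sidecar but no exported version."""
    paths = [PurePath(n) for n in names]
    lower_names = {p.name.lower() for p in paths}
    result = []
    for p in paths:
        if p.suffix.lower() not in RAW_EXTS:
            continue
        stem = p.stem.lower()
        has_sidecar = (p.name + SIDECAR_EXT).lower() in lower_names
        # A version is "<stem>.jpg" or "<stem>_<anything>.jpg" (assumed rule).
        has_version = any(
            q.suffix.lower() in VERSION_EXTS
            and (q.stem.lower() == stem or q.stem.lower().startswith(stem + "_"))
            for q in paths
        )
        if has_sidecar and not has_version:
            result.append(p.name)
    return sorted(result)


# Hypothetical folder listing.
listing = [
    "IMG_0001.CR2", "IMG_0001.CR2.dop", "IMG_0001_v1.jpg",
    "IMG_0002.CR2", "IMG_0002.CR2.dop",
    "IMG_0003.CR2",
]
pending = unfinished_originals(listing)
print(pending)
```

With a real folder, the listing could come from something like `[p.name for p in pathlib.Path(folder).iterdir()]`.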

Ferdinand

Yes, I do some of this. I have a script that does some checking, and I use data-driven categories to some extent. But like others, I probably don't do enough of this, and if I did more what it would mostly show is that I am way behind in my image management.

You have some new functionality in mind? You want more details on what we currently do?

Mario

Quote from: Ferdinand on July 15, 2015, 04:49:09 PM
You have some new functionality in mind? You want more details on what we currently do?
Nothing specific. Just checking if there is demand for built-in functionality (App/Script) and what should be covered. Fishing for ideas. I have recently written a script for a commercial user which does some rather deep checks and validations. Very specific. As expected, we found the metadata quality lacking. IMatch was used to consolidate about 200,000 files from almost two decades. Standardizing the metadata and finding files with missing or invalid data was the main purpose. Now a team of people will update the data to make this large collection usable. This will take a while  ;D

The feature set for IMatch 5.5 is complete (and I'm working hard to get the features finished so you can all benefit from it). And I also have some exciting things planned for the releases following the 5.5 release. And from the experiences gathered in the project mentioned above, I wonder how many users really care about metadata quality. It requires effort to fill in the metadata in a consistent way, to add keywords and the like. But once done, and the data safely stored in your images or sidecars, you will benefit forever - and the coming generations as well.

Many of the tools required for this are already in IMatch (one only has to use them), but maybe the problem affects a sufficiently large number of users to make a more automatic / one-click approach sensible.

Erik

I do and am doing a lot of the above with regard to metadata.

I use data-driven and/or formula categories to identify files with missing metadata. I'm still cleaning up files from my conversion from IMatch 3, where some of my keywords are in a hierarchy and some are not. That is a one-time cleanup using a small script and an ExifTool command-line routine.

Regardless, the one thing I miss (and I know it is purposely left out) is the ability to color code formula-based categories. On the bright side, I'm working around it with some more complicated replacement strategies and labeling with data-driven categories.

Carlo Didier

Quote from: JohnZeman on July 15, 2015, 02:38:43 PM
I use formula categories to ensure I have the basic categories assigned to each image.  For me since I use the basic 5 W system (Who, what, where, when, and why) every photo in my database must have at least a when and where assigned since every photo was taken in some location at some specific time.  Screen shot below.
Exactly the same here.

Ferdinand

How much more detail do you want / need? I am in transit for another week, but could supply something more then.

Richard

Like John and Carlo, I too make extensive use of the five Ws. New files are automatically added to all five categories. If a file does not fit, I remove it.

jch2103

Quality control is one of the key reasons I began using IMatch, and in particular IMatch 5. I learned early on that one doesn't know how bad (inconsistent, incomplete, etc.) one's metadata are until one starts using a first class metadata management tool.

I currently have about 40,000 images, ranging from new digital images to scanned family/genealogy negatives and prints that are well over 100 years old. As a consequence, I also have a range of metadata problems and issues that I'm trying to tackle a bit at a time. The following is an incomplete list of tools/approaches I'm currently using. As others have done, I've made use of IMatch data-driven categories as a primary tool, including color coding specific categories.


  • Missing date. This pertains mostly to scanned images where no original date was available when the scans were made. I used XMP::photoshop\DateCreated\DateCreated\0 for a 'Missing Date' category, using 'Part of the value' and start and length of 1,1. I used the 'Other' element, which I then color coded. This combination makes it easier to find and correct such images.
  • Missing location. I try to assign Country/State/City and sometimes Location data to images. To find files with missing location data, I use Composite\Country\Country\0 in a data-driven category, color coding 'Other', and also adding the other hierarchical location categories as additional levels.
  • Missing GPS coordinates. Similar to above (i.e., taking advantage of 'Other'), but no color coding (yet).
  • Keywords. I use color coding to keep track of files I've uploaded to Smugmug, using a color coding of all hierarchical keywords beginning with 'Smugmug'. That helps me ID files I have or haven't yet uploaded.
  • File verifier. I've made some test runs of the file verifier script. I need to run this on a larger selection.
  • Collections. When I find an issue or images that require some special attention, I use Collections to temporarily tag photos with one of the subcategories (e.g., Dots, Flags or Pins). After completing work on these files, I delete the subcategory.
  • Ad hoc folder color coding. Similar to above, but where I temporarily color tag a folder I'm working on (e.g., to assign metadata to scanned images), to make it easier to return to it between sessions.
  • File window layout. I use a custom file window layout (thanks again to Marcus for his contributions here) that helps ID various small issues.

This is obviously an incomplete list that really reflects how much work I still need to do in cleaning up my metadata! I for one would welcome additional tools to assist in this process.
John
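
The "Missing date" approach from the list above (group files by part of a value and look at what ends up under 'Other') can be mimicked in a few lines of Python. This is only a sketch of the idea as John describes it, not of how IMatch builds data-driven categories internally. The `records` dictionary, the `DateCreated` key and the `bucket_by_part` helper are hypothetical stand-ins for real metadata.

```python
from collections import defaultdict


def bucket_by_part(records, field, start=1, length=1):
    """Group file names by a slice of a metadata value.

    start is 1-based, as in the 'start and length of 1,1' setting above.
    Files with an empty or missing value are collected under 'Other'.
    """
    buckets = defaultdict(list)
    for name, meta in records.items():
        value = (meta.get(field) or "").strip()
        part = value[start - 1:start - 1 + length]
        buckets[part or "Other"].append(name)
    return dict(buckets)


# Hypothetical sample metadata.
records = {
    "scan_001.tif": {"DateCreated": ""},
    "scan_002.tif": {"DateCreated": "1912-06-01"},
    "IMG_0420.jpg": {"DateCreated": "2015-07-15"},
    "scan_003.tif": {},
}

buckets = bucket_by_part(records, "DateCreated")
print("Files without a date:", buckets.get("Other", []))
```

The same helper works for the location case by passing a country field instead of the date field.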

Ger

For me, quality management is one of the key assets of IMatch. If you have x000 images in your database, one of the benefits is finding that particular image or group of images you need. In order to be able to achieve this, your database organization must be up to date: categories and metadata.

For consistent data entry I use the thesaurus where possible (e.g. keywords, geography).
For checking completeness and correctness, I use data-driven and formula-based categories (e.g. location), color coding, file window display formulas and a few quick scripts.

Ger


mastodon

Missing date, time, place and presumably false GPS coordinates. Finding similar words that should be identical (places, names), because I did not use the thesaurus.
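
Finding values that are almost, but not quite, identical can be approximated with a string similarity ratio. The Python sketch below uses `difflib.SequenceMatcher` from the standard library. The sample place names, the `near_duplicates` helper and the 0.85 threshold are assumptions chosen for illustration. A real run would take the values from the keyword or location fields and would likely need a tuned threshold.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(values, threshold=0.85):
    """Return pairs of distinct values whose similarity ratio >= threshold."""
    unique = sorted(set(values))
    pairs = []
    for a, b in combinations(unique, 2):
        # Compare case-insensitively so "glasgow" and "Glasgow" count as close.
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


# Hypothetical place names as they might appear in a location field.
places = ["Edinburgh", "Edinburg", "Glasgow", "Inverness", "Glasgw"]

pairs = near_duplicates(places)
for a, b in pairs:
    print(f"Possible duplicate: {a!r} / {b!r}")
```

Each reported pair still needs a human decision, since two genuinely different places can have very similar names.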

Carlo Didier

One area where I regularly have inconsistent metadata is location information. My GPS coordinates are correct, but geonames often returns information that may be correct but is impractical. This is probably due to the different types of addresses in different countries.
Examples:
"State/Province": often does not exist, especially in small countries, but geonames still fills in something that sometimes doesn't make much sense.
"City" and "Location": For small villages in the Scottish Highlands, "City" gets filled with information about a municipality or region (which I would rather see under "State/Province") and "Location" with the name of the village.
"Sublocation": often gets the name of a village, whereas I would rather put a street address or the name of a building there ...

I'm still debating with myself whether to go through the trouble of manually changing these things or just let geonames rule (but I have also seen cases where geonames changed the designation of locations from one year to another, which introduces another inconsistency).

Mario

GeoNames.org is a free service, run by volunteers. It is based on information gathered from official sources, government data and data entered by volunteers. You can support GeoNames.org by updating/extending/correcting their information. If you think that some small villages or areas in Scotland are not mapped properly, you can easily correct the information, and this helps all other users. See the GeoNames web sites for details.
As you have noticed, geo-data (not only with GeoNames) is always in flux, especially in remote areas or third world countries.

To buy better map data, Google/Bing spend millions of dollars from the money they make from selling advertising and our personal data. They can afford to buy data from commercial map and data providers. The data quality is usually better than GeoNames.org, but when using Google/Bing you have to accept their rules and privacy options - which means reading screen after screen of legal text and then deciding whether or not it's worth it.

Carlo Didier

I didn't want to criticize geonames.org, just to point out potential problems regarding the quality of the data, so people know that it isn't a perfectly accurate and authoritative reference.

Mario

I understood that. I just explained things for the benefit of other users (and the community search function) and explained how to improve the GeoNames.org database.