Import just PDF or text files?

Started by MyMatch, August 28, 2016, 08:14:47 PM

MyMatch

In IMatch 3, I had several databases for documents and scripts, which I could build perfectly well with the file-type selection when adding new folders.

In IMatch 5.5, this seems to be gone. It imports anything it wants to import, and so I end up with lots of different things in the same database, which I certainly don't want.

Is there some way to force IMatch to import *just* *.pdf files?
Or "generic" text files, as for my other documentation, HTML pages and PHP scripts?

If this is not possible, I will lose this fantastic use for IMatch, or will need to continue with 3.6.

Thank you.

Mario

IMatch by default manages all files in the folders you index in your database.
It is not advisable to mix managed and un-managed files in folders you use in your IMatch database.

If what you 'see' in IMatch does not match what's actually in the folder (e.g. you index only PDF files in IMatch, but the folder also contains images, Office documents and whatnot) there is a danger of

- data loss (when you delete the folder in IMatch, not knowing that it contains other files as well)
- irritating behavior (when you rename or copy/move files and get warnings about existing files which you cannot see in IMatch)
- issues with versioning and propagation

I doubt very much that many users will ever see the need to only index _some_ files in the folders they manage in IMatch. It's always better to keep files you don't want to manage in IMatch outside of the folders you manage in IMatch. Really.

Since this is IMatch, there is of course a way: Edit > Preferences > File Formats allows you to do what you want.

Remember: IMatch is not a file viewer, it is a Digital Asset Management system. By mixing managed and unmanaged files in the same folder, you are working against the system. Better reconsider your file layout and keep files you manage in your DAM separate from files you don't want to manage in your DAM.

PS: As always, press <F1> while the dialog is open to open the corresponding help topic. Or just type file format into the IMatch help system index to find all related info...

MyMatch

Thanks a bunch!
I will try the preferences setting!

I seem to use IMatch in a totally different way than others :D

For example, I have IMatch databases that scan my complete PC, with all disks, for this or that file type. So I use IMatch as some kind of search engine!

This way, I have several IMatch databases that "document" all my content on all disks.
But that only works if I can select what type(s) of documents to add to a certain database.
This way I can easily find things that I would otherwise never find again!
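
The per-database file-type selection described above can be sketched outside IMatch as a plain directory walk. This is only an illustration in Python, not how IMatch indexes folders; the function name `find_files` and the extension list are assumptions made for the example.

```python
import os

def find_files(root, extensions):
    """Yield paths under root whose extension is in extensions.

    Hypothetical helper for illustration; matching is case-insensitive,
    so ".pdf" also matches "REPORT.PDF".
    """
    wanted = {ext.lower() for ext in extensions}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in wanted:
                yield os.path.join(dirpath, name)
```

Calling `list(find_files("D:/", [".pdf"]))` would collect only the PDF files on that drive; a second call with `[".txt", ".html", ".php"]` would build the list for a separate "text" database.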

Of course, your arguments against this do not apply, as I know very well that not everything in those folders is part of this or that, or any, IMatch database. That's the normal case for me.

I just like to search for PDFs, text files, scripts, programming work, other documents, images, movies and audio/music on my many, many disks and add them to my IMatch databases.

My main database for images now has about 320000 images, and I have a few more databases for images.
IMatch is great as such a "generic search and management" engine for files!

Please don't lose its usability for such things!
I am just testing IMatch 5.5 ... :D

MyMatch

Let me add:

I have nearly no use for all that metadata, EXIF information and so on.

I would very much like to be able to add such data only optionally, at a later time.

For me, it is just important to see the folder structure on my disks and the categories I used for my files, as well as the rating:

o Locations on disks and filesystem
o Categories
o Ratings (and maybe Labels)

That's it.

For any and all file types.

Great thing.


Mario

Metadata is the core of IMatch, like with any other DAM system.

Many features, from the Timeline View to the Metadata and Keywords Panel, data-driven categories, the Map Panel, search and filter functionality and dozens of other features depend on the availability of metadata. IMatch already offers you features to skip and trim rarely needed or not-intended-for-humans metadata, but importing standard metadata from files is a core feature of the file import. Even knowing how to rotate your files for display requires EXIF data.

If you want rating and label, you work with XMP metadata. To produce XMP metadata for your files, IMatch has to import and map EXIF, GPS and legacy IPTC data during the import, which means it has to import this data.

MyMatch

Now, Rating and Label *may* be XMP data, but for me they are just entries in a tree of categories and fully separate from the images.
There is not even a need to write them outside the database, and if such a need comes, I would only write them into sidecar XML files so as not to touch the images themselves.

Anyway ;-)

sinus

Quote from: MyMatch on August 31, 2016, 01:51:17 AM
I seem to use IMatch in a totally different way than others :D


Yep, that is for sure. A bit strange for me.  8)

But then work only with categories (you can build a very good label/rating system with categories) and/or attributes.
And make your files read-only, to be sure.

And do not let IMatch write back to the file or use an XMP file.

Though this is against the system.
Best wishes from Switzerland! :-)
Markus

Mario

Quote from: MyMatch on August 31, 2016, 10:17:45 PM
Now, Rating and Label *may* be XMP data, but for me they are just entries in a tree of categories and fully separate from the images.
There is not even a need to write them outside the database, and if such a need comes, I would only write them into sidecar XML files so as not to touch the images themselves.

Anyway ;-)

There are very good reasons for standardized ratings and labels, and for storing them in the XMP record. The primary reason is data exchange with other applications. If you use more software than IMatch, you should immediately see the benefit of seeing ratings/labels assigned in IMatch in other applications (Photoshop, Windows Explorer, Lr, your RAW processor if you use one, your favorite photo web site etc.).

There is also a standard where to write XMP metadata. For file formats like JPEG, PNG, TIFF, PSD, DNG etc. XMP must be embedded in the file. For proprietary RAW formats, XMP data must be held in a separate sidecar file. All that is industry standard and based on the specifications and recommendations of the Metadata Working Group. IMatch fully supports this standard.

Even if the XMP data is held in sidecar files, it has copies of IPTC and EXIF metadata contained in the original image file. And if you make changes to EXIF or IPTC metadata, or to XMP metadata that has to be synchronized with EXIF/IPTC/GPS metadata contained in your image files, IMatch will have to write to both the external sidecar file and the embedded metadata.

After almost four years and probably millions of images written that way, I can assure you that this is safe. IMatch relies on ExifTool for a reason.

MyMatch

To add this ...

I really have no interest in changing metadata.
Neither EXIF, nor IPTC or whatever.

And most important, I don't want my image files to be changed at all!
They are "golden" in the sense that they shall never be changed for whatever reason.
Their checksum shall never change.
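
One way to check the "checksum shall never change" requirement is to record a SHA-256 digest per file before working in IMatch and compare afterwards. The following Python sketch assumes nothing about IMatch itself; `sha256_of`, `build_manifest` and `changed_files` are hypothetical helper names chosen for this example.

```python
import hashlib
import os

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = sha256_of(path)
    return manifest

def changed_files(before, after):
    """List files from 'before' that are missing or differ in 'after'."""
    return sorted(p for p in before if after.get(p) != before[p])
```

Build one manifest before a session and one after; an empty result from `changed_files(before, after)` means none of the recorded files were modified or removed.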

When I need to export or import something like ratings, only XMP sidecar files shall be used and created.

And as I understand it now, this means that I need to change 2 to 4 settings for 30+ file extensions in the "File Format Metadata Options", which is a bit horrifying for me :-O

sinus

If it is that important to you that your files are not touched (which I cannot fully understand), then I would do some tests if I were you.

For example:

Make copies of several files.
Keep these copies in a folder that is not indexed by IMatch.

Then do some steps of your workflow in IMatch, especially steps where you are not sure whether they touch your files.
Write down the steps carefully, so that you know what you have done.

Then, after such steps, compare the files in IMatch with those in the separate folder.
If the files are binary equal, you know that these steps are safe.
If the files are not equal, you know which steps touch and change your files.

And after all, why not simply set your precious files to "read only"?
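
The comparison step of the test described above could be scripted, for example in Python with the standard `filecmp` module. This is a minimal sketch; the function name `modified_files` and the two-folder layout (untouched reference copies versus the folder indexed by IMatch) are assumptions for illustration.

```python
import filecmp
import os

def modified_files(reference_dir, working_dir):
    """Return names of reference files whose working copy is missing
    or not byte-for-byte identical.

    reference_dir: untouched copies kept outside IMatch (assumed layout).
    working_dir:   the folder that was used in the IMatch workflow.
    """
    result = []
    for name in sorted(os.listdir(reference_dir)):
        reference = os.path.join(reference_dir, name)
        working = os.path.join(working_dir, name)
        if not os.path.isfile(reference):
            continue  # skip subfolders in this simple sketch
        # shallow=False forces a comparison of the file contents
        if not os.path.isfile(working) or not filecmp.cmp(reference, working, shallow=False):
            result.append(name)
    return result
```

An empty list after a workflow step suggests that the step did not change the files; any names returned point to files that were touched.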



Best wishes from Switzerland! :-)
Markus

Erik

It sounds like the original poster has a workflow figured out.

My only thought would be to try working with everything in one database. You could create data-driven categories for file extensions and even group those by type, potentially allowing some larger categories or filters to act as a mechanism for dividing up a complete database. That would seem a bit easier than tracking multiple databases for different files. It would mean one set of settings, which would be easier than what you might face.
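
To illustrate the idea of grouping file extensions into larger types, here is a small Python sketch. It does not use or describe IMatch's data-driven categories; the `TYPE_GROUPS` table and the function `group_by_type` are hypothetical and only show the grouping logic.

```python
import os
from collections import defaultdict

# Hypothetical grouping of extensions into broader types (assumption).
TYPE_GROUPS = {
    "Documents": {".pdf", ".txt", ".html"},
    "Scripts": {".php"},
    "Images": {".jpg", ".png", ".tif"},
    "Audio": {".mp3", ".wav"},
    "Video": {".mp4", ".mov"},
}

def group_by_type(paths):
    """Group file paths by type; unknown extensions go to 'Other'."""
    groups = defaultdict(list)
    for path in paths:
        ext = os.path.splitext(path)[1].lower()
        label = next(
            (name for name, exts in TYPE_GROUPS.items() if ext in exts),
            "Other",
        )
        groups[label].append(path)
    return dict(groups)
```

With such a grouping, one file list can be filtered by type instead of keeping a separate list per type.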

Just a thought. I do this to manage video, audio, and photos within one database, but separately for categories and keywords.