imatch - broaden scope ?

Started by Dp-4711, June 20, 2024, 04:16:12 PM


Dp-4711

Hi,

Using IMatch to maintain a picture catalogue is great: searching for locations with the map feature or for persons with the face recognition feature, just to name a few.

I am wondering whether IMatch would also be a good solution for maintaining a catalogue of non-picture documents. It is mostly about scans and office documents converted to PDF. The crawler would need to maintain a catalogue of document attributes (PDF), like date created, author, etc. and - now the magic - of the content, by OCR'ing the words in the documents / scans.

I'm currently playing a bit with paperless-ngx, but that is a big gun. You need a powerful Linux server at home or a hosted VM to get acceptable performance.
The charm of IMatch is its ability to run on client hardware with good response times.

Any hints / ideas / suggestions are very welcome.

Mario

Quote: It is mostly about scans and office documents converted to PDF. The crawler would need to maintain a catalogue of document attributes (PDF), like date created, author, etc. and - now the magic - of the content, by OCR'ing the words in the documents / scans.
IMatch can index PDF files and will extract all available native PDF metadata and XMP like author, title, keywords automatically. When you modify XMP in IMatch, IMatch writes it back to PDF and maps what is possible to native PDF metadata.
IMatch also extracts metadata from Office documents and similar.
That works well enough for most users I guess.

IMatch does not do OCR. I have not found usable OCR software with a permissive license that I could include in IMatch. At least none without a substantial up-front payment plus royalties, either annual or per IMatch license sold.

The AI tests we're currently running in the invite-only AI playground board show some promise. If prompted correctly, the AI can extract the visible text from the thumbnail/preview IMatch has produced for PDF and Office documents, with the typical failure rates. But this is not full PDF OCR.

IMatch would not only need OCR tools but also other infrastructure, unless the extracted text is stored in a text Attribute. But that would make searching a mess and not that useful. Attributes are not designed to hold pages upon pages of text.

So a dedicated storage place would be needed in the database, which stores the full text in a way that supports quick searching + display + variable access + ...
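
For illustration only, here is a minimal Python sketch of what such dedicated full-text storage could look like: a toy inverted index that maps each word to the documents containing it. This is not how IMatch stores anything; the class `FullTextIndex`, its methods and the file names are hypothetical, and a real implementation would also need persistence, ranking, and updates when files change.

```python
import re
from collections import defaultdict


class FullTextIndex:
    """Toy inverted index (hypothetical, for illustration only)."""

    def __init__(self):
        self._postings = defaultdict(set)  # word -> set of document ids
        self._texts = {}                   # document id -> extracted text

    @staticmethod
    def _tokenize(text):
        # Lower-case the text and split it into word tokens.
        return re.findall(r"\w+", text.lower())

    def add(self, doc_id, text):
        """Store the extracted text and index every word in it."""
        self._texts[doc_id] = text
        for word in self._tokenize(text):
            self._postings[word].add(doc_id)

    def search(self, query):
        """Return sorted ids of documents containing every query word."""
        words = self._tokenize(query)
        if not words:
            return []
        hits = set.intersection(*(self._postings.get(w, set()) for w in words))
        return sorted(hits)

    def search_phrase(self, phrase):
        """Return ids of documents containing the words as a contiguous phrase."""
        needle = " ".join(self._tokenize(phrase))
        return [
            doc_id
            for doc_id in self.search(phrase)
            if needle in " ".join(self._tokenize(self._texts[doc_id]))
        ]


index = FullTextIndex()
index.add("invoice.pdf", "This is just a notification about your invoice.")
index.add("letter.pdf", "A notification was sent. Just in case.")

print(index.search("notification"))
print(index.search_phrase("just a notification"))
```

The word lookup is what makes searching quick; the stored text is what would allow display and phrase matching. Neither fits naturally into a per-file Attribute.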

There is a reason why there are dedicated document management systems. They differ quite a lot from DAMs with a focus on images and video files.

rienvanham

Take a look at "Everything 1.5". It is still in alpha, but it works perfectly and is very fast. I scan (and OCR, with ABBYY FineReader) all my documents and can find every document in the blink of an eye.

search-pattern example:
ext:pdf content:"just a notification"

This finds all PDFs with the text "just a notification" in the text layer.

Mario

I'll move this into the feature request board. Seems to fit there better.

philburton

For OCR (and PDF editing) I use Nitro Pro.  Purchase, not subscription like Adobe PDF.  https://www.gonitro.com/

Dp-4711

Quote from: philburton on June 21, 2024, 01:23:15 AM
For OCR (and PDF editing) I use Nitro Pro. Purchase, not subscription like Adobe PDF. https://www.gonitro.com/

Thanks. Yes, OCRing can be done separately (I'm using PDFelement, which is similar to Nitro). The magic is in building the index, comparable to the Windows indexer (previously named Windows Search). Mario hit the point: this does not fit into the attribute space of a document.

The charm is IMatch's crawler: throw a directory at IMatch and it works on that data without any need for uploading or pre-processing.

I don't know "Everything 1.5", but I will take a look at it (thanks for the hint) ...

Mario

Quote: Mario hit the point, this does not fit into the attributes space of a document.

I'm not averse to extending IMatch to make it more useful for all types of users and requirements.
I do that all the time.

Implementing OCR support and adding proper full-text indexing and storage to the IMatch database, to support searches across text extracted from PDF files and maybe Office documents, will take a lot of time and thus be 'expensive'.

From what I know, this kind of PDF handling is not what more than a handful of IMatch users will ever need. I might be wrong.
Spending a couple of weeks or months on something that is beneficial for the majority of users or a substantial number of users is one thing. Spending months for something niche like PDF OCR is another thing.

We'll see how many likes your feature request gets.

philburton

Quote from: Mario on June 21, 2024, 02:33:13 PM
Quote: Mario hit the point, this does not fit into the attributes space of a document.

I'm not averse to extending IMatch to make it more useful for all types of users and requirements.
I do that all the time.

Implementing OCR support and adding proper full-text indexing and storage to the IMatch database, to support searches across text extracted from PDF files and maybe Office documents, will take a lot of time and thus be 'expensive'.

From what I know, this kind of PDF handling is not what more than a handful of IMatch users will ever need. I might be wrong.
Spending a couple of weeks or months on something that is beneficial for the majority of users or a substantial number of users is one thing. Spending months for something niche like PDF OCR is another thing.

We'll see how many likes your feature request gets.


Mario,

My concern is that something like OCR is a bit of "feature creep" and not related to the core mission of iMatch.  Compared with other OCR products, what would be your "competitive advantage?"  I'm not saying that there isn't any, but getting a proper decision with enough information is both time-consuming and perhaps outside of your core skill set.

When I was still working as a product management consultant, I worked with a Swiss client who had an interesting idea for an Outlook enhancement. He ran a survey, and the result was that almost everyone wanted this feature, so he spent a lot of time and resources developing that enhancement. When no one purchased it, he approached the consulting company I worked for to get help. I quickly discovered that he had surveyed the wrong audience. His survey focused on network and system administrators, over 100 in total, who were several levels removed in the organization from the senior management who actually made purchase decisions. And those people were not interested, because they thought this enhancement didn't solve a real problem.

Moral of this story: It's harder to do right than it seems.

Mario

Yes. That's why I ask the actual IMatch users about what features they want, which enhancements to existing features would be helpful etc. All here in the feature request board, out in the open.

There is no harm in asking for something you find useful. Just a quick post.
Other users can comment on it and like it. Which then tells me if something is just for one user, a "nice to have" for some users or "must have" for many.

Most feature requests don't get likes.
But then, only a portion of the community members, which is only a portion of the IMatch user base, reads feature requests...

philburton

Quote from: Mario on June 22, 2024, 09:46:23 AM
Yes. That's why I ask the actual IMatch users about what features they want, which enhancements to existing features would be helpful etc. All here in the feature request board, out in the open.
.
But then, only a portion of the community members, which is only a portion of the IMatch user base, reads feature requests...

Then there is the issue of "sample bias." Are the people who respond truly representative of the user base? Or are they the most "committed," the "most intense users," whose needs and interests differ from those of the entire user base?

My suggestion, and it is just that, is to put a "polling" feature in IMatch. Whenever you have to make a major decision, IMatch will display a question that you put out as a poll, with reply options or a 1-5 scale for need. You are likely to get a broader response, which you can judge by the number of replies vs. the size of the user population. Again, just a suggestion.

rolandgifford

Quote from: philburton on June 22, 2024, 08:50:43 PM
My suggestion, and it is just that, is to put a "polling" feature in IMatch. Whenever you have to make a major decision, IMatch will display a question that you put out as a poll, with reply options or a 1-5 scale for need. You are likely to get a broader response, which you can judge by the number of replies vs. the size of the user population. Again, just a suggestion.

False/wrong answer from that as well.

Before upgrading to V2023, the new feature I was most excited about was AI labelling, without privacy issues. It didn't take me long to find that it isn't actually of any use to me, as 'general' labels are of little interest, so there is no point spending any time on training.

Mario being convinced in his own head that something is worth spending time on is all that is needed. That sometimes comes from feature requests and sometimes comes from 'How do I do' types of question which trigger a more general improvement.

I expect that telemetry helps far more than a poll.

sinus



Without thinking a lot on this, it looks at least interesting to me. 
With telemetry on, you forget about it and are no longer really aware of it (at least that is the case for me).

When filling out a poll (which should not happen often), I would, I think, have the feeling that my answer will be heard.
And that would give me a positive feeling. 
Best wishes from Switzerland! :-)
Markus

Mario

Quote from: rolandgifford on June 22, 2024, 11:09:43 PM
Before upgrading to V2023, the new feature I was most excited about was AI labelling, without privacy issues. It didn't take me long to find that it isn't actually of any use to me, as 'general' labels are of little interest, so there is no point spending any time on training.

This will always be the case. Unless AI evolves enough to make it easy to "configure" it for specific uses.
Did you see this post: https://www.photools.com/community/index.php/topic,14296.msg ?

It utilizes open source software that runs AI models on your local PC. I have written an integration for it for testing purposes, and several users and I are working with it to see if it is any good or useful for IMatch users.

rolandgifford

Quote from: Mario on June 23, 2024, 10:51:42 AM
This will always be the case. Unless AI evolves enough to make it easy to "configure" it for specific uses.
Did you see this post: https://www.photools.com/community/index.php/topic,14296.msg ?

I did see that post.

My comment was intended to illustrate the unreliability of polls, with an example case where my answer to the poll would have been wrong.

I wasn't intending to criticise the AI labelling, which I expect will continue to be inappropriate to my needs. Labelling aspects I don't label manually doesn't improve the quality of my data; I had to try it to see that. I'm sure it will be excellent for others who take different types of photos, or use them differently to me.

Mario

In my experience, users (not only those using digital asset management software) typically don't know what they want or, more importantly, what they actually need – which often differs from their initial requirements.
This is a significant reason why many software projects fail to deliver on time or within budget.

Approaches like rapid prototyping, Agile development, and minimum viable product (MVP) strategies attempt to involve users directly and early in the process. While these methods can be effective at times, they are not foolproof and may not always yield the desired results.

It is challenging for "normal" people to envision how a feature will look or function when it currently only exists as an idea in the software architect's or developer's mind. Especially when it's not the user's job and only involves software they use more or less frequently, like IMatch.

Asking in the form of a poll that pops up in IMatch when all a user wants is to start working with her/his photos from the last vacation will not provide any useful result, as you state correctly. Requests and discussions with questions and answers in this feature request board work better, IMHO. A good measure for me is e.g. the ratio of views to Likes and how many users commented.

My typical approach when introducing new features is to use the MVP methodology.
I create a comprehensive plan, build a solid foundation, and then decide which elements to include in the initial release. This allows users to start working with the feature and provide feedback on what's missing or needs improvement. If enough users request enhancements or changes, I'll implement them.

I consider adding other elements from my initial plan when I believe they will benefit a significant number of users. And, usually, some elements of my initial plan never get implemented. Because sometimes things are just finished.
With the constant demands on my time, it's often enough to have something good enough.

philburton

Quote from: Mario on June 23, 2024, 02:42:20 PM
In my experience, users (not only those using digital asset management software) typically don't know what they want or, more importantly, what they actually need

Trust me, that issue applies more broadly. People ask for features, instead of describing their problem.

Quote: Approaches like rapid prototyping, Agile development, and minimum viable product (MVP) strategies attempt to involve users directly and early in the process. While these methods can be effective at times, they are not foolproof and may not always yield the desired results.

How do you do an Agile approach, when photools is a company of one?  Who is the Product Owner?  The Scrum Master?

Quote: My typical approach when introducing new features is to use the MVP methodology.

Who helps define the Minimum Viable Product?

Quote: With the constant demands on my time, it's often enough to have something good enough.

Which is the reality.  As long as it achieves the MVP.

Jingo

As both an independent and corporate developer/project manager, it is often very easy to undertake an "Agile" approach to software development without always employing the complete "formal/official" meanings.

At work, we do employ scrums, owners, champions and fully embrace a Jira/Confluence/SVN workflow.

For my personal work, I tend to be a bit more loose with things... still utilizing "Agile" approaches for product control, but I tend to forgo "scrums" and such... since it is just me! I use Agile in this case to mean utilizing production tools and software to quickly go from concept to product in a short and easy manner. JS, Angular, SQL and Chrome allow me to do just that.

philburton

Quote from: rolandgifford on June 22, 2024, 11:09:43 PM
Quote from: philburton on June 22, 2024, 08:50:43 PM
My suggestion, and it is just that, is to put a "polling" feature in IMatch. Whenever you have to make a major decision, IMatch will display a question that you put out as a poll, with reply options or a 1-5 scale for need. You are likely to get a broader response, which you can judge by the number of replies vs. the size of the user population. Again, just a suggestion.

False/wrong answer from that as well.

Before upgrading to V2023, the new feature I was most excited about was AI labelling, without privacy issues. It didn't take me long to find that it isn't actually of any use to me, as 'general' labels are of little interest, so there is no point spending any time on training.

Mario being convinced in his own head that something is worth spending time on is all that is needed.

Uh, Mario could be an absolute rock star, which I think he is. But unless he "eats his own dogfood," at some point he may not be able to imagine all that customers want. I agree that customers may not always know what they want, and that sometimes they are attracted to the latest "shiny object." Point is, there is no perfect answer to this question.

Relying on just your customers' input leads to the Innovator's Dilemma. As a (retired) product management consultant and before that a regular employee, I've seen that happen. To learn more than you perhaps need to know, see https://en.wikipedia.org/wiki/The_Innovator%27s_Dilemma

Quote: That sometimes comes from feature requests and sometimes comes from 'How do I do' types of question which trigger a more general improvement.
Absolutely agree. What your customers can't do (easily) with your product is priceless feedback.
Quote: I expect that telemetry helps far more than a poll.
Using one does not preclude using the other. Of course telemetry is very useful for identifying areas for optimization or improvement. But it doesn't identify missing features.

Mario

Quote: Absolutely agree. What your customers can't do (easily) with your product is priceless feedback.

I have not seen software that polls the user for feedback, at least not in a way that does not get in the way.
What would such a poll look like?

A popup in IMatch asking the user if she/he is satisfied with the product?
Which feature they like most?
If they have any suggestions or feature requests? With a link to this board or another feedback channel?
...

I happily do what's good for users. Not sure if a possibly "getting in the way" poll prompt is the right way.

We're so used to constant nag screens like:

"Let's finish your Windows setup" (and link your Windows installation forever to a Microsoft account we can monetize)

or

"Enable backup of your photos so your smart phone loads them into the cloud we control" (so we can monetize your images)

and similar.

Most people are just annoyed or even angered by this.
Not sure if adding such a nag/poll screen to IMatch will do any good.

But I'm neither a rock star nor all-knowing. And always open to suggestions.
Now I have to go back and fix a bug.

P.S.: If somebody knows somebody in marketing, I would be grateful. It is almost impossible these days to get the word out without substantial financial investment. Even web sites that depend on publishing information about new products or writing reviews don't react, or want to see some up-front payment to even consider you...