Fetching Images (Files) From A List

Started by Darius1968, May 02, 2017, 02:35:32 AM


Darius1968

I've been examining this thread with some interest, and it seems to be along the lines of what I need, but no cigar!
https://www.photools.com/community/index.php?topic=4703.msg31865#msg31865

More specifically, I have a list of files in the Windows Notepad, which I could save to a text file.  It has the full file path of a bunch of files that I know are in my database.  Is there a script available that would go through this list, and put the files in a category? 

Thanks! 

Mario

No. This is such a specific requirement (create a category from file names contained in a text file) that this will require a custom script. Not hard to do, but not built in.

ubacher

I have a script which will serve you as a base. It takes a list of files and finds them in the database and gives them a label.

It is specific to my situation (unique file names in my db) but you can change the parsing of the file name etc.
I attach it.


Mario

This would also serve as a great "My First IMatch 2017" app example. Or, when you have IMatch Anywhere, you can write the script already  ;)

There is an endpoint in IMWS that allows you to fetch data for a file by path.
So the script would need to do this:

Read the text file.
For each row, look up the file data in IMWS. Put the file id into an array.
When done, assign all files in the array to the category.
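
The first step above (reading the text file) could be sketched like this in JavaScript. This is only an illustration, not part of any IMatch sample: the function name parseFileList is made up, and the sketch assumes one full path per line, as saved from Notepad. The lookup and category steps are left out because they depend on the IMWS endpoints.

```javascript
// Hypothetical helper: turn the raw text of the list file into an array of paths.
function parseFileList(text) {
  const lines = text
    .replace(/^\uFEFF/, "")        // drop a UTF-8 byte order mark, if present
    .split(/\r?\n/)                // accept both CRLF and LF line endings
    .map((line) => line.trim())    // remove stray spaces around each path
    .filter((line) => line.length > 0); // skip empty rows
  // A Set keeps the first occurrence of each path and preserves the order.
  return [...new Set(lines)];
}

const sample = "C:\\Photos\\a.jpg\r\n\r\n  D:\\Scans\\b.tif  \r\n";
console.log(parseFileList(sample));
```

The resulting array is what you would then pass on to the lookup step.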

Ger

Maybe you can have a look at this script. I use it regularly.
The text file contains an image (full path and name) per line; all images found will be added to a category you can define after program start.

(Happy to get a version of this script in javascript as well!)

Ger

Mario

Quote from: Ger on May 07, 2017, 01:46:46 PM
(Happy to get a version of this script in javascript as well!)
Ger

Thanks for sharing.

To find files by name in IMatch/IMWS you can use the /files endpoint. It supports a path parameter which allows you to request information about a file by the file name. The id returned can be used to call the endpoint which adds files to a category.

Alternatively, the special /files/lookup endpoint has been designed to find sets of file names very quickly. I use this in some specialized import apps to quickly check whether or not the files in a database table exist in IMatch.

The files/lookup endpoint takes an array (a list) of file names. It returns an array with the id of each found file, or 0 when the file was not found. Several thousand file names can be processed in a few milliseconds with this endpoint.

This is a small sample that shows how you would use this:

1. You store all your file names in an array.

2. You send this array to IMWS (remote or embedded in IMatch).

3. IMWS responds and sends back an array with either the id or 0 for each file. You can then use these ids to add the files to a result window or a category or whatever. The example just displays the ids in the browser console.
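
A minimal sketch of step 3 in JavaScript, for illustration only. It assumes the response is a plain array of numbers in the same order as the file names that were sent; the function name splitLookupResult is made up, and the actual request to IMWS is left out because it depends on your installation.

```javascript
// fileNames: the array sent to the lookup endpoint.
// ids: the array that came back (assumed: same order, 0 = file not found).
function splitLookupResult(fileNames, ids) {
  if (fileNames.length !== ids.length) {
    throw new Error("Expected one id per file name");
  }
  const found = [];
  const missing = [];
  fileNames.forEach((name, i) => {
    if (ids[i] > 0) {
      found.push({ name, id: ids[i] }); // file exists in the database
    } else {
      missing.push(name);               // 0 means no match
    }
  });
  return { found, missing };
}

const result = splitLookupResult(
  ["C:\\Photos\\a.jpg", "C:\\Photos\\b.jpg", "C:\\Photos\\c.jpg"],
  [4711, 0, 4713]
);
console.log(result.found.map((f) => f.id)); // ids to assign to a category
console.log(result.missing);                // names to report back to the user
```

Keeping the missing names around is handy: the user usually wants to know which rows of the list did not match anything.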


Ger

Thanks Mario...

I have about six or seven scripts I still use and want to convert so I have to learn some javascripting...

Ger

Mario

Learning a bit about JavaScript and HTML will help you in many ways, not only for IMatch...

And, remember: Apps can be written to work with both a local IMatch for Windows installation and a remote IMatch Anywhere WebService.

Ger

Ja ja...

A lot of water will have flowed down the Rhine (and every other river) to the sea before my programming knowledge is up to speed. I still have to start with IMatch Anywhere. I'm afraid it will take until IMatch 2017 is available...

My first experiments will be with Mario's demo apps (lots of them, I hope), to learn from and steal with pride.

Ger

Mario

I ship over 20 samples with IMatch 2017. Plus about 10 more or less complex apps, to show some real-life programming. Plenty of stuff to study.

The (almost) final documentation for IMWS and IMatch, the class libraries, tutorials, recipes etc. are already on-line and available here:

https://www.photools.com/developer-center/

Remembering this thread while looking for a good example to demonstrate how to work with idlists, I've implemented a sample app named File Finder. It combines the old "File Name Digit Match" used by many with the "search for a list of file names" requested here.

The key aspect of this script is to demonstrate how to iterate over all files in the database using an idlist and in batches. It makes no sense to "download" data for 400,000 files in one go. This is a massive amount of data, the user may switch to another app etc...

The idlists concept I've implemented for IMWS allows you to tell IMWS: "Give me data for the files in this idlist. Use a page size of 50 and give me the files on page 10". This means that if the idlist contains 1000 files, it has 20 pages with 50 files each. By requesting page by page, you can iterate over any number of files, processing them in convenient batches.

This may sound complicated, but it is really easy. And it makes your script work independently of how many files there are to process.
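
To make the page arithmetic concrete, here is a small JavaScript sketch. It does not talk to IMWS at all: it only mimics the batching with a local array of ids, whereas IMWS does the slicing on the server. The names pageCount and pages are made up, and numbering pages from 0 is an assumption of this sketch.

```javascript
// Number of pages needed for totalFiles at the given page size.
function pageCount(totalFiles, pageSize) {
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new RangeError("pageSize must be a positive integer");
  }
  return Math.ceil(totalFiles / pageSize);
}

// Yield one batch of ids per page (pages numbered from 0 in this sketch).
function* pages(ids, pageSize) {
  const count = pageCount(ids.length, pageSize);
  for (let page = 0; page < count; page++) {
    yield ids.slice(page * pageSize, (page + 1) * pageSize);
  }
}

console.log(pageCount(1000, 50)); // 1000 files at 50 per page: 20 pages

// Process a (tiny) list in batches of 2; the last batch may be smaller.
for (const batch of pages([11, 12, 13, 14, 15], 2)) {
  console.log(batch);
}
```

In a real app each batch would come from one request for one page, and the loop body would do the per-file work before asking for the next page.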

The File Finder app always searches all files in the database. On my system, it searches 10,000 files in less than 1 second! Which is pretty good. IMatch 2017 scripting is really faaast. For a 380,000 files database, it takes about 5-8s to search all files.

This is how the app looks:


Ger

That's where I definitely see advantages of the new scripting engine:

Quote: The File Finder app always searches all files in the database. On my system, it searches 10,000 files in less than 1 second! Which is pretty good. IMatch 2017 scripting is really faaast. For a 380,000 files database, it takes about 5-8s to search all files.

My old file name match script will easily take minutes to find matches for a limited set of source images. I'm still looking forward to the new version (although I think I will have the old 5.8.4 version running next to it for quite some time... at least until I have my basic workflow scripts converted). :)

I will be studying the samples and trying to steal with pride!

Ger

Mario

'Borrowing' code is normal in programming and a proper development practice.

The samples use 3rd party libraries like jQuery for working with the HTML, the Twitter Bootstrap classes for a responsive cross-device user interface, and the FontAwesome web font for icons and stuff (I was an early backer of the new version on Kickstarter). These are all standard JavaScript libraries written by smart people, and free to use. They save a lot of time and allow me to concentrate on the "IMatch part", which is where the music plays. Writing this script helped me to identify two bugs - good!

The final version of the File Finder App performs superbly even for large databases. Searching the matches for 10 source files in 380,000 total files takes less than 5 seconds on my PC. And it even looks neat. As so often, 90% of the code is the UI, only 10% is for the search functionality. If you only need one button and you can hard-code everything you can write it in maybe 20 lines of code.

I simplified it a bit by always presenting the result in a result window. From there the user can quickly assign the files to a category or whatever.

This app can serve as the 'steal from' base for all apps which need to search or process all files in the database in some way.

I even designed a slick icon for it  ;)


Mario

While testing the specific IMatch / IMWS endpoints used for the File Finder (which was the actual reason to create it) I've added an option to choose where to search (database, active file window). This makes sense in many situations, e.g. when you want to identify specific files in large categories or all matching files in a specific year.

I've also added a 3rd search mode (Similarity Search) which searches for file names by the so-called Levenshtein distance. I think I had something similar in a pre-3 version of IMatch or even in version 3...

This search mode allows you to find similar file names, file names with a typo, file names which differ from the source file names by 0, 1, 2, ... characters etc. Very useful, helped me today already. The problem was to find an implementation that works for realistic file sets. The first implementation I tried took over a minute for 200,000 files. Ugh. The solution I finally implemented takes about 8 seconds for 380,000 files. That's OK in my book.
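
For readers who have not met the Levenshtein distance before: it counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. Below is the plain textbook algorithm in JavaScript, as an illustration only. It is not the implementation used in the File Finder app, which is not shown in this thread, and the helper findSimilar and the case-insensitive comparison are assumptions of this sketch.

```javascript
// Textbook Levenshtein distance using two rows of the dynamic programming table.
function levenshtein(a, b) {
  if (a === b) return 0;
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // turning i chars of a into an empty string costs i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // deletion
        curr[j - 1] + 1,   // insertion
        prev[j - 1] + cost // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: keep the candidates within maxDistance of the source name.
function findSimilar(source, candidates, maxDistance) {
  const s = source.toLowerCase();
  return candidates.filter((c) => levenshtein(s, c.toLowerCase()) <= maxDistance);
}

console.log(levenshtein("kitten", "sitting")); // 3
console.log(findSimilar("IMG_1234.jpg", ["IMG_1243.jpg", "IMG_1234.JPG", "DSC_0001.jpg"], 2));
```

Comparing every source name against every file in the database this way costs time proportional to the product of the two name lengths per pair, which is why the choice of implementation matters so much for large databases.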