Problem with CSV Import

Started by loweskid, October 03, 2019, 05:56:51 PM

loweskid

I've been importing metadata for several thousand files that I had submitted to an agency (Alamy).  For a time I only used their online tools for captions and keywords so I didn't have the metadata in my own copies here.  Alamy have now provided an option to download a spreadsheet with all the data so I decided to make use of it.  To start with, to make sure everything was working correctly, I first imported just the Alamy reference number into the 'Job ID' field.  All my images are in separate folders depending on subject so I started at the smallest and worked my way up each folder in turn to the largest, which has over 7000 files.  This went like a dream, a total of over 23,000 images - no problem.  I then set about importing the Captions (Description) and Keywords.  Again everything worked perfectly up until the very last folder.  Most of these already had captions and keywords so I'm down to just over 2000 files but now the CSV Importer has decided it doesn't want to play anymore - I keep getting a message that the files are not found in the database.

To simplify things - I've put three files in a 'temp' folder.... F:\Alamy\temp\   and I'm just importing the caption.

The CSV file contains (screenshot from Notepad++ included on attached image)...

"Filename","Caption"
"F:\Alamy\temp\man-1643.jpg","caption1"
"F:\Alamy\temp\man-1642.jpg","caption2"
"F:\Alamy\temp\man-1641.jpg","caption3"

As you can see from the screenshot, it's reporting it can't find the files. I just can't figure out why - bearing in mind that I've already successfully imported the Alamy reference number into these files.

Can anyone spot something I'm missing?

PS - the two icons at top right indicate that the Description and Keyword fields are empty.  As soon as any data is entered they disappear.

Mario

Hard to tell without your database.
I've made a quick test using this file and it worked.


"FileName","Caption"
"D:\data\TEST\samples\_DSC890351.jpg","Lara (E&W Models) for DC group."


and it imported correctly.

Must be something in the input file.
Make a copy, remove all rows but the first two and import from that copy.
If it does not work, tinker with the file contents until it does.
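
A minimal Python sketch of that trimming step, in case editing the file by hand is awkward. It assumes the file is saved as UTF-8 and that the first row is the header; the function name and the file names in the usage line are made up for illustration and have nothing to do with IMatch itself.

```python
import csv

def write_trimmed_copy(src_path, dst_path, keep_rows=1):
    """Copy the header plus the first `keep_rows` data rows to a new CSV file.

    Assumption: the source file is UTF-8 and its first row is the header.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        # QUOTE_ALL keeps every field in double quotes, like the samples in this thread.
        writer = csv.writer(dst, quoting=csv.QUOTE_ALL)
        for index, row in enumerate(reader):
            if index > keep_rows:  # index 0 is the header row
                break
            writer.writerow(row)
```

For example, `write_trimmed_copy("alamy.csv", "alamy-test.csv")` would leave the original untouched and write a two-row copy to import from.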
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

ColinIM

I see that the 'Caption' tick box is not ticked.

Is that possibly related?

I'm wondering if the "files not found" message is perhaps a convoluted response from the import dialogue when it sees that it has "nothing to do ...", due to the Caption import being DE-selected???

Sorry I haven't ever used this feature and I haven't tested my theory .... I just wondered about the logic of having that Caption box un-ticked.

thrinn

The check box shows which one of the different columns contains the file name. So it is correct that only one column is ticked. In fact, if you tick one column, all others are reset.
I think it must be something else.
Thorsten
Win 10 / 64, IMatch 2018, IMA

loweskid

Thanks for the comments.  I think I may have found the problem - it seems to be related to having a hyphen in the file name.  I removed the hyphens and it worked perfectly.  I've only tried it with the three files so far - I'll try it with a few more then if it works I'll go for the whole 2,000+ files.  I'll let you know.

The mystery is why this problem didn't show up when I imported the Alamy reference numbers.

loweskid

Oh well, I spoke too soon.  I tried it with another 40 files and I'm getting a different error message now.....

Import failed with 'Bad Request{ "error":{ "code":1102, "message":"Invalid parameter.", "details":{ parameter":"tag.value",  "description":"value is missing, empty or invald." } } }'

But I've already used the same setup to successfully import to thousands of files.  Anyway, I've had enough for today - I'll see if I can figure it out tomorrow.

thrinn

If it worked with the Reference numbers, maybe the problem lies in the content of the caption or description columns? I do not know Alamy but I suppose a reference number is of a well-defined format, does not include special characters etc. But a caption/description may contain anything. So this might be a possible reason why the reference number works, but caption/description do not.

From my experience (not related to IMatch) CSV files often give headaches when used with non-structured text fields. So maybe it is worth checking:

  • Do the caption/description fields contain characters also used as delimiters? Especially commas, semicolons, or quotes?
  • Any other special characters which might need "escaping"? I think the CSV importer should take care of that, but who knows?
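
One way to run the first check without reading every row by eye is to count the fields per row. This is only a sketch using Python's standard `csv` module, assuming comma as delimiter and double quotes as quote character; the function name is made up for illustration. A caption with a comma outside its quotes shows up as a row with more fields than the header.

```python
import csv
import io

def find_malformed_rows(csv_text):
    """Return (line_number, field_count) for rows whose field count differs from the header.

    Assumption: comma-delimited text with double quotes around fields.
    """
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    problems = []
    for row in reader:
        if len(row) != len(header):
            # reader.line_num is the physical line on which this record ended
            problems.append((reader.line_num, len(row)))
    return problems

sample = (
    '"Filename","Caption"\n'
    '"a.jpg","fine, the comma is inside the quotes"\n'
    '"b.jpg",caption, with an unquoted comma\n'
)
print(find_malformed_rows(sample))
```

Here only the third line is reported, because the quoted comma in the second line is treated as part of the caption.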

Mario

Show us a small sample file which fails to import. Try to boil it down to one or two rows. I can use that for debugging the app.
Usually the CSV import has no troubles, unless the quoted ("") text fields contain unescaped quotes, or the file is saved in a local encoding using a local code page and not in UTF-8 or Unicode format.
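
Both conditions can be tested outside the app before importing. The following is a rough Python sketch, not how the importer itself works: it only accepts UTF-8 (with or without a byte order mark), so a UTF-16 file would be reported as not UTF-8, and it relies on the standard `csv` module's strict mode to reject a quote that is not doubled inside a quoted field. The function name is made up for illustration.

```python
import csv
import io

def check_csv_bytes(data):
    """Return a list of human-readable problems found in raw CSV bytes.

    Assumption: the file is expected to be UTF-8, comma-delimited, double-quoted.
    """
    try:
        text = data.decode("utf-8-sig")  # also accepts a UTF-8 byte order mark
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8 at byte offset {exc.start}"]
    problems = []
    # strict=True makes the parser raise csv.Error on stray quotes
    reader = csv.reader(io.StringIO(text, newline=""), strict=True)
    try:
        for _ in reader:
            pass
    except csv.Error as exc:
        problems.append(f"line {reader.line_num}: {exc}")
    return problems
```

Read the file in binary mode and pass the bytes in, for example `check_csv_bytes(open("alamy.csv", "rb").read())`; an empty list means neither problem was found.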

loweskid

I tried again this morning with the same 40 files and got the same error message.  I then tried 5 at a time until I got the error message, then whittled it down to one file (of course, it had to be the last one!).  The source of the problem was two consecutive commas in one place where there should have been a single delimiter.

I then did a 'find and replace' on the main CSV file and it found just one other instance.  I then ran the Importer on the remaining 2,107 files and it worked perfectly.
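
For anyone hitting the same error, a quicker route than importing in batches might be to scan the CSV for empty fields first. This is just a sketch in Python and rests on an assumption: that a doubled comma is read as an empty value between two delimiters, which would fit the "value is missing, empty" wording of the error. The function name is made up for illustration.

```python
import csv
import io

def rows_with_empty_fields(csv_text):
    """Return (line_number, column_indexes) for data rows that contain empty fields.

    Assumption: a doubled comma such as "b.jpg",,"caption" parses as an empty field.
    """
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    next(reader, None)  # skip the header row
    hits = []
    for row in reader:
        empty = [i for i, value in enumerate(row) if value.strip() == ""]
        if empty:
            hits.append((reader.line_num, empty))
    return hits

sample = (
    '"Filename","Caption"\n'
    '"a.jpg","caption1"\n'
    '"b.jpg",,"caption2"\n'
)
print(rows_with_empty_fields(sample))
```

The third line is the one reported, with the empty value in the second column (index 1).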

This doesn't explain the original problem where it reported 'files not found'.  I tried to reproduce this with those same three files and it's working perfectly today with the hyphens back in the file names.  I'll do some more digging and go back to basics from the original Excel file - I'll let you know if I'm able to reproduce it.

Thanks once again for the help and suggestions.


Mario

The syntax of comma-separated (CSV) files is very strict. There is no room for extra commas here and there etc. Maybe Alamy should check their export routines?