Developing a strategy for unknown or imprecise dates

Started by dcb, August 04, 2013, 05:14:01 AM


dcb

As the good Doctor is known to say, "it's more like a big ball of wibbly wobbly... time-y wimey... stuff." And that's the problem with IMatch 5's new timeline view. It expects the date and time of a photo to be somehow known and precise. This is not the fault of IMatch in any way. It is caused by a set of standards that don't account for imprecise or unknown dates.

For this conversation I will make some simplifications. When I say "inside the photo", take it as read that I mean the image file, the XMP sidecar or something else as appropriate.

A real quick background

If you're unaware, digital cameras record the date and time a photo was taken inside the photo file itself. When that happens (assuming the clock in the camera is correct) the IMatch 5 timeline works perfectly.

A scanner will use the time the photo is scanned. IMatch 5 will place it in the timeline according to the scan date. If you're expecting the timeline to be a calendar representation of photo dates then you will be disappointed. It's more a timeline of file creation dates.

The four possibilities for photo dates


  • You know the date and time and it is correct. Do nothing.
  • You know the date and time and it is incorrect. Change it using the metadata panel in IMatch. The timeline will adjust accordingly.
  • You know approximately when the photo was taken. You can be quite precise about the date, probably not about the time. Your best guess might be the day or the more imprecise month, season, year, or decade.
  • The date is unknown. You have no idea at all other than knowing it was after the invention of the camera!

Items 3 and 4 are problematic for IMatch. The ISO 8601 standard for dates and times assumes what you know is accurate. There are two parts to the problem. The first is recording what you think the date is, and the second is flagging that the recorded date is a guess.

A partial solution

A full solution to this is a long way off. We need a standard and then software to follow that standard. My search took me to the world of genealogy, where imprecise and unknown dates are common. I found a good article based on the GEDCOM standard, which uses the prefixes ABT (about), BEF (before), AFT (after) and BET (between) (see http://www.werelate.org/wiki/Help:Date_Conventions). I personally don't like it because it moves away from the YYYY-MM-DD date format. Further searches led me to the Library of Congress Extended Date/Time Format (draft), or EDTF for short. This I do like because it extends the ISO 8601 standard quite well.

Under this approach, photos of a formal school ball that I know were from my late school years can be marked as 1987-11~ (approximately Nov 1987). A photo from Christmas morning where the year is totally unknown could be uuuu-12-25. If I know it to be in the early 2000s (from memory) I might write 200u-12-25. Then as I homed in on a date it might become 2004-12-25~ (approximately 2004).
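As a rough illustration of what these strings mean, here is a minimal Python sketch that turns the simple forms used above into an earliest and latest possible date. It is not a full EDTF parser: the function name `edtf_range`, the default year limits (1826 to 2013) and the choice to return the `~` or `?` flag untouched are all assumptions made for this example, and awkward combinations such as an unspecified month with day 31 are not handled carefully.

```python
import calendar
import re
from datetime import date

# Accepts YYYY, YYYY-MM or YYYY-MM-DD where any digit may be "u",
# optionally followed by "~" (approximate) or "?" (uncertain).
PATTERN = re.compile(r"^([0-9u]{4})(?:-([0-9u]{2}))?(?:-([0-9u]{2}))?([~?]?)$")


def _bounds(field, lo, hi):
    """Smallest and largest integers in lo..hi that fit a field such as '200u'."""
    width = len(field)
    fits = [n for n in range(lo, hi + 1)
            if all(f in ("u", d) for f, d in zip(field, str(n).zfill(width)))]
    if not fits:
        raise ValueError(f"no value in {lo}..{hi} matches {field!r}")
    return fits[0], fits[-1]


def edtf_range(text, min_year=1826, max_year=2013):
    """Return (earliest, latest, flag) for a simple draft-EDTF style string.

    min_year and max_year are arbitrary limits for a fully unknown year
    (an assumption for this sketch, adjust to suit your own collection).
    """
    match = PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a supported date string: {text!r}")
    year, month, day, flag = match.groups()
    month = month or "uu"   # a missing part is treated like an unspecified one
    day = day or "uu"
    y_lo, y_hi = _bounds(year, min_year, max_year)
    m_lo, m_hi = _bounds(month, 1, 12)
    d_lo, _ = _bounds(day, 1, calendar.monthrange(y_lo, m_lo)[1])
    _, d_hi = _bounds(day, 1, calendar.monthrange(y_hi, m_hi)[1])
    return date(y_lo, m_lo, d_lo), date(y_hi, m_hi, d_hi), flag


for text in ("1987-11~", "uuuu-12-25", "200u-12-25"):
    print(text, edtf_range(text))
```

The range only reflects what the string literally pins down. A `~` or `?` flag means the real date may still fall outside it, so the flag is passed back for the caller to deal with.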

Getting this into IMatch

The EDTF standard does not fit the IMatch 5 timeline and I don't expect it to for a long time, if ever. Here is what I have tested on my catalog:


  • Place the EDTF date in the comment field (I still want a better field for this)
  • Create a data-driven category that reads the comment field - this groups like-dated photos together, as the timeline does, and brings all such photos together under the parent category
  • Colour all photos in the category - this marks them throughout IMatch as having a timeline/metadata date that should be ignored in favour of what is in the comment field

Summary

There will never be a right or wrong way to do this. What works for you, works for you. I hope this will help you find what that is.

-David
Have you backed up your photos today?

Richard

Quote: The date is unknown. You have no idea at all other than knowing it was after the invention of the camera!

I wish! I have image files of scanned documents that were created centuries before the invention of the camera. Some of the resulting images can be dated precisely because the date is shown on the document itself. Most of the time I have to make a good guess at when the document was created. The date the document was photographed or scanned means nothing.

Entering estimated dates in IMatch has always been a PITA but I developed my own workaround. I can't imagine how IMatch would deal with "uuuu-12-25" or even a date range like 1790-1800 in the timeline. Something like 1795 ± 5 might work.

Mario

Well, in fact the timeline is able to handle arbitrary groups of files. It's a matter of how the files are processed on import and how the individual nodes in the timeline are created. The current timeline has been designed to work for the standard dates used in archiving and imaging. For images it uses one of the three EXIF timestamps, for other file formats it uses other date information.

XMP and (with limits) IPTC and EXIF support incomplete dates, e.g. a date consisting only of the year. But supporting this makes things complicated on many levels, from ExifTool (which does not support them) to all the standard Windows calendar functions. Comparison routines for dates, sorting by dates, all of that becomes more complicated or has to be developed from scratch in order to handle situations where files with full dates and partial dates are processed in the same context.

And, based on experience, the majority of users only ever work with complete dates. For scanned photos, it is fairly easy to add a date and time original / digitized / created with the built-in functionality in IMatch, the scanner software or ExifTool. This will then arrange the files properly in the timeline. For other cases you can usually use something like 1.1.1900 (for 1900) or 1.6.1900 (second half of 1900) to produce a complete date. For the purpose of including these files in the IMatch timeline this is usually sufficient. You can further mark these photos with a keyword, comment, Attribute, dot, pin, bookmark, flag or a special XMP label to make it easy to find and identify photos with uncertain dates.

There are basically two approaches to handle uncertain/incomplete dates in the long term:

1. Enhance the timeline and all date-related functionality in IMatch to handle uncertain dates (lots of effort)
2. Add a new view to view files on a separate timeline which is designed for only this purpose.

For the second approach, IMatch Apps may be the solution. There are several quite capable time line JavaScript libraries available which could be used in an IMatch App. For example:

http://www.simile-widgets.org/timeline/

http://timeline.verite.co/

(scroll down to see the examples)

The data for these scripts is usually supplied via JSON structures, which can easily be filled by an IMatch 5 App from whatever data is in the database, metadata or Attributes alike.
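To give an idea of what such a JSON structure might look like, here is a small Python sketch that turns a list of photo records into a JSON document with one event per photo. The field names (`events`, `title`, `start`, `end`, `note`) and the sample records are hypothetical. They do not follow the schema of either library linked above, so they would need to be mapped to whatever the chosen timeline script actually expects.

```python
import json


def to_timeline_json(photos):
    """Serialise photo records as a generic list of timeline events.

    Each record is a dict with 'file', 'earliest' and 'latest' keys
    (ISO date strings) and an optional 'note'. All names are illustrative.
    """
    events = []
    for photo in photos:
        events.append({
            "title": photo["file"],
            "start": photo["earliest"],   # first day the photo could be from
            "end": photo["latest"],       # last day the photo could be from
            "note": photo.get("note", ""),
        })
    return json.dumps({"events": events}, indent=2)


sample = [
    {"file": "school_ball.jpg", "earliest": "1987-11-01",
     "latest": "1987-11-30", "note": "approximate"},
    {"file": "christmas.jpg", "earliest": "2000-12-25", "latest": "2009-12-25"},
]
print(to_timeline_json(sample))
```

Giving every event a start and an end means an imprecise date can be drawn as a span rather than forced onto a single day.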


dcb

Quote from: Richard on August 04, 2013, 07:35:41 AM
Entering estimated dates in IMatch has always been a pita but I developed my own work around.

Richard, what was your work-around? I had a category tree of year-month-day which meant I could add something just to the year-month level if that's all I knew. It was never perfect and couldn't cover the options that the EDTF mentioned above does. It also needed a script to populate and then a property field to say, "If you run the script, don't modify this value". It also needed another category to say, "This is an imprecise date".

Quote from: Mario on August 04, 2013, 09:57:06 AM
Well, in fact the timeline is able to handle arbitrary groups of files. It's a matter of how the files are processed on import and how the individual nodes in the timeline are created. The current timeline has been designed to work for the standard dates used in archiving and imaging. For images it uses one of the three EXIF timestamps, for other file formats it uses other date information.

For the standard functionality I don't think it should be any other way, Mario. The implementation somewhat matches the system I had with date categories created with a script -- except now I don't have to worry about the script.

Quote from: Mario on August 04, 2013, 09:57:06 AM
XMP and (with limits) IPTC and EXIF support incomplete dates, e.g. a date consisting only of the year. But supporting this makes things complicated on many levels, from ExifTool (which does not support them) to all the standard Windows calendar functions. Comparison routines for dates, sorting by dates, all of that becomes more complicated or has to be developed from scratch in order to handle situations where files with full dates and partial dates are processed in the same context.

True enough. And it still doesn't handle marking these as unknowns vs approximations.

Quote from: Mario on August 04, 2013, 09:57:06 AM
And, based on experience, the majority of users only ever work with complete dates. For scanned photos, it is fairly easy to add a date and time original / digitized / created with the built-in functionality in IMatch, the scanner software or ExifTool. This will then arrange the files properly in the timeline. For other cases you can usually use something like 1.1.1900 (for 1900) or 1.6.1900 (second half of 1900) to produce a complete date. For the purpose of including these files in the IMatch timeline this is usually sufficient. You can further mark these photos with a keyword, comment, Attribute, dot, pin, bookmark, flag or a special XMP label to make it easy to find and identify photos with uncertain dates.

I'm only talking about 1% of my photos here. I do expect it will grow to 10% as I scan in older images. Richard's case sits further away from the "majority of users" by the sounds of it.

My "imprecise" images will show up in the timeline with the scanned date modified to be as close as I can get it. Then I'm using one version of what you suggest above: a comment that is then used by a data-driven category (screenshot attached). I would rather use an XMP field over a pin or flag because that will stay with the image (once metadata is written back, of course).

Quote from: Mario on August 04, 2013, 09:57:06 AM

2. Add a new view to view files on a separate timeline which is designed for only this purpose.

For the second approach, IMatch Apps may be the solution. There are several quite capable time line JavaScript libraries available which could be used in an IMatch App. For example:

http://www.simile-widgets.org/timeline/

http://timeline.verite.co/

(scroll down to see the examples)

The data for these scripts is usually supplied via JSON structures, which can easily be filled by an IMatch 5 App from whatever data is in the database, metadata or Attributes alike.

I've used the simile timeline in the past within a wiki. It's a great tool and a good idea to suggest it. We would still need some kind of field to store the data. Apps are something for me to look at. 20,000 images might also slow things down. Never know until we try.

Thanks, David.

[attachment deleted by admin]

Richard

Quote: Richard, what was your work-around?

01/01/1850 12:00 means it happened after this date
12/31/1850 12:00 means it happened before this date
06/30/1850 12:00 means it happened sometime in 1850

In IPTC I would use the Fixture Identifier field to enter things like C. 1850, b. 1850, a.1850 or 1840-1860 to clarify the date I had entered for "Created". I have not decided how I will handle date approximations in IMatch 5 but it is likely that I will create a text field in Attributes where I can use text to explain my numbers.  I am already using Attributes to show: Birth, Baptism, Marriage, Death, and Burial dates for photos of relatives. I can use these dates to help guess when a photo was taken.
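For anyone wanting to read these placeholder timestamps back out with a script, the convention above could be decoded along the following lines. This is only a sketch: the function name `describe` is made up, and it assumes that 12:00 means noon and that any timestamp at exactly 12:00 on one of the three marker days is a placeholder, so a genuine photo taken at noon on one of those days would be misread.

```python
from datetime import datetime

# The three placeholder days described above, keyed by (month, day).
MEANINGS = {
    (1, 1): "after {d}",
    (12, 31): "before {d}",
    (6, 30): "sometime in {y}",
}


def describe(ts):
    """Translate a placeholder timestamp into words, or report it as exact."""
    key = (ts.month, ts.day)
    # Assumption: only timestamps at exactly 12:00 count as placeholders.
    if (ts.hour, ts.minute) == (12, 0) and key in MEANINGS:
        return MEANINGS[key].format(d=ts.date().isoformat(), y=ts.year)
    return "taken at " + ts.isoformat(sep=" ", timespec="minutes")


print(describe(datetime(1850, 1, 1, 12, 0)))
print(describe(datetime(1850, 6, 30, 12, 0)))
print(describe(datetime(1850, 3, 4, 9, 15)))
```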

jch2103

Quote from: dcb on August 04, 2013, 05:14:01 AM
...
Further searches led me to the Library of Congress Extended Date/Time Format (draft), or EDTF for short. This I do like because it extends the ISO 8601 standard quite well.

Under this approach, photos of a formal school ball that I know were from my late school years can be marked as 1987-11~ (approximately Nov 1987). A photo from Christmas morning where the year is totally unknown could be uuuu-12-25. If I know it to be in the early 2000s (from memory) I might write 200u-12-25. Then as I homed in on a date it might become 2004-12-25~ (approximately 2004).
...
Place the EDTF date in the comment field (I still want a better field for this)
...
There will never be a right or wrong way to do this. What works for you, works for you. I hope this will help you find what that is.

-David

Thanks for your extended post - very interesting for those of us coping with scanned images, etc.

Be sure to post if you figure out a better field to store EDTF data in. Maybe you can create a standard for IM5 users to follow...

John

dcb

Quote from: jch2103 on August 04, 2013, 11:51:45 PM
Be sure to post if you figure out a better field to store EDTF data in. Maybe you can create a standard for IM5 users to follow...

I've been hunting for a better field and have found one. The Dublin Core Coverage field is my new favourite. From the specification, "Coverage will typically include spatial location (a place name or geographic co-ordinates), temporal period (a period label, date, or date range) or jurisdiction (such as a named administrative entity). Recommended best practice is to select a value from a controlled vocabulary..."

I will personally stick to the Library of Congress Extended Date/Time Format (EDTF) as it meets the needs of my library, particularly if I use the Level 2 version. There is no standard within the standard on how to reference it so I'll go with the Dublin Core Coverage field containing values such as:


  • EDTF2:1987-09~
  • EDTF2:uuuu-12-25
  • EDTF2:1958-(02)?

View this with a data-driven category called "Imprecise dates" set as follows (I may improve on this as I get more data)




  • Basic Settings:Tag = XMP::dc\coverage\Coverage\0
  • Formatting:Use part of value = Yes
  • Start and Length = 4,15

The 4,15 strips the leading EDTF2: and takes the next 15 characters - more than enough. I'm unsure however why the start value had to be 4. I think it's a bug but I'm not sure. I would have expected 6 (or 5 if zero-based).
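For checking outside IMatch what the category should end up showing, the same extraction can be sketched in Python by matching the prefix rather than relying on a fixed start position. The function name `coverage_to_edtf` is hypothetical, and this says nothing about how IMatch itself counts the start value.

```python
PREFIX = "EDTF2:"


def coverage_to_edtf(value, max_len=15):
    """Return the date part of a Coverage value, or None without the prefix."""
    if not value.startswith(PREFIX):
        return None          # e.g. a place name stored in the same field
    start = len(PREFIX)      # 6 characters, counting from zero
    return value[start:start + max_len]


for value in ("EDTF2:1987-09~", "EDTF2:uuuu-12-25", "EDTF2:1958-(02)?", "Sydney"):
    print(repr(value), "->", repr(coverage_to_edtf(value)))
```

Matching on the prefix also skips Coverage values that hold a place name or jurisdiction instead of a date, which the field is allowed to contain.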

The resulting category can be seen below.


[attachment deleted by admin]

Russell

Hi Guys,

I've got a lot of scanned images too - family history type things, primarily... so I have a whole cluster of Categories/Tags to work with these kinds of things.  Hopefully this is of value to someone searching... Here's what I do:

1.  Date and timestamp the file as close as I can.  If I only know the year, it'll be marked as 7/1/YEAR (an approximate midpoint of the year).  My times are always MIDNIGHT if I don't know them, since there's unlikely to be any photos with that real time.

If I know the month, but not the date, I simply trade the 7 for the actual month.  If I don't know a year, I simply guess it so my files will work on timelines and other processes.

2.  My categories/tags look like this:
Month Verification
-- Approximate
-- Calculated from Note on Media
-- Exact                (like a newspaper article, known graduation date, birthday, etc.).
-- Noted on Media

Year Verification
-- Approximate
-- Calculated from Note on Media
-- Exact
-- Noted on Media

3.  I also store the Month and Year in my metadata - as best as I can guess it (as explained in Step 1.).  Then, in my categories and metadata, I have the month, year, and the verification for both (sometimes I know the exact year, but don't know the exact month).

And my files are timestamped based on Step 1, so they'll sort correctly.  (I store all my files in Year folders, then Month subfolders, and sometimes day-of-the-month subfolders.)
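The Step 1 rule is simple enough to put in a few lines of Python if the timestamps are being generated by a script. This is a minimal sketch with a hypothetical function name: it uses July for an unknown month, the 1st for an unknown day and midnight for the time, as described above.

```python
from datetime import datetime


def placeholder_timestamp(year, month=None, day=None):
    """Build a complete timestamp from whatever parts of the date are known.

    Unknown month -> July (approximate midpoint of the year),
    unknown day -> the 1st, and the time is always midnight.
    """
    return datetime(year, month or 7, day or 1, 0, 0)


print(placeholder_timestamp(1950))         # year only
print(placeholder_timestamp(1950, 3))      # year and month
print(placeholder_timestamp(1950, 3, 17))  # full date, time unknown
```

The timestamp alone cannot say which parts were guessed, which is why the separate Month Verification and Year Verification categories are still needed.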

Hope that's a helpful idea to others,

Russell

dcb

I've just come back to this topic. I like your idea of the category splits, Russell. I will consider that today as I start dating several hundred files with unknown or inexact dates. At the moment they are all grouped together, with metadata timestamps placing them in the 30 Dec 1899 slot.

Erik

I know it's an old topic, but it's one I've tried to deal with in the past and haven't really looked into recently (I haven't been scanning images lately).

For background, most images in my DB have precise years.  When other parts of the date are imprecise, I used 00 to represent the unknown quantity (month or day).  The caveat is I did this many years ago, and I'm not sure if ExifTool would prevent this now.

For instance, I would use 00 as the day if I knew the year and month: e.g. 2010-01-00 means the image was taken in January 2010, but I don't know on which day.  The time for that is somewhat arbitrary.  If I didn't know the month I think I was using 00 as well, but I also know at one point I was arbitrarily using February 30 as a date when all I knew was the year (since there is no such thing as February 30, but most software doesn't know that).  The non-real date meant that there was no confusion.  At one point I was toying with the first of a month at midnight, but then I started finding issues with some astrophotography I was taking.

Separately, in my IM3.6 DB, I have a manual category under date called Quarter that divides a year into 4 chunks, roughly corresponding to seasons.  This allows me to qualitatively bin images so that those without known dates can be found roughly through my category tree.  This never made it to metadata, and I haven't really ported it over to my test DBs in IM5.  I also suspect that ExifTool probably has better data validation.  I was only ever writing the dates to EXIF, so I might not have been using ExifTool in those days, as I think IM3.6 did metadata writing in EXIF with its own editor.  It's amazing how little you can use things once you can rely on a camera to fill all the data for you.

dcb

It's the past that didn't cater for the future. If only cameras had always had accurate and complete metadata.  ;)

Elinor Adman

Hi - I'm two days into using IMatch 5 and have a pretty good image collection in IMatch 3.6 of my personal images, plus some involved with genealogy.  Probably half of my images are scanned, and therefore don't have exact dates, let alone times.  I built categories decade.year.month, but never put this info into any metadata. Similarly I have location.continent.country.state.city.sublocation categories.  I would love to know the best way to get this info into XMP files. I suspect some of my Content categories could well make it into keywords, once I know how to port the category info elsewhere!
Many thanks for any help you can offer :)
-Ellie

Mario

Quote: I built categories decade.year.month
...
I have location.continent.country.state.city.sublocation categories

1. IMatch 5 has an effective category system like IMatch 3, and a lot more features. There is no real need to change your approach if it has worked so far.

2. You can use the Metadata Templates feature in IMatch to copy arbitrary data into metadata tags. The data source for the copy operation can be a variable, or a combination of variables. Using the powerful variable formatting functions you can extract all or parts of the category name, and then copy the result into keywords. For your location data, check out the location tags country, state, city provided by XMP. These are the best fields to hold such data.

3. You can use the VarToy App (in the HTML Panel) to try out variables. This gives you immediate feedback, e.g. when you play with variables which allow you to access your categories. If you are satisfied, you can copy the variable into a Metadata Template and choose the target metadata tag.

Read the help topic on variables in the IMatch help, it has many examples, also for categories.


Damit

I know this is an old thread, but I am trying to develop a strategy for this. While there are some good ideas here, I am wondering if more people can chime in, or if a better strategy can be enacted using IMatch. I am really open to any ideas or suggestions.

JohnZeman

Well, my system isn't anything special. For old photos taken way back when cameras didn't have an internal clock, I simply add a note at the end of the description that says

"Date and time were wildly estimated as this photo was scanned many years later."