Disaster Strikes: Power outage -> Corrupt DB -> Recovery

Started by Ferdinand, January 16, 2015, 03:46:00 PM

Ferdinand

The purpose of this post is just to vent.

Something that I've always been worried about happened a short time ago.  I had a sudden momentary power outage.  Because these things happen, I've trained myself not to leave my live DB open in IMatch if I'm not using it.  However on this occasion it was open as I had just been using it. 

I have a UPS that is supposed to protect me from these things, but for some reason yet to be determined, it let me down on this occasion.

And sure enough, on a restart the DB was unrecoverably corrupt.  So I had to restore the last backup of a couple of days ago from my Trueimage incremental backup of that date.  Success, and mercifully I had not lost much as I had been working on other things, or using the DB but not changing anything much.  The hard and risky part is remembering what you changed.  If you've been trawling through images and flagging them here and there, you have to try and remember those choices.

So things worked roughly as they should.  Even so, I still regard this as IMatch 5's Achilles' Heel.  We have a lot of power that we never had in IMatch 3.6, but there's a risk we never had either.  No matter how often you backup, when this inevitably happens, like death and taxes, you're pretty much guaranteed to lose something, and the only question is how much?  I was lucky this time.

Mario

IMatch 5 uses the same database system as IMatch 3.5 and 3.6. Just a more modern version of it - the same database system used by Apple, Google and others.

I regularly simulate Windows crashes and power failures (as well as possible without endangering my hardware, e.g. by just stopping a VM while IMatch is writing data). The databases usually survive such things without any trouble.

IMatch sets up new databases to use the 'Normal' synch mode, which is the same mode IMatch 3.6 used. The default mode would be FAST, which means less synching (synching forces Windows to flush data to the physical disk and waits for Windows to finish) and more performance. If you experience problems on your system and you use the FAST default mode, try switching to Normal. I use all my databases in FAST mode (local disk, USB stick and NAS).
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

StanRohrer

This is the reason I have Pack N Go backup set as a question for every IM shutdown. I've been burned a couple of times. So now I do the backup after any significant work and even multiple times on a busy day. My backup (with optimize) takes 8-10 minutes so I just try to pick a time when I'm off to another project for a bit. My corruption issues have included power glitches and IM glitches (fortunately getting fewer as we get updates). Better safe than sorry.
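For readers who script their own backups, here is a minimal sketch of an online backup of a database file. It assumes the database engine is SQLite (a guess made later in this thread, not a confirmed fact) and uses Python's standard `sqlite3` module. It is not how Pack N Go works, and the names `backup_database`, `catalog.db` and the `images` table are hypothetical.

```python
import sqlite3
import tempfile
from datetime import datetime
from pathlib import Path


def backup_database(source_path: Path, backup_dir: Path) -> Path:
    """Copy a SQLite database to a timestamped file and return the copy's path."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target_path = backup_dir / f"{source_path.stem}-{stamp}.bak"
    source = sqlite3.connect(str(source_path))
    target = sqlite3.connect(str(target_path))
    try:
        # Connection.backup copies a consistent snapshot, even if the
        # source database is open elsewhere.
        source.backup(target)
    finally:
        target.close()
        source.close()
    return target_path


def demo_backup() -> int:
    """Create a small hypothetical database, back it up, count rows in the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        live = Path(tmp) / "catalog.db"  # hypothetical file name
        conn = sqlite3.connect(str(live))
        conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, rating INTEGER)")
        conn.executemany("INSERT INTO images (rating) VALUES (?)", [(3,), (5,)])
        conn.commit()
        conn.close()

        copy_path = backup_database(live, Path(tmp) / "backups")
        copy = sqlite3.connect(str(copy_path))
        try:
            return copy.execute("SELECT COUNT(*) FROM images").fetchone()[0]
        finally:
            copy.close()
```

A script like this could be run by a scheduler between sessions, which shortens the window of work that can be lost but, as discussed in this thread, never closes it entirely.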

JohnZeman

Boy this brings back memories and not good ones because a year or two ago almost the exact same thing happened to me.  Only in my situation it WAS my UPS that caused the loss of AC power to my computer.

This was back in the beta testing days when we regularly had to trash our test databases and create brand new ones as Mario made major changes to the program.  So at the time I was only using a couple hundred images, not the over 50,000 I have in my database now.

But the bottom line is when that sudden power outage happened I lost it all, everything.  Database, preference settings, everything.

Which is one of the reasons why I am almost fanatical about doing daily backups and imaging of my entire computer system.

Richard

Quote from: StanRohrer on January 16, 2015, 04:55:53 PM
So now I do the backup after any significant work and even multiple times on a busy day. My backup (with optimize) takes 8-10 minutes so I just try to pick a time when I'm off to another project for a bit.

8-10 minutes to safeguard an hour or more of work seems like a cheap premium to pay for insurance.

Mario

I still wonder how to track this. The database vendor gives certain guarantees about the conditions under which a database may or may not become corrupt. Basically it boils down to this:

If the database system writes data, Windows acknowledges the data as "written" but fails to physically flush the data to the disk because of a power failure, the database will be corrupt. If the corruption happens in a rarely used location, only a full diagnosis is able to reveal the problem.
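As an illustration of what a "full diagnosis" can look like, here is a minimal sketch that assumes the database engine is SQLite (guessed later in this thread, not confirmed) and uses its `PRAGMA integrity_check`. It is not IMatch's own diagnosis routine. The file name, the table and the helper names are hypothetical, and the damage is simulated by overwriting one page of the file with garbage.

```python
import os
import sqlite3
import tempfile


def is_healthy(db_path: str) -> bool:
    """Return True if SQLite's full integrity check reports no problems."""
    try:
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Severe damage can make the check itself fail with an error.
        return False
    return rows == [("ok",)]


def demo_integrity_check() -> tuple:
    """Check a database before and after simulated damage to one page."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "catalog.db")  # hypothetical file name
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany(
            "INSERT INTO images (name) VALUES (?)",
            [(f"img{i}.jpg",) for i in range(500)],
        )
        conn.commit()
        page_size = conn.execute("PRAGMA page_size").fetchone()[0]
        conn.close()

        before = is_healthy(path)

        # Simulate a lost write: overwrite the second page with garbage.
        with open(path, "r+b") as fh:
            fh.seek(page_size)
            fh.write(b"\xff" * page_size)

        after = is_healthy(path)
        return before, after
```

The point of the sketch is that the file still opens after the damage; only a check that walks every page notices that something is wrong.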

IMatch crashing, the user forcefully stopping IMatch in mid-run while it is writing data (I do that many times a day) etc. will not damage a database - unless the user deletes the journal files in the database folder before IMatch restarts and has had a chance to roll back the open changes.

I still work with databases created during the Beta (!) and I also work with IMatch on a Windows tablet computer which goes to sleep often or even shuts down hard because the power runs out.

I can damage (sometimes) a database by triggering something that causes many writes, keeping the database on a USB disk and then pulling the USB stick out while the LED is blinking - but that will damage any data currently being written. I use this method to determine if the diagnosis routines built into the database system find the damage and if IMatch recognizes the damage as soon as the DB reports it, in order to tell the user.

I run a similar test with my NAS, jiggling the network cable or clicking the WLAN button to turn the WiFi off while IMatch is busy.

Stopping IMatch in the debugger or closing it in the Task Manager will not harm the database because Windows will finish all pending write-operations, and the next time IMatch runs it will "rollback" incomplete operations.
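The rollback behaviour described above can be sketched as follows, assuming the engine is SQLite (guessed later in this thread, not confirmed). A child process commits one row, starts a second transaction and is then killed without committing. This imitates a process crash only, not a power failure, because the operating system keeps running. The table and file names are hypothetical.

```python
import os
import sqlite3
import subprocess
import sys
import tempfile

# Code for the child process: one committed row, then 1000 uncommitted rows,
# then an abrupt exit that skips commit, close and all Python cleanup.
CHILD_CODE = """
import os, sqlite3, sys
conn = sqlite3.connect(sys.argv[1])
conn.execute("CREATE TABLE edits (id INTEGER PRIMARY KEY, note TEXT)")
conn.execute("INSERT INTO edits (note) VALUES ('committed')")
conn.commit()
conn.executemany("INSERT INTO edits (note) VALUES (?)", [("pending",)] * 1000)
os._exit(1)
"""


def rows_after_crash() -> int:
    """Return how many rows survive after the writer process dies mid-transaction."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "catalog.db")  # hypothetical file name
        subprocess.run([sys.executable, "-c", CHILD_CODE, path], check=False)

        # On the next open, SQLite discards the incomplete transaction,
        # using the journal file if one was left behind.
        conn = sqlite3.connect(path)
        try:
            return conn.execute("SELECT COUNT(*) FROM edits").fetchone()[0]
        finally:
            conn.close()
```

Only the committed row remains afterwards. The uncommitted work is lost, but the database itself stays consistent.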

All users who ever have experienced a damaged database should switch the Edit > Preferences > Database: synch-mode to "NORMAL".

Ferdinand

Quote from: Richard on January 16, 2015, 06:46:24 PM
8-10 minutes to safeguard an hour or more of work seems like a cheap premium to pay for insurance.

You've often said this Richard, but it's not quite that simple.  I said "No matter how often you backup, you're pretty much guaranteed to lose something".  What I meant is that unless you're very, very lucky and the outage happens just after your last backup, then you will have changed something in the DB, which you lose.  The question is, how much work did you do since the last regular backup?  Regular, frequent backups will minimise loss, but not prevent it.

I still consider this to be IMatch's Achilles' Heel. How would I feel if I recommended IMatch to a friend, as I often do, and they experienced total loss.  Of course I also warn them about backup, but I can't control what they do, and Murphy's law says that crashes generally happen at the worst possible time.

Quote from: Mario on January 16, 2015, 03:59:30 PM
IMatch 5 uses the same database system as IMatch 3.5 and 3.6. Just a more modern version of it - the same database system used by Apple, Google and others.

Yes, but it interacts with it differently, does it not?  V5 interacts with the DB in the background in ways that V3.6 didn't.  As I was restarting the PC I was trying to recall whether the DB had been open at the time of the outage, and if so, had I been doing anything recently that might have resulted in background activity?  The DB is on a fast SSD, so background activity should be quick.  Well, it had been open; but I don't think I had been doing much at that point, and yet there must have been enough going on in the background to cause corruption.

Sync mode was normal.  I'll switch to the paranoid mode and see what sort of performance hit there is.

Richard

"I said "No matter how often you backup, you're pretty much guaranteed to lose something"."

I know of no insurance policy that will guarantee that a user will never suffer a loss. How much you spend on insurance can only serve to minimize the loss, not eliminate it. If I have spent an hour making changes to my database and that database is lost, YES I will lose something: an hour's worth of work. You could consider it a deductible. If I wait days to make a backup, I will lose days of work. I prefer to keep my policy current so that I do not lose much. At my age I will have enough trouble remembering what I did during one hour, much less hours or days.

RalfC

Quote from: Ferdinand on January 16, 2015, 11:18:14 PM
I said "No matter how often you backup, you're pretty much guaranteed to lose something".  What I meant is that unless you're very, very lucky and the outage happens just after your last backup, then you will have changed something in the DB, which you lose.  The question is, how much work did you do since the last regular backup?  Regular, frequent backups will minimise loss, but not prevent it.

This holds true for any kind of data or file that is written (to any type of storage media). I agree though that the work invested in a database may be much greater than with programs like MS Office (or any other program) which perform an automatic save at defined intervals to keep the loss small [but typically the amount of data that needs to be handled is also much smaller].

But if the storage media dies for some reason, automatic saving does not help either.

Quote from: Ferdinand on January 16, 2015, 11:18:14 PM
I still consider this to be IMatch's Achilles' Heel.

Here I disagree somewhat, as IMHO [in my humble opinion] it is a general problem when working with computers.

Regards,
Ralf

Ferdinand

My comment about the Achilles' Heel was in comparison to V3.6.  In that version, I have no doubt that if a power outage of the sort that I experienced were to happen while it was writing to the DB then the DB would be corrupted.  That's just bad luck and you have to have a backup to get most of your work back.  But if you were to leave V3.6 open doing nothing there was no risk.  Zilch.  That's not the case in IMatch 5.  That's the Achilles' Heel.  You just leave it open seemingly doing nothing.

Is it the case for other programs?  No and maybe.  As was pointed out, Word and other programs do an auto save.  In most programs, if the open file has been saved, then either there's no problem or there's a recovery option.  There aren't many programs that I can think of where the open file will be corrupted in a loss of power or BSOD.

Mario

You are comparing Apples and Oranges here...

If you work in Word and don't save your changes and you cut the power, you will lose all your changes. To minimize the risk of that, Microsoft introduced the auto save feature to at least save the document every n minutes if the user does not do it.

If the power is cut while Word is saving the changes, two things may happen:

a) If Word is saving the changes into the original file, the file is most likely corrupted.
b) If Word is saving a copy of the document to a temporary file, and then deletes the original and renames the temporary file, no harm should be done. At least unless the power failure happens during the move/rename operation and Windows is unable to update the file system correctly.
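The second strategy (write a temporary copy, then swap it in) can be sketched in a few lines. This is a generic illustration, not Word's actual implementation, and the function and file names are hypothetical. Instead of deleting the original and renaming in two steps, it uses `os.replace`, which swaps the files in a single operation.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to a temporary file, flush it to disk, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same folder so the final rename
    # stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the data to the disk
        # Replace the original in one step; readers see the old or the new file.
        os.replace(tmp_path, path)
    except BaseException:
        # If anything failed before the swap, the original file is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def demo_atomic_write() -> bytes:
    """Save a hypothetical document twice and read back the final content."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "report.txt")
        atomic_write(target, b"first draft")
        atomic_write(target, b"second draft")
        with open(target, "rb") as fh:
            return fh.read()
```

If the power fails while the temporary file is being written, the original document is still intact. As noted in b), the remaining risk sits in the rename step and in whether the disk really flushed what it acknowledged.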

IMatch is basically a database server with a GUI. You need to compare it not with a word processor but with databases like Oracle or MySQL or SQL Server. If these systems write data and the disk acknowledges the data as written but fails to physically commit the data to the disk due to a power failure, your database will be corrupted and you need to restore from backup.

The database system automatically commits changes to the disk at certain intervals, and every time IMatch closes transactions. Depending on what you do, IMatch performs thousands of writes, updates, and transactions per minute. IMatch 3 also did updates in the background, e.g. category updates. But less so than IMatch 5, because IMatch 5 does a lot more. This means that when you experience frequent power failures or disk problems, it is more likely that an IMatch database is affected by a power failure or disk problem. But that's in the lower single-digit percentages, if at all.

I have carefully chosen the database system IMatch uses at its core. I have used it since IMatch 3.5 (2006). Google uses it for Android. Apple uses it for iOS. Adobe uses it, and many other companies as well. It has been on the market for a dozen years and is under constant development, sponsored by the big companies in the IT business.

RalfC

I agree that any database is more vulnerable to a power outage or disk failure than a simple program which mainly waits for user input, due to the type of interaction and the amount of data that needs to be written.

But I would not consider it a weakness for any database.

Quote from: Mario on January 17, 2015, 04:41:47 PM
The database system automatically commits changes to the disk at certain intervals, and every time IMatch closes transactions.

The synch mode selection changes the interval, I suppose.

Quote from: Mario on January 17, 2015, 04:41:47 PM
I have carefully chosen the database system IMatch uses at its core. I have used it since IMatch 3.5 (2006). Google uses it for Android. Apple uses it for iOS. Adobe uses it, and many other companies as well. It has been on the market for a dozen years and is under constant development, sponsored by the big companies in the IT business.

My guess would be that you use SQLite. Besides those mentioned, Dropbox and Skype use it, and Airbus uses it in the flight software of the A350 (https://sqlite.org/famous.html).
I think it is fair to say that the DB system is robust.

That leaves the risk of an unexpected failure of the system on which IMatch is running, but in general it should be rather small.

Regards,
Ralf

Mario

Quote
The synch mode selection changes the interval, I suppose.

Correct. In the paranoia mode, every write is synched physically, but that's a performance killer and only needed in really unstable environments or on unreliable computers.
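If the engine is indeed SQLite, as guessed above (this thread does not confirm it), the closest equivalent to a synch mode is the `PRAGMA synchronous` setting. The sketch below sets and reads it back with Python's `sqlite3` module. How IMatch's FAST, Normal and paranoia modes map onto SQLite's OFF, NORMAL, FULL and EXTRA levels is an assumption left open here, and the helper names are hypothetical.

```python
import os
import sqlite3
import tempfile

# Numeric codes that "PRAGMA synchronous" reports, per the SQLite documentation.
SYNC_LEVELS = {0: "OFF", 1: "NORMAL", 2: "FULL", 3: "EXTRA"}


def set_sync_level(conn: sqlite3.Connection, level: str) -> str:
    """Set the synchronous level on a connection and return what SQLite reports."""
    if level not in SYNC_LEVELS.values():
        raise ValueError(f"unknown synchronous level: {level}")
    # The level was validated above, so building the statement is safe.
    conn.execute(f"PRAGMA synchronous = {level}")
    code = conn.execute("PRAGMA synchronous").fetchone()[0]
    return SYNC_LEVELS[code]


def demo_sync_levels() -> list:
    """Cycle a scratch database through three levels, from fastest to safest."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "scratch.db"))
        try:
            return [set_sync_level(conn, name) for name in ("OFF", "NORMAL", "FULL")]
        finally:
            conn.close()
```

The setting applies per connection, so an application has to apply it each time it opens the database. Higher levels make SQLite wait for more flushes to the physical disk, which is the trade between safety and speed described in this thread.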