How long does it take to change your mind?
You’re clever, which means you’re mostly right about things. But everybody is wrong sometimes, so how long does it take for you to change your mind?
People don’t often change their minds quickly. A snap answer is always whatever you were already thinking. If it was a No, you say No. If it was a Yes, you say Yes. If you answer too quickly, you can’t possibly have taken in what was being said to you.
Can you change your mind?
“When the facts change, I change my mind. What do you do, sir?”, often misattributed to John Maynard Keynes.
For humans, changing your mind based on new information takes days or months. If you have emotional objections, it can take months or years. If anyone has research on that, I’d be interested to see it; the above numbers are really just my observation of how we humans behave.
So PostgreSQL adoption has been slow, but it builds over time. Many former users of Oracle or MySQL or other databases now use and accept the PostgreSQL database and I welcome them, as a former Oracle DBA myself.
PostgreSQL itself never changes its mind. Once we COMMIT, that data is there until the next UPDATE/DELETE.
We can’t simply UNCOMMIT a transaction because later transactions depend upon it. In PostgreSQL we literally can’t undo a transaction because we don’t record that information – we use a redo-only transaction log manager.
Someone suggested UNVACUUM to me once. I laughed because my mind was set. It’s only years later that I think about what the user requirement actually was, rather than the specific design proposal used to describe that requirement.
PostgreSQL supports Point In Time Recovery, which I wrote back in 2004. Without scripting, recovering a previous database state that way can take some effort.
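To make the redo-only point concrete, here is a minimal toy sketch in Python. It is not PostgreSQL code: `RedoRecord`, `recover_to` and the integer timestamps are all hypothetical names invented for illustration. Each log record carries only the new value, so there is nothing to walk backwards through. Reaching an earlier state means starting from a base backup and replaying forward until the target time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    commit_time: int  # hypothetical logical timestamp
    key: str
    value: object     # the NEW value only; None means the key was deleted


def recover_to(base_backup, redo_log, target_time):
    """Replay redo records forward from a base backup, stopping at target_time."""
    state = dict(base_backup)  # work on a copy of the base backup
    for rec in sorted(redo_log, key=lambda r: r.commit_time):
        if rec.commit_time > target_time:
            break  # recovery target reached, ignore everything later
        if rec.value is None:
            state.pop(rec.key, None)
        else:
            state[rec.key] = rec.value
    return state


base = {"balance": 100}
log = [
    RedoRecord(10, "balance", 150),
    RedoRecord(20, "balance", 0),       # the change we wish we could take back
    RedoRecord(30, "note", "closed"),
]

before_mistake = recover_to(base, log, target_time=19)
latest = recover_to(base, log, target_time=30)
```

Note that the record at time 20 does not say what the balance was before it, which is why the only route to the earlier state is forward replay from the backup.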
Which is why I’m thinking about how to do in-database-rollback. The best way seems to be to implement Historical Query and that is something I’ll be working on in 2018. Unless I change my mind.
Simon, are you confident that such functionality is really needed in PgSQL and won’t be wasted development effort?
I’ve very rarely seen SELECT AS OF $time (Oracle’s Flashback Query) used, even in an emergency, although in theory it’s a nice thing. Perhaps the reason is that important DBs are nearly always locked down, audited and backed up, so there are not a lot of requests to see what data looked like some time ago. I’ve used Flashback Transaction Back-out; it’s doable, but in real operations it is risky and complex madness, especially when operating on distributed data sets (e.g. microservices).
Just hinting that the features that actually get used, and that give more confidence in the solution in crisis situations, are much simpler things like “undrop table” (from trash, just renaming instead of dropping), simple views with PITR progress data, and easy whole-DB rewind-to-past-time options (which could be achieved outside PostgreSQL using snaps).
At first sight it seems to me that Historical Query is different from in-database-transaction-rollback. While the first one can be accomplished by keeping the expired tuples for a configurable amount of time without vacuuming them, the second one is much more complex to attain. First you need to identify all tuples that have been touched by a given transaction, then you have to create an undo transaction that reverts the effects of the previous one. And this can only be done if the soon-to-be-reverted tuples have not been superseded by newer versions or deleted. But this is not sufficient, because a transaction is represented not only by the set of tuples it has changed but also by the set of tuples it has read to make that change. Simply reverting the changes at a later time could violate database or application level constraints. I think this is a problem that can’t easily be solved in a generic fashion, without application level support.
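The steps described in this comment can be sketched as a toy model in Python. Everything here is an assumption made for illustration: `TupleVersion`, `touched_by`, `can_revert` and `revert` are hypothetical names, and the `xmin`/`xmax` fields only loosely mirror the idea of a creating and an expiring transaction. The sketch covers the write set only. It does nothing about the read-set problem raised above, so a revert it allows could still break application-level constraints.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TupleVersion:
    key: str
    value: object
    xmin: int                   # transaction that created this version
    xmax: Optional[int] = None  # transaction that expired it, if any


def touched_by(heap, xid):
    """Versions created by xid and versions expired by xid."""
    created = [t for t in heap if t.xmin == xid]
    expired = [t for t in heap if t.xmax == xid]
    return created, expired


def can_revert(heap, xid):
    created, expired = touched_by(heap, xid)
    # Everything xid wrote must still be live (not superseded or deleted).
    if any(t.xmax is not None for t in created):
        return False
    # Restoring an old version must not collide with someone else's live row.
    live_keys_from_others = {t.key for t in heap if t.xmax is None and t.xmin != xid}
    return all(t.key not in live_keys_from_others for t in expired)


def revert(heap, xid, undo_xid):
    """Apply a compensating transaction undo_xid that reverses xid's writes."""
    if not can_revert(heap, xid):
        raise ValueError("later transactions depend on this one")
    created, expired = touched_by(heap, xid)
    for t in created:
        t.xmax = undo_xid  # expire what xid wrote
    for t in expired:
        heap.append(TupleVersion(t.key, t.value, xmin=undo_xid))  # restore old values


def live(heap):
    return {t.key: t.value for t in heap if t.xmax is None}


# Transaction 101 updated a from 1 to 2; nothing has touched a since.
heap_ok = [
    TupleVersion("a", 1, xmin=100, xmax=101),
    TupleVersion("a", 2, xmin=101),
    TupleVersion("b", 5, xmin=100),
]
revertible = can_revert(heap_ok, 101)
revert(heap_ok, 101, undo_xid=103)

# Same update, but transaction 102 later changed a again.
heap_blocked = [
    TupleVersion("a", 1, xmin=100, xmax=101),
    TupleVersion("a", 2, xmin=101, xmax=102),
    TupleVersion("a", 3, xmin=102),
]
blocked = not can_revert(heap_blocked, 101)
```

The undo here is itself a new transaction that writes compensating versions, which is the only option when nothing can be uncommitted.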
Could this feature be implemented using the same mechanism as pg_rewind, by using full_page_writes stored in xlog?
I’ve actually thought about this before over the years. My mental model says we could do it by having a GUC that lets you define a period of transaction lag that gives you a rolling window that VACUUM will ignore. So long as that duration hasn’t elapsed, the old data is still there. Given that’s the case, you’d just need syntax to retrieve it. It’s definitely an interesting thought, and I’d love to see if it bears fruit.
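That mental model can be sketched as a toy in Python. It is an assumption-laden illustration, not a design for PostgreSQL: `ToyTable`, its `retention` setting (standing in for the hypothetical GUC), and the `as_of` method (standing in for the missing syntax) are all invented names, and time is a plain integer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    key: str
    value: object
    created_at: int
    expired_at: Optional[int] = None  # None means this is the current version


class ToyTable:
    def __init__(self, retention):
        self.retention = retention  # hypothetical stand-in for the GUC
        self.versions = []

    def put(self, key, value, now):
        """Insert or update: expire the current version, append a new one."""
        for v in self.versions:
            if v.key == key and v.expired_at is None:
                v.expired_at = now
        self.versions.append(Version(key, value, created_at=now))

    def vacuum(self, now):
        """Remove only versions that expired before the rolling window."""
        horizon = now - self.retention
        self.versions = [
            v for v in self.versions
            if v.expired_at is None or v.expired_at > horizon
        ]

    def as_of(self, when, now):
        """Return the table contents as they were at time `when`."""
        if when < now - self.retention:
            raise ValueError("requested time is outside the retention window")
        return {
            v.key: v.value
            for v in self.versions
            if v.created_at <= when
            and (v.expired_at is None or v.expired_at > when)
        }


table = ToyTable(retention=10)
table.put("a", 1, now=1)
table.put("a", 2, now=5)
table.put("a", 3, now=8)

table.vacuum(now=12)                   # horizon is 2, nothing is old enough to remove
inside_window = table.as_of(6, now=12)

table.vacuum(now=16)                   # horizon is 6, the version expired at 5 goes
remaining = len(table.versions)
```

The design choice worth noticing is that the vacuum horizon and the oldest queryable time are the same value, so any time the query accepts is guaranteed to still have its versions on hand.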