Oracle High Availability concepts in PostgreSQL
Sometimes people ask about certain Oracle High Availability features and whether they exist in PostgreSQL.
In most cases, very similar features exist. The reason for the similarity is that PostgreSQL and Oracle have very similar architectures and so the mechanisms to protect data have also developed along the same lines.
For example, Oracle Data Guard is streaming replication of the transaction log, so is very similar in concept to physical streaming replication in PostgreSQL. Active Data Guard is where Oracle users can run queries on a standby server, which again is similar in concept to Hot Standby.
Oracle RAC? You don’t really need Oracle RAC. Or at least qualified Oracle experts tell me so and my own hands on experience confirms that. Not least because disaster recovery for RAC needs to be provided by Data Guard (so why not just use Data Guard). So there is no equivalaent of RAC in the world of PostgreSQL. (Phew!)
Looking for RMAN? There is a backup and recovery tool called Barman which takes advantage of the built-in features for Point in Time Recovery.
There’s more detail there also but a fully detailed explanation is something to discuss as part of technical training or a migration project. The main message is that PostgreSQL supports HA with a range of detailed features that have been in production for many years.
What would be great in PostgreSQL would be an equivalent of Oracle Flashback.
When you compare the difficulty for example to do a PITR in PostgreSQL (and find where the table X has been dropped and then restore manually) compared to a simple “FLASHBACK TABLE xxx TO BEFORE DROP”
That would just be amazing, I don’t know any other database system offering that.
I’m planning to work on that in 2018.
I am a postgres, and I admin also Oracle RAC cluster.
Stream replication and hot standby are very nice feature.
But replication without switchover is not very useful….
With cluster active active like grid infrastructure et RAC database option you have good solution to protect the service.
The hight redundancy disk choice protect you against disks failure.
And I hope BDR will be the solution to have no interruptions of service in case of crash of server.
Or Lost of disk.
Standby DB are a solution for Lost site in case of ?….
Even though HA is available in PostgreSQL. Just some out of discussion question regarding Oracle RAC Support. Whats about Oracle RAC support for scalability, and agility for any application ?
Nice DR comparison between Oracle and PostgreSQL, but HA failover seems to be missing here. Oracle RAC provides HA failover in addition to HA DR (Data Guard), whereas core PostgreSQL does not (manual failover only). Even more interesting about Oracle RAC HA failover is that it provides failover of transactions that have not even completed yet!
Various failover managers exist for PostgreSQL. 2ndQuadrant provides repmgr for that task.
A quick fact-check on the failover thing shows that transactions involving INSERT, UPDATE, DELETE are not supported by the Transactional Failover mechanism you refer to. Plus it doesn’t support Temp objects or Package variables. So there’s not much for PostgreSQL to worry about – its the feature itself that’s Transparent.
Note that Application Continuity in Oracle 12.2 now goes a long way beyond traditional TAF and can support read-write transactions (with certain caveats).
It’s all in the caveats
Good Catch on the transaction failover stuff, Simon. True, Oracle does not provide automatic failover for in flight DML statements (INSERT, UPDATE, DELETE), but it does for SELECT statements.
Re various failover managers: Could you please position repmgr in relation to Patroni? (This might also be worth a separate blog post.) Thx.
Will write an article about that 🙂
My general advice : DON’T USE RAC. It is vastly overused. I asked several times to RAC users : “Do you evaluate the gain of availability obtained with RAC ?” I rarely obtained answers.
From my experience, RAC can give a global GAIN in terms of planned downtime but this gain could mostly have been obtained with Dataguard only.
Unless you are very lucky and/or very good and/or very rich it will gives you a global LOSS in terms of unplanned downtime
Last time I discussed with a RAC specialist, I was very skeptical about the choice of our common customer to use RAC.
RAC specialist gave me the usual answer “You are not confident because the previous integrators were bad but with me it will be different. I asked the customer to buy new storage, I will follow the best practices and things will run”.
The previous integrators were platinum partners of Oracle and as I expected the specialist did nothing better than them. We had more incidents causing downtime with the RAC database than with the single instances databases left on the old AIX server.
The reasons are easy to understand but often ignored. A single Oracle Database instance is not simple but RAC completely breaks the KISS principle, it is a complex thing to deploy and to administer. You can have as many nodes as you want, Grid Infrastructure itself will remain a logical single point of failure. Every software has bugs, especially if it is complex and Grid infrastructure is no exception.
It does not even fully protect you from hardware issues. RAC is good to detect and protect from complete hardware failures but you are more exposed to intermittent failures and performance loss than can end in a global freeze.
If you build your own RAC, it introduces potential issues you would have never noticed. For example, performance issue on cluster network can slow down the entire thing to a point it’s not practically usable anymore. Several times I just had to shut down one node to temporarily solve a performance issue on the RAC of inexperienced customers (hint : gc* wait events were on top of the performance reports but a ping would have been enough to understand things could not go well…)
Paradoxically, it means you need to have a perfect storage, a perfect network etc. to reach high availability with RAC. But in fact if everything is perfect… the only SPOF left will be RAC software itself in the end.
About scalability, it’s the same thing. Oracle RAC can be good..if you are rich. Don’t expect to aggregate your previously bought cheap hardware (servers, network, disks) and to scale as you want. RAC can run great on Oracle Cloud or at least Oracle Hardware but I noticed customers are more and more reluctant to be fully dependent from Oracle. Oracle convinced the market cause it was more “open” than proprietary IBM mainframes after all.
Many workloads can’t even parallelized that easily on a single instance and in this case RAC won’t help at all. Even if workload can nicely be parallelized, RAC global cache is obtained through network, hence you need a VERY good network.
Also think RAC is a “share everything” thing in terms of storage, you can have multiple instances but logically you only have ONE database. Write scalability is not as great as read scalability with this approach.
It means you can scale…if you are rich. But if you are so rich you could probably have paid for a much simpler solution based on a more powerful server with a much bigger RAM, a better storage etc. from the beginning.
I agreed with Phill’s comment, when I had the opportunity to manage environments with Oracle RAC support was complex and determine the performance issues always complicated, even, the configuration is complex and many errors can appear in between. Dataguard is so far more sociable!