Near-Zero Downtime Automated Upgrades of PostgreSQL Clusters in Cloud (Part I)
Last week, I was at Nordic PGDay 2018 and I had quite a few conversations about the tool that I wrote, namely pglupgrade, to automate PostgreSQL major version upgrades in a replication cluster setup. I was quite happy that it has been heard and some other people in different communities giving talks at meetups and other conferences about near-zero downtime upgrades using logical replication. Given that there is a talk that I gave at PGDAY’17 Russia, PGConf.EU 2017 in Warsaw and lastly at FOSDEM PGDay 2018 in Brussels, I thought it is better to create a blog post to keep this presentation available to the folks who could not make it to any of the conferences aforementioned. If you would like to directly go the talk and skip reading this blog post here is your link: Near-Zero Downtime Automated Upgrades of PostgreSQL Clusters in Cloud
The main motivation behind developing this tool was to propose a solution to minimize the downtime during major version upgrades which unfortunately affect almost everyone who uses PostgreSQL. Currently, we don’t have a tool that allows PostgreSQL users to upgrade their databases without downtime and it is clearly a pain point for many users, especially businesses. And, if we are to solve the upgrade problem we should think of more than one server (will be referred as a cluster from now on), simply because not many people use only one database server anymore. The most common scenario is having physical streaming replication setup either for high-availability purposes or scaling the read queries.
Database Upgrades
Before diving into the solution, let’s discuss a bit about how database upgrades work in general. There are four main possible approaches to database upgrades:
- The first approach would be for databases to keep their storage format same or at least compatible across versions. However, this is hard to guarantee long-term as new features might require changes in how data is stored or add more metadata information to work properly. Also, performance is often improved by optimizing the data structures.
- The second approach is to make a logical copy (dump) of the old server and loading it into the new server. This is the most traditional approach which requires the old server to not receive any updates during the process and results in prolonged downtimes of hours or even days on large databases.
- The third option is to convert data from old format to new one. This can either be done on the fly while the new system is running, but incurs performance penalty which is hard to predict as it depends on data access patterns, or it can be done offline while the servers are down, again incurring prolonged downtimes (although often shorter than the second method).
- The fourth method is to use logical dump for saving and restoring the database while capturing the changes happening in meantime and logically replicating them to the new database once the initial restore has finished. This method requires orchestration of several components but also decreases the amount of time the database cannot respond to queries.
Upgrade Methods in PostgreSQL
PostgreSQL upgrades can be matched with the four methods listed above. For example, PostgreSQL minor releases, which do not contain new features but only fixes, do not change the existing data format and fits the first approach. For the second approach, PostgreSQL provides tools called pg_dump and pg_restore which do the logical backup and restore. There is also pg_upgrade tool (it was a contrib module and moved to the main tree in PostgreSQL 9.5) which does offline (the servers are not running) conversion of the old data directory to the new one, which can fit into the third option. For the fourth approach, there are third party trigger-based solutions such as Slony for upgrading, but they have some caveats which we will cover later.
Traditional methods of upgrade can only upgrade the primary server while the replica servers have to be rebuilt afterward (offline conversion). This leads to additional problems with both cluster availability and capacity, hence effectively increasing the perceived downtime of the database from the point of both applications and users. Logical replication allows replicating while the system is up-and-running and testing effort can be handled in the meantime (online conversion).
There are different upgrade methods applicable for different environments. For example, pg_dump allows upgrade from very old versions (i.e 7.0) to recent releases, so if you are running an antique version of PostgreSQL, the best method is using pg_dump/restore (and better to do it now!). If your PostgreSQL version is 8.4 and later you can use pg_upgrade. If you happen to be using PostgreSQL 9.4 and later this means you have in-core “Logical Decoding” and you can use the pglogical extension as means of upgrade. Finally, if you are using PostgreSQL 10, you will have a chance to upgrade your database to PostgreSQL 11 and later (once they are available) by using in-core “Logical Replication” (yay!).
Logical Replication for Upgrades
Where is the idea of upgrading PostgreSQL by using logical replication coming from?Clearly, pglogical does not claim the upgrade purpose openly like pg_upgrade does (this was one argument against pglogical at the conference party), but this does not mean that we cannot use the tool for upgrades. In fact, when pglogical was designed, near-zero downtime upgrades was considered as one of the expected use-cases by the developers of the extension.
Unlike physical replication, which captures changes to the raw data on disk, the logical replication captures the logical changes to the individual records in the database and replicates those. The logical records work across major releases, hence logical replication can be used to upgrade from one release to another. The idea of using logical replication as means of version upgrade is far from being new, triggered-based replication tools have been doing upgrades in a logical way by capturing and sending the changes via triggers (because triggers can work regardless of database versions, too!).
Note: You can check my Evolution of Fault Tolerance in PostgreSQL presentation about how replication works in PostgreSQL.
If you can use trigger-based replication solutions for minimum downtime upgrades why should you prefer logical replication instead? In a trigger-based mechanism, all the changes are captured by using triggers and written into queue tables. This procedure doubles the write operations, doubles log files, and slows down the system since all the changes have to be written twice; resulting in more disk I/O and load on the source server. In contrast, in-core logical replication (same goes for pglogical extension) is built on top of logical decoding. Logical decoding is a mechanism that extracts information from WAL files into logical changes (INSERT/UPDATE/DELETE). Since the data from WAL mechanism is used by decoding the transaction logs, there is no write amplification as in the case of trigger-based replication solutions, hence this method performs better. As a result, logical decoding based solutions simply have a lower impact on the system than trigger-based solutions.
Conclusion
In this introductory post, I wanted to give an idea why we should consider using logical replication to achieve minimal downtime during major version upgrades. We will continue with the details of the architecture and implementation of the tool in the upcoming post.
You mentioned pg_upgrade as a contrib module, but it was moved from /contrib to the main tree in Postgres 9.5.
Thanks, Bruce. I’ve updated the post accordingly.
Would you share pglupgrade ansible roles on github or somephing else?
Why pglogical lose his state after run pg_upgrade and install postgresql-10 over postgresql-9.5 ?
Hi Eugene,
We are using the tool in 2ndQuadrant and unfortunately, I can’t provide the code to the public yet. That’s why I am writing how it works so that you can build your own tooling. You can find more details in the second part of the series which will be published this week.
For the second question, I did not quite get it, would you explain the problem better, please?
Thanks.
pg_upgrade does not preserve replication slots or replication origins, which are the position tracking facilities used to maintain replication state in pglogical and other logical replication tools. It wouldn’t make sense to preserve them because they track positions based on log sequence number (LSN), which is the replication log position. When pg_upgrade creates a new cluster it makes a new replication log that (re)starts at 0, rendering these positions meaningless.