The process that created pglogical
pglogical (logical replication for PostgreSQL) is the latest in the series of awesome products developed and supported by 2ndQuadrant. One of the key ingredients in making any product great is the process followed in developing it. We have tried to raise our game with pglogical, so let me describe some of the measures we have taken to ensure reliability.
Version control
Like all our other PostgreSQL tools, pglogical is hosted on 2ndQuadrant’s private GitHub account. Hosted version control such as GitHub not only supports team coordination, allowing multiple people to work collaboratively on the same project, but also lets us maintain multiple branches and tags. This is essential for supporting and patching production releases, because hotfixes can be sent out that are specific to each release. pglogical is new, but this ability will have a tremendous impact in the long term. In fact, we are already working on the 1.0 branch to send out a patch release with bug fixes.
Change tracking
Using a repository is a given for version control, but it needs to be combined with change tracking in order to plan releases and keep your sanity while sticking to the plan! GitHub evidently realizes this, and provides a dedicated issue tracker alongside its web-based version control. For pglogical, we took the following steps to plan out the release:
1) Break up TODOs into granular items logged as individual tasks
2) Create release buckets as per priority
3) Shift the granular tasks to the appropriate release bucket
4) Assign the granular tasks to the developers
The same tools served to track the changes all the way through to patch review, commit, testing, & closure.
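As a rough illustration of the four planning steps above, here is a minimal Python sketch. It is not how the issue tracker works internally; the `Task` class, the `plan_release` helper, the task titles, the release names "1.0.1" and "1.1", and the round-robin assignment are all hypothetical, chosen only to show the idea of sorting granular tasks into priority-based release buckets.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Task:
    title: str
    priority: int       # hypothetical scale: 1 = most urgent
    assignee: str = ""  # filled in when the task is assigned (step 4)


def plan_release(tasks, bucket_for_priority, developers):
    """Place each task in a release bucket and assign it round-robin."""
    buckets = defaultdict(list)
    # Most urgent tasks first; sorted() is stable, so ties keep input order.
    for i, task in enumerate(sorted(tasks, key=lambda t: t.priority)):
        task.assignee = developers[i % len(developers)]
        buckets[bucket_for_priority[task.priority]].append(task)
    return dict(buckets)


# Step 1: granular tasks (titles are made up for this example).
tasks = [
    Task("Write install guide", 2),
    Task("Fix apply worker crash", 1),
    Task("Add DEB packaging", 1),
]

# Steps 2-4: buckets per priority, then shift and assign the tasks.
plan = plan_release(tasks, {1: "1.0.1", 2: "1.1"}, ["dev_a", "dev_b"])

for release, items in plan.items():
    for task in items:
        print(release, task.title, task.assignee)
```

In practice the tracker does this bookkeeping for you; the sketch only makes explicit that each task ends up with exactly one release bucket and one owner.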
Continuous integration
For any piece of software to work, all its modules need to come together and integrate like a well-oiled machine. For products like pglogical, we need to ensure that the entire build process is automated so that binaries are reproducible, reliable, and free of nasty surprises. Through a continuous integration process, pglogical has been made available as RPMs for the Red Hat family of Linux distributions and as DEBs for Debian-based distributions, installable with the standard yum and apt-get tools. These packages are available from 2ndQuadrant’s own repository. The beauty of making CI a standard process is the ability to create the exact same binaries on demand at any point in the future.
Documentation
pglogical comes with extensive documentation, including installation guides, READMEs, and FAQs. We are also working on setup guides so that our users need minimal outside help to set up logical replication for their PostgreSQL servers.
Testing, testing, testing … and then some more testing
I believe testing is perhaps the most underrated aspect of open source software development. Mind you, most open source developers are absolutely brilliant and their code is impeccable. They are, however, human beings, and that always leaves the possibility of some oversight somewhere. We incorporated heavy-duty testing at every step of the development process. Starting with unit testing by the developers, we had dedicated staff verify the functionality of every granular task and automate those tests in a centralized automated testing suite. This suite will serve to catch regressions for a long time to come.
Beyond the software itself, we also extensively tested our installation process along with the packaged RPMs and DEBs. We take pride in the software we produce, and that makes user experience very important to us; it has to be smooth and satisfying.
Conclusion
Overall, we probably didn’t get it perfect, but we are striving to continuously improve the processes around the development of our products and so make them more reliable than ever before. If you have any feedback to help us improve, I would love to hear from you!
pglogical is awesome. We are testing it with some high-volume data. Could you please share some knowledge about conflict_resolution? Since the database is big, we don’t want to rebuild the slave database when only one small table has a conflict.
During our testing, we set pglogical.conflict_resolution='error'. When a conflict occurs, logical replication stops (which is what we want). However, even if we remove the table from the replication set and add it back, the slave still has the problem and replication no longer works.
There is no documentation on the website about how to resolve conflicts. Could you please share the correct steps to fix a conflict? Thanks.
James, thank you for your interest and compliments. We can’t answer every question raised, though 2ndQuadrant’s training and consulting can provide solutions for you. If you are interested, please write to us at [email protected] and one of my associates will surely get back to you.
Hi, I got it to work, but had these problems:
1. Had to make the subscriber a master, not a slave, or else I got errors indicating I could not create tables.
2. The default max_worker_processes had to be increased to 15, from 10, in my case.
3. I could not get it to work if I prepopulated the tables on the subscriber node. I had to create the database on the subscriber and let pglogical create and populate the tables.
Michael, thank you for your interest in pglogical. We can’t answer every question raised, though 2ndQuadrant’s training and consulting can provide solutions for you. If you are interested, please write to us at [email protected] and one of my associates will surely get back to you.