pglogical 2.1 and Logical Replication in PostgreSQL 10
One of the headline features of the brand new PostgreSQL release out 2 months ago is Logical Replication. Logical replication allows more flexibility than physical replication, including replication between different major versions of PostgreSQL and selective-table replication. You can get more details on the feature here.
So, now that we have this, I’ve been asked on occasion if we are still going to continue develop pglogical and if it’s even needed. I was also asked a couple of times why we put the native logical replication into PostgreSQL when we already have pglogical. I’d like to answer those questions in this blog post.
Why Logical Replication in PostgreSQL 10?
Let’s start with the simpler, or rather shorter, topic of why we added logical replication into PostgreSQL 10.
The main reason is that more people can use it that way. Many people run in some kind of restricted environment, be it because they use PostgreSQL as a service from somebody else, or because their company has strict policies about what can be installed in production, or any other reason. In those cases one often can’t install extensions, so only built-in functionality is available. The more subtle side of this is that while pglogical is now a quite well-know project, there are still many who have no idea that the possibilities it provides even exist. With Logical Replication being a headline feature of PostgreSQL 10, a lot more people will now recognize the possibilities at their disposal when using PostgreSQL.
There are other reasons for this too. The more features we have around logical decoding and logical replication in PostgreSQL, the more they can be used for other purposes than just simple replication. For example, I can imagine a future where we use logical decoding to make table rewrites virtually lock free. It also makes it easier to develop additional infrastructure that’s needed for new logical replication features, since it’s much easier to prove that something is useful when it’s used directly by PostgreSQL.
What about pglogical then?
First, let me assure you that the development of pglogical continues. There are several exciting features in the pipeline for the next major version already. And that leads me to why we still have pglogical… The features!
There will be always features that PostgreSQL does not have yet, or only the latest version has, or may even never have – because the community decides that PostgreSQL itself does not really need them. So here is where pglogical comes in – where, since it’s a smaller project and is more centrally developed, we can add features faster, we can experiment more, and we can also maintain niche features that your company might need.
Let’s see what are the current difference between pglogical 2.1 and logical replication in PostgreSQL 10:
- Row filtering on provider
- pglogical allows provider to provide only certain rows by specifying
row_filter
parameter for thepglogical.replication_set_add_table
function. The row_filter is a normal PostgreSQL expression which has the same limitations on what’s allowed as the CHECK constraint. - Note that both logical replication in PostgreSQL 10 and pglogical can also filter the rows on subscriber using
REPLICA
triggers. - Column filtering
- With pglogical you can give a list of columns to replicate to the function
pglogical.replication_set_add_table
as an optional argumentcolumns
and only the listed columns will be sent out to subscribers. This allows you to replicate only some of the columns in a table. - Conflict detection and resolution
- PostgreSQL 10 can’t detect conflicts of data either coming from multiple sources or when replicated change conflicts with data that was changed locally. So an UPDATE is always applied and INSERT will always fail if there is existing row with same key.
- In contrast, pglogical detects these conflicts and has configurable resolution of these. You can pick if the remote row should be used or if the local one should stay and remote change should be discarded, you can also stop replication on the conflict (and fix the data manually), or let pglogical decide which row to pick based on the timestamp of the transaction that has made the change. pglogical can also convert the
INSERT
operation intoUPDATE
in case of conflicting key. - Version support (cross version replication, and on-line upgrades)
- While it will be possible to replicate from PostgreSQL 10 to PostgreSQL 11 using the logical replication, pglogical supports any version of PostgreSQL since 9.4 so for an existing installation, it can provide a way to do on-line upgrades or replication in heterogeneous environments.
- Sequence replication
- Somewhat related to the on-line upgrades is the support for sequence replication. In pglogical you can add sequences to replication just like you add tables. In PostgreSQL 10 there is no provision for replicating sequences.
- Delayed replication
- In pglogical you can use parameter
apply_delay
topglogical.create_subscription
to delay replication by the given interval. There is no similar functionality yet in built-in logical replication. - Postgres-XL as a subscriber
- Using pglogical you can replicate from regular PostgreSQL to Postgres-XL for example to leverage its MPP architecture for your analytic queries.
- Support for TRUNCATE
- The PostgreSQL 10 built-in logical replication can’t replicate
TRUNCATE
while pglogical can. - Table schema and DDL
- pglogical can copy initial schema of the database while when using built-in logical replication, it’s necessary to create the tables beforehand manually. Also, pglogical provides
pglogical.replicate_ddl_command
function to ensure that DDL statement is replicated to the subscribers at correct place in replication stream (so that the DDL does not break replication). - The built-in PosgreSQL logical replication does not have any provisions to help maintaining the database schema.
- Optional JSON output
- Lesser known feature of pglogical is that it can be also used for integration with rest of the ecosystem. This is achieved by providing optional JSON output for external consumers of he replication stream. The JSON output supports all the features of pglogical including selective replication, row filtering and column filtering.
There is also one thing current release of pglogical (2.1) can’t do and PostgreSQL 10 built-in logical replication can. Using built-in logical replication, it’s possible to replicate UPDATE
and DELETE
for tables which don’t have PRIMARY KEY
using REPLICA IDENTITY FULL
.
Depending on your use-case you can pick and choose between pglogical and in-core Logical Replication. Both are completely covered with 2ndQuadrant’s 24/7 Support and Remote DBA services.
Is the lack of ability to replicate tables without PRIMARY KEYS using REPLICA IDENTITY FULL something that is on the road map?
It’s in the PostgreSQL 10 logical rep, so back-porting it to pglogical should be feasible. It’s all a matter of priorities.
Are features from pglogical 2.1 will be applied to the next release of PostgreSQL 10.x?
Not 10.x those are minor releases, but eventually the features should trickle back to PostgreSQL over few major releases.
Hopefully, some if not all of the new features will be implemented on PostgreSQL 11.
Hi
is there a way to automatically synchronize without executing select pglogical.alter_subscription_resynchronize_table
When I insert data in the table source, I have to execute the select pglogical.alter_subscription_resynchronize_table statment. Otherwise data are not replicated
Thanks for your help
Then there must be something wrong with your replication set configuration, etc. Data should stream if the table is part of a replicated set. That’s the point of the tool.
Hello,
Is the pglogical code used for native PG10 replication, or is it a separate implementation ?
Thanks
It’s a separate implementation that started out based on parts of pglogical.
Is sequence replication will be added to PostgreSQL 11?
Thanks for the breadth of aspect covered. I have a situation where I need to use plogical to establish a filtered ( row based) logical replication between two posgresql instances. My requirement is that on the initial connection only filtered history should be transferred from source db to destination db. Will the “row filtering for provider” solve this ?