PG Phriday: pglogical and Postgres 10 Partitions
During the Postgres Open 2017 conference in San Francisco, someone came to the 2ndQuadrant booth and struck up a conversation with me. During our shameless geeking out over database mechanics, he asked me if pglogical supported the new Postgres 10 partitions. Given my noted expertise in all things Postgres, I answered in the appropriate manner:
"I have no idea. I’ll have to look into that."
Well, after a bit of experimentation, I have a more concrete answer, and it’s reassuringly positive.
The Problem
Given a table on a provider node, is it possible to capture only INSERT traffic such that it accumulates on a subscribed system for archival purposes? It’s a fairly common tactic, and allows an active OLTP system to regularly purge old data, while a reporting OLAP system keeps it available for posterity.
To get this experiment going, it’s necessary to begin with a regular table that might fit this model.
CREATE TABLE sensor_log (
  id           SERIAL PRIMARY KEY NOT NULL,
  location     VARCHAR NOT NULL,
  reading      BIGINT NOT NULL,
  reading_date TIMESTAMP NOT NULL
);

INSERT INTO sensor_log (location, reading, reading_date)
SELECT s.id % 1000, round(random() * 100),
       CURRENT_DATE + INTERVAL '1d' - ((s.id * 10)::TEXT || 's')::INTERVAL
  FROM generate_series(1, 1000000) s(id);

CREATE EXTENSION pglogical;

SELECT pglogical.create_node(
    node_name := 'prod_sensors',
    dsn := 'host=localhost port=5434 dbname=phriday'
);

SELECT pglogical.create_replication_set(
    set_name := 'logging',
    replicate_insert := TRUE, replicate_update := FALSE,
    replicate_delete := FALSE, replicate_truncate := FALSE
);

SELECT pglogical.replication_set_add_table(
    set_name := 'logging', relation := 'sensor_log',
    synchronize_data := TRUE
);
There’s nothing really surprising here. We create the table, install pglogical, and register the node itself. Next, we create a replication set that captures only INSERT activity. Why just inserts? It’s probably safe to also include UPDATE actions, but for the sake of this demonstration, we have a write-only ledger-style table.
After creating the new replication set, we just need to add any table(s) that fit the insert-only model. While pglogical provides a default_insert_only replication set that does this for us, we find it’s generally better to be explicit to avoid any unintended (and unexpected) magic.
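If we want to double-check that the set is configured as intended, pglogical records its replication sets in catalog tables within the extension’s own schema. A quick sanity check might look like this (a sketch; the exact catalog layout can vary between pglogical versions):

```sql
-- Confirm the 'logging' set captures inserts and nothing else.
SELECT set_name, replicate_insert, replicate_update,
       replicate_delete, replicate_truncate
  FROM pglogical.replication_set
 WHERE set_name = 'logging';
```

We’d expect a single row with replicate_insert set to true and the other three flags false.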
Proof of Concept
With the provider properly configured, all that remains is to set up the subscriber node. This is very similar to setting up the provider node: create table, install pglogical, create subscription. We can do that now:
CREATE TABLE sensor_log (
  id           INT PRIMARY KEY NOT NULL,
  location     VARCHAR NOT NULL,
  reading      BIGINT NOT NULL,
  reading_date TIMESTAMP NOT NULL
);

CREATE EXTENSION pglogical;

SELECT pglogical.create_node(
    node_name := 'sensor_warehouse',
    dsn := 'host=localhost port=5435 dbname=phriday'
);

SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday'
);
SELECT pg_sleep(10);
SELECT COUNT(*) FROM sensor_log;
count
---------
1000000
Once again, we err on the side of caution and do a couple of things manually that may not strictly be necessary. By that, we mean manually creating the sensor_log table on the subscriber node.
The create_subscription function has a parameter called synchronize_structure that would let us skip the table-creation step. On the other hand, it uses pg_dump to obtain the table structure DDL, so the import might fail if the recipient database isn’t empty. We can skip that whole dance by not using the parameter at all.
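For completeness, a subscription that lets pglogical build the schema itself might look like this (a sketch only; it assumes a pristine subscriber database, since the pg_dump import can fail otherwise):

```sql
-- Same subscription as before, but pglogical copies the DDL for us.
SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday',
    synchronize_structure := TRUE
);
```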
Once we’ve verified the one-million sample rows have transferred, our job is done, right?
Starting Over
Well, almost. There’s still time to be fancy. While we have proven it’s possible to capture only inserted data, Postgres 10 table partitions are still an unknown quantity in this relationship. It turns out, their implementation under the hood is an extremely relevant detail.
To see just how, we need to tear down the subscription and drop the recipient table on the subscriber:
SELECT pglogical.drop_subscription(
    subscription_name := 'wh_sensor_data'
);
DROP TABLE sensor_log;
Don’t worry, our sensor_log table will be back, and better than ever.
Down the Rabbit Hole
We only inserted one million rows into the provider node’s copy of sensor_log. As it turns out, the dates we generated never even leave 2017. That’s fine though, because with Postgres 10 partitions, even a single partition is sufficient to demonstrate the process.
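We can verify that claim on the provider before going any further (the exact values depend on when the sample rows were generated):

```sql
-- One million rows at 10-second intervals only spans about 116 days,
-- so everything falls comfortably inside a single yearly partition.
SELECT min(reading_date), max(reading_date)
  FROM sensor_log;
```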
Let’s start with a single table partitioned by the reading_date column:
CREATE TABLE sensor_log (
  id           SERIAL,
  location     VARCHAR NOT NULL,
  reading      BIGINT NOT NULL,
  reading_date TIMESTAMP NOT NULL
)
PARTITION BY RANGE (reading_date);

CREATE TABLE sensor_log_part_2017
PARTITION OF sensor_log
FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');
CREATE UNIQUE INDEX udx_sensor_log_2017_id ON sensor_log_part_2017 (id);
CREATE INDEX idx_sensor_log_2017_location ON sensor_log_part_2017 (location);
CREATE INDEX idx_sensor_log_2017_date ON sensor_log_part_2017 (reading_date);
We lament the inability to use the LIKE syntax to copy any placeholder indexes on the root table, but maybe that’ll show up in Postgres 11 or 12. Regardless, we now have one partitioned table backed by a single partition.
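Before wiring up replication, it’s worth confirming that Postgres routes rows into the partition on its own. Here’s a quick disposable test (rolled back so the table stays empty for the subscription):

```sql
BEGIN;

INSERT INTO sensor_log (location, reading, reading_date)
VALUES ('42', 12, '2017-06-01');

-- The partitioned root itself stores nothing; the row is
-- routed into the 2017 partition instead.
SELECT COUNT(*) FROM ONLY sensor_log;
SELECT COUNT(*) FROM sensor_log_part_2017;

ROLLBACK;
```

The first count should be 0 and the second 1, since only the leaf partition actually holds tuples.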
Staring into the Abyss
This is where the fun starts! All we need to do is recreate the subscription, and the sensor_log table data should be redirected into the 2017 partition, thus proving pglogical works with Postgres 10 partitions.
Let’s try it out:
SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday'
);
SELECT pg_sleep(10);
SELECT COUNT(*) FROM sensor_log;
count
-------
0
SELECT pglogical.drop_subscription(
    subscription_name := 'wh_sensor_data'
);
Wait, what’s going on here? Why isn’t the table being copied at all? Let’s see what the logs have to say…
2017-09-18 14:36:03.065 CDT [4196] LOG: starting receiver for subscription wh_sensor_data
2017-09-18 14:36:03.111 CDT [4196] ERROR: pglogical target relation "public.sensor_log" is not a table
Oh…
The Abyss Stares Back
It just so happens that Postgres partitioned tables aren’t actually tables. They’re more of a table-like structure that allows certain database operations to target the underlying partitions. We can even see this for ourselves by checking the pg_class system catalog table:
SELECT relname, relkind
FROM pg_class
WHERE relname LIKE 'sensor_log%';
relname | relkind
----------------------+---------
sensor_log | p
sensor_log_id_seq | S
sensor_log_part_2017 | r
The relkind column tells us which type of object we’re looking at. Normal tables in Postgres are marked ‘r’ for relation. The sensor_log table on the subscriber, however, shows ‘p’ for partitioned table. That actually matters, because only relations can store data. When pglogical sees that the partitioned table isn’t a relation, it refuses to continue.
Pglogical’s decision to refuse to insert into sensor_log isn’t unique. Had we attempted this experiment with Postgres 10’s new PUBLICATION / SUBSCRIPTION logical replication system, we would have gotten the same result. Not even Postgres 10’s built-in logical replication is compatible with partitioned tables; they’re just too new.
A Way Out
Despite implementation details causing a bit of a non-intuitive roadblock, there’s a way around this: we cheat. Unlike Postgres 10’s built-in logical replication, pglogical exposes advanced API hooks. One of those is the Postgres Server Programming Interface.
The default behavior of logical decoding is to try to match the Postgres internal objects to prevent structural incompatibilities. As such, it matters that sensor_log isn’t a relation; it’s ultimately ephemeral, and can’t store the same data.
But what if pglogical could convert the logical decoding into literal INSERT statements instead? Well, the pglogical documentation tells us we can do that by setting these parameters in postgresql.conf:
pglogical.conflict_resolution = false
pglogical.use_spi = true
The first disables conflict resolution; we don’t really need it on the subscriber, since it’s simply receiving a stream of inserts. The second enables the SPI mode, which converts the logical decoding stream directly into actual INSERT statements.
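After editing postgresql.conf and restarting, we can confirm on the subscriber that the settings took effect, since these GUCs are registered by the pglogical extension:

```sql
-- Both values should match what we placed in postgresql.conf.
SHOW pglogical.conflict_resolution;
SHOW pglogical.use_spi;
```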
Back to Reality
If we try the subscription again, we should see our expected result:
SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday'
);
SELECT pg_sleep(10);
SELECT COUNT(*) FROM sensor_log;
count
---------
1000000
SELECT COUNT(*) FROM sensor_log_part_2017;
count
---------
1000000
So not only did the partition system work, but we also didn’t need any of the triggers that previous attempts at this model required. Postgres 10 worked as advertised, and it’s still a beta build at the time of this writing.
While it’s unfortunate we had to jump through a couple of odd hoops to reach our intended destination, we still arrived intact. What’s more, we can see that though Postgres 10 does offer internal logical replication, it’s still an evolving feature that isn’t quite complete yet.
Postgres 11, 12, and future versions will slowly fill those cracks as patches are incorporated. In the meantime, pglogical will continue to leverage the EXTENSION system to add advanced features that Postgres core isn’t quite ready to absorb. And indeed, redirecting logical replication into a partition is a somewhat advanced use case.
I’ve always loved the Postgres EXTENSION system; augmenting Postgres functionality with things like pglogical ensures that even difficult edge cases often have a workable solution.
Shaun,
So, you showed workaround that makes pglogical extension work with partitioned tables in PG10 by modifying the following settings in postgresql.conf:
pglogical.conflict_resolution = false
pglogical.use_spi = true
Question: is there a way to make “internal/built-in logical replication” in PG10 to work with partitioned tables?
Regards,
Igor
Hi Igor,
I’m not aware of a method that works with built-in logical replication. Like I mentioned in the article, it’s relatively stripped-down as a first iteration of the functionality. It’s fairly likely Postgres 11 will have that functionality, however.
Hi Shaun,
Is there a workaround to handle the UPDATE case when replicating into partitioned tables on the subscriber node?
With many partitions, it may be difficult to locate which partition holds the desired tuple, since tables generally aren’t partitioned by their unique primary key.
Thanks for help.
Just a heads up to anyone trying this, with the official release version of PG10, if you follow these instructions exactly, you will see this error on the subscriber: ERROR: pglogical.use_spi can only be used when pglogical.conflict_resolution is set to ‘error’
Thus put the following in your postgresql.conf:
pglogical.conflict_resolution = error
pglogical.use_spi = true
While this works for normal inserts (INSERT INTO sensor_log VALUES (…)), if, on the master, you had a table called ‘sensor_log_backup’ that had some rows you wanted to insert into ‘sensor_log’, i.e.:
INSERT INTO sensor_log SELECT * from sensor_log_backup LIMIT 1000;
You will see the following on the subscriber:
ERROR: cross-database references are not implemented: “`.(null).?”
LOG: apply worker [22629] at slot 1 generation 3 exiting with error
LOG: worker process: pglogical apply 108586:1763399739 (PID 22629) exited with exit code 1
So my guess is that pglogical IS able to replay INSERTS, however, if the source of data for the inserts is not something that is also present on the subscriber, it still treats that as a foreign source. Meaning, it won’t translate my INSERT INTO…SELECT query into literal (INSERT INTO…VALUES()) statements.
Good catch, Arjun. That is an unfortunate side-effect. I’ll see if our pglogical devs can do something about that.
Hi Shaun,
When I started replication from non-partitioned tables (postgresql 9.5 with pglogical 2.2.0) to partitioned tables (postgresql 11 with pglogical 2.2.2), I faced a similar error to Arjun’s:
LOG: starting apply for subscription doc_doc01_007
ERROR: cross-database references are not implemented: “e.(null).”
LOG: apply worker [12596] at slot 1 generation 11 exiting with error
But there were no external tables in the queries.
A little investigation showed that this error arises when the apply worker starts ‘multi_insert’ mode.
So I patched pglogical_apply.c (just commented out the line “use_multi_insert = true;”) and replication started without errors.
Hope this information may help somebody.