During the Postgres Open 2017 conference in San Francisco, someone came to the 2ndQuadrant booth and struck up a conversation with me. During our shameless geeking out over database mechanics, he asked me if pglogical supported the new Postgres 10 partitions. Given my noted expertise in all things Postgres, I answered in the appropriate manner:
"I have no idea. I’ll have to look into that."
Well, after a bit of experimentation, I have a more concrete answer, and it’s reassuringly positive.
Given a table on a provider node, is it possible to capture only INSERT traffic such that it accumulates on a subscribed system for archival purposes? It’s a fairly common tactic, and allows an active OLTP system to regularly purge old data, while a reporting OLAP system keeps it available in posterity.
To get this experiment going, it’s necessary to begin with a regular table that might fit this model.
CREATE TABLE sensor_log (
  id            SERIAL PRIMARY KEY NOT NULL,
  location      VARCHAR NOT NULL,
  reading       BIGINT NOT NULL,
  reading_date  TIMESTAMP NOT NULL
);

INSERT INTO sensor_log (location, reading, reading_date)
SELECT s.id % 1000, round(random() * 100),
       CURRENT_DATE + INTERVAL '1d' - ((s.id * 10)::TEXT || 's')::INTERVAL
  FROM generate_series(1, 1000000) s(id);

CREATE EXTENSION pglogical;

SELECT pglogical.create_node(
    node_name := 'prod_sensors',
    dsn := 'host=localhost port=5434 dbname=phriday'
);

SELECT pglogical.create_replication_set(
    set_name := 'logging',
    replicate_insert := TRUE,
    replicate_update := FALSE,
    replicate_delete := FALSE,
    replicate_truncate := FALSE
);

SELECT pglogical.replication_set_add_table(
    set_name := 'logging',
    relation := 'sensor_log',
    synchronize_data := TRUE
);
There’s nothing really surprising here. We create the table, install pglogical, and register the node itself. Next, we create a replication set that captures only INSERT activity. Why just inserts? It’s probably safe to also include UPDATE actions, but for the sake of this demonstration, we have a write-only ledger-style table.
After creating the new replication set, we just need to add any table(s) that fit that insert model. While pglogical provides a
default_insert_only replication set that does this for us, we find it’s generally better to be explicit to avoid any unintended (and unexpected) magic.
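Being explicit also means we can verify our work. As a quick sanity check, pglogical stores its configuration in ordinary catalog tables, so something like the following should confirm the set only captures inserts (a sketch; exact catalog columns may vary between pglogical versions):

-- Confirm the 'logging' set replicates INSERT and nothing else
SELECT set_name, replicate_insert, replicate_update,
       replicate_delete, replicate_truncate
  FROM pglogical.replication_set
 WHERE set_name = 'logging';

-- Confirm sensor_log is a member of the set
SELECT set_reloid
  FROM pglogical.replication_set_table;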
Proof of Concept
With the provider properly configured, all that remains is to set up the subscriber node. This is very similar to setting up the provider node: create table, install pglogical, create subscription. We can do that now:
CREATE TABLE sensor_log (
  id            INT PRIMARY KEY NOT NULL,
  location      VARCHAR NOT NULL,
  reading       BIGINT NOT NULL,
  reading_date  TIMESTAMP NOT NULL
);

CREATE EXTENSION pglogical;

SELECT pglogical.create_node(
    node_name := 'sensor_warehouse',
    dsn := 'host=localhost port=5435 dbname=phriday'
);

SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday'
);

SELECT pg_sleep(10);

SELECT COUNT(*) FROM sensor_log;

  count
---------
 1000000
Once again, we err on the side of caution and do a couple of things manually that may not strictly be necessary. By that, we mean creating the sensor_log table on the subscriber node ourselves. The create_subscription function has a synchronize_structure parameter that skips the table-creation step. On the other hand, it uses pg_dump to obtain the table structure DDL, so the import may fail if the recipient database isn't empty. We can skip that whole dance by not using the parameter at all.
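For reference, the automated variant would look something like this; a sketch only, and one that assumes an empty subscriber database so the pg_dump import has nothing to collide with:

SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday',
    synchronize_structure := TRUE
);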
Once we’ve verified the one-million sample rows have transferred, our job is done, right?
Well, almost. There’s still time to be fancy. While we have proven it’s possible to capture only inserted data, Postgres 10 table partitions are still an unknown quantity in this relationship. It turns out, their implementation under the hood is an extremely relevant detail.
To see just how, we need to tear down the subscription and drop the recipient table on the subscriber:
SELECT pglogical.drop_subscription(
    subscription_name := 'wh_sensor_data'
);

DROP TABLE sensor_log;
Don’t worry, our
sensor_log table will be back, and better than ever.
Down the Rabbit Hole
We only inserted one million rows into the provider node's copy of
sensor_log. As it turns out, the dates we generated don't even leave 2017. That's fine though, because with Postgres 10 partitions, even a single partition is sufficient to demonstrate the process.
Let’s start with a single table partitioned by the reading_date column:
CREATE TABLE sensor_log (
  id            SERIAL,
  location      VARCHAR NOT NULL,
  reading       BIGINT NOT NULL,
  reading_date  TIMESTAMP NOT NULL
) PARTITION BY RANGE (reading_date);

CREATE TABLE sensor_log_part_2017
PARTITION OF sensor_log
FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');

CREATE UNIQUE INDEX udx_sensor_log_2017_sensor_log_id
    ON sensor_log_part_2017 (id);

CREATE INDEX idx_sensor_log_2017_location
    ON sensor_log_part_2017 (location);

CREATE INDEX idx_sensor_log_2017_date
    ON sensor_log_part_2017 (reading_date);
We lament the inability to use the LIKE syntax to copy any placeholder indexes on the root table, but maybe that’ll show up in Postgres 11 or 12. Regardless, we now have one partitioned table backed by a single partition.
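Before wiring up replication, we can verify that tuple routing actually works by hand. This throwaway transaction (rolled back so the table stays empty for the real test) shows a row inserted through the root table landing in the 2017 partition:

BEGIN;

-- Insert against the root; Postgres should route it to sensor_log_part_2017
INSERT INTO sensor_log (location, reading, reading_date)
VALUES ('100', 50, '2017-06-01');

-- The row is visible through the partition directly
SELECT COUNT(*) FROM sensor_log_part_2017;

-- Keep the table empty for the replication experiment
ROLLBACK;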
Staring into the Abyss
This is where the fun starts! All we need to do is recreate the subscription, and the
sensor_log table data should be redirected into the 2017 partition, thus proving pglogical works with Postgres 10 partitions.
Let’s try it out:
SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday'
);

SELECT pg_sleep(10);

SELECT COUNT(*) FROM sensor_log;

 count
-------
     0

SELECT pglogical.drop_subscription(
    subscription_name := 'wh_sensor_data'
);
Wait, what’s going on here? Why isn’t the table being copied at all? Let’s see what the logs have to say…
2017-09-18 14:36:03.065 CDT  LOG: starting receiver for subscription wh_sensor_data
2017-09-18 14:36:03.111 CDT  ERROR: pglogical target relation "public.sensor_log" is not a table
The Abyss Stares Back
It just so happens that Postgres partitioned tables aren’t actually tables. They’re more of a table-like structure that allow certain database operations to target the underlying partitions. We can even see this for ourselves by checking out the
pg_class system catalog table:
SELECT relname, relkind
  FROM pg_class
 WHERE relname LIKE 'sensor_log%';

       relname        | relkind
----------------------+---------
 sensor_log           | p
 sensor_log_id_seq    | S
 sensor_log_part_2017 | r
The relkind column tells us which type of object we're looking at. Normal tables in Postgres are marked 'r' for relation. The
sensor_log table on the subscriber, however, shows 'p' for partitioned table. That distinction matters, because only relations can store data. When pglogical sees that the partitioned table isn't a relation, it refuses to continue.
Pglogical’s decision to refuse to insert into
sensor_log isn’t unique. Had we attempted this experiment with Postgres 10’s new PUBLICATION / SUBSCRIPTION logical replication system, we would get the same result. Not even Postgres 10’s built-in logical replication is compatible with partitioned tables; they’re just too new.
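For the curious, the equivalent native setup in Postgres 10 would look roughly like this, and it fails the same way when the subscriber's sensor_log is a partitioned table (names here are illustrative):

-- On the provider: publish only INSERT traffic
CREATE PUBLICATION sensor_pub
    FOR TABLE sensor_log
   WITH (publish = 'insert');

-- On the subscriber: works for a plain table,
-- but errors out when sensor_log is partitioned
CREATE SUBSCRIPTION sensor_sub
    CONNECTION 'host=localhost port=5434 dbname=phriday'
    PUBLICATION sensor_pub;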
A Way Out
Despite implementation details causing a bit of a non-intuitive roadblock, there's a way around this: we cheat. Unlike Postgres 10's built-in logical replication, pglogical exposes advanced API hooks, one of which leverages the Postgres Server Programming Interface (SPI).
The default behavior of logical decoding is to try and match the Postgres internal objects to prevent structural incompatibilities. As such, it matters that
sensor_log isn’t a relation; it’s ultimately ephemeral, and can’t store the same data.
But what if pglogical could convert the logical decoding into literal INSERT statements instead? Well, the pglogical documentation tells us we can do that by setting these parameters in the subscriber's postgresql.conf:

pglogical.conflict_resolution = false
pglogical.use_spi = true
The first disables conflict resolution. We don’t really need that on the subscriber, since it’s simply receiving a stream of inserts. Then we enable the SPI process which converts the logical decoding directly into actual INSERT statements.
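Since these are ordinary configuration parameters, we can confirm they took effect on the subscriber after a restart or reload (whichever the parameter requires):

SHOW pglogical.conflict_resolution;
SHOW pglogical.use_spi;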
Back to Reality
If we try the subscription again, we should see our expected result:
SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday'
);

SELECT pg_sleep(10);

SELECT COUNT(*) FROM sensor_log;

  count
---------
 1000000

SELECT COUNT(*) FROM sensor_log_part_2017;

  count
---------
 1000000
So not only has the partition system worked, we didn’t need any triggers as with previous attempts to implement this model. Postgres 10 worked as advertised, and it’s still a beta build at the time of this writing.
While it’s unfortunate we had to jump through a couple of odd hoops to reach our intended destination, we still arrived intact. What’s more, we can see that though Postgres 10 does offer internal logical replication, it’s still an evolving feature that isn’t quite complete yet.
Postgres 11, 12, and future versions will slowly fill those cracks as patches are incorporated. In the meantime, pglogical will continue to leverage the EXTENSION system to add advanced features that Postgres core isn’t quite ready to absorb. And indeed, redirecting logical replication into a partition is a somewhat advanced use case.
I’ve always loved the Postgres EXTENSION system; augmenting Postgres functionality with things like pglogical ensures that even difficult edge cases often have a workable solution.