Postgres has a lot of built-in information functions that most people don’t know about. Some of these are critical components to identifying replication lag, but what if we could use them for something else, like throughput?
This man’s one simple trick can track actual database throughput; DBAs hate him!
Let’s take a look at a common query we might use to track replication lag between a Postgres 11 Primary and one or more Replica nodes.
SELECT client_addr, pg_wal_lsn_diff( pg_current_wal_lsn(), sent_lsn ) AS sent_lag, pg_wal_lsn_diff( pg_current_wal_lsn(), write_lsn ) AS write_lag, pg_wal_lsn_diff( pg_current_wal_lsn(), flush_lsn ) AS flush_lag, pg_wal_lsn_diff( pg_current_wal_lsn(), replay_lsn ) AS replay_lag FROM pg_stat_replication;
This gives us the number of bytes the Primary has transmitted to remote servers, given a number of different metrics. How much have we sent? Has that data been written to disk? How much has been synced and is thus safe from crashes? How much has actually been replayed on the Postgres data files? For more information on the
pg_stat_replication view, check out the documentation.
If we were using slots instead, we might do this:
SELECT slot_name, pg_wal_lsn_diff( pg_current_wal_lsn(), restart_lsn ) as restart_lag, pg_wal_lsn_diff( pg_current_wal_lsn(), confirmed_flush_lsn ) as flush_lag FROM pg_replication_slots;
There is somewhat less information in
pg_replication_slots, as it only really tells us the minimum LSN position the Primary needs to retain for that replica. Of course, if that stops advancing while the Primary keeps working, that’s lag we can see even while the replica is disconnected.
This is especially important with replication slots, since they necessarily mean the Primary is retaining WAL files a replica may need. We don’t want the replay lag value to get too high, or the primary node could run out of disk space. That is definitely something we want to monitor.
That’s How it Goes
It’s also a hint as to our database throughput. If we "pin" the value by disconnecting a replica, the value will continue to increase at the same rate the Primary node produces WAL traffic.
But wait a minute; what good is database throughput? It must be pretty useful, as developers have included several indicators in the
pg_stat_database view. That view will tell us how many inserts, updates, and deletes a database may have handled, as well as commit versus rollback activity.
Despite all that (and much more), this statistical view won’t tell us anything about data volume, plus it isn’t persistent, and replicas maintain their own stats independently. What if there was a way to determine how active our Primary database is forever, from any node, any time we can take two readings separated by time?
Ah Give or Take a Night or Two
Consider a brand new fresh instance that has just been bootstrapped with
initdb. It has processed essentially no transactions, contains no tables, and isn’t currently doing anything. Had we known we’d want to track overall instance activity, we may have called
pg_current_wal_lsn() to retrieve the starting LSN.
SELECT pg_current_wal_lsn(); pg_current_wal_lsn -------------------- 0/1652EC0
In theory, we could use this value on any future call to
pg_wal_lsn_diff to obtain the amount of bytes produced by the database instance since its inception. Unfortunately, it’s hardly ever the case we have such an opportunity unless we created the Postgres instance ourselves.
But what is really the difference between
0/1652EC0, and the absolute zero value of
SELECT pg_wal_lsn_diff('0/1652EC0', '0/0'); pg_wal_lsn_diff ----------------- 23408320
That’s not even 24MB, which is less than two 16MB WAL files. In that case, we can safely substitute
0/0 and suddenly any execution of
pg_current_wal_lsn is trivially transformed into an absolute amount of data produced by our Postgres instance.
And a long stem rose
Let’s initialize a medium 100-scale
pgbench instance and check the amount of bytes reported at the end.
SELECT a.total_bytes, (a.total_bytes/1024/1024)::BIGINT AS mbytes FROM pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0') a (total_bytes); total_bytes | mbytes -------------+-------- 1328940592 | 1267
So Postgres has handled about 1267 MB of data since we started. How about after processing a one-minute pgbench test?
total_bytes | mbytes -------------+-------- 1422082032 | 1356
What if we add another index to the
CREATE INDEX idx_account_balance ON pgbench_accounts (abalance); total_bytes | mbytes -------------+-------- 1626342784 | 1551
We included that last example to illustrate that this really measures WAL traffic throughput, thus anything that produces WAL traffic will drive the value higher. It’s not just writes to the instance, but anything that modifies its contents. Even read-only transactions will slowly drive this value forward, as these require various database resources including transaction identifiers.
Is there a better cumulative indicator for how much real actual work a Postgres instance is performing?
Before it Blows
Real work is the key to this metric. We could send the query to any number of monitoring tools, and so long as they can compare previous values across time, we now have a total activity chart for our Postgres instance.
Did something perform a large bulk insert? Is client activity producing a steady stream of data and transaction traffic that’s threatening to overwhelm our hardware? Do we need to reevaluate resource allocation to prevent such throughput spikes from negatively impacting overall performance?
These are all questions the knowledge of Postgres throughput can provide. If the graphs show an upward trend, we can even plan for future expansions. It’s a cool piece of data that few, if any, existing monitoring scripts such as
check_postgres bother to integrate.
Being proactive is always preferable to the alternative, so adding yet another query to our monitoring toolkit is always a step forward. Don’t forget to watch those charts.
And everybody knows that it’s now or never.