2ndQuadrant is now part of EDB

Bringing together some of the world's top PostgreSQL experts.

2ndQuadrant | PostgreSQL
Mission Critical Databases
  • Contact us
  • EN
    • FR
    • IT
    • ES
    • DE
    • PT
  • Support & Services
  • Products
  • Downloads
    • Installers
      • Postgres Installer
      • 2UDA – Unified Data Analytics
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
  • Postgres Learning Center
    • Webinars
      • Upcoming Webinars
      • Webinar Library
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Blog
    • Training
      • Course Catalogue
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
    • Books
      • PostgreSQL 11 Administration Cookbook
      • PostgreSQL 10 Administration Cookbook
      • PostgreSQL High Availability Cookbook – 2nd Edition
      • PostgreSQL 9 Administration Cookbook – 3rd Edition
      • PostgreSQL Server Programming Cookbook – 2nd Edition
      • PostgreSQL 9 Cookbook – Chinese Edition
    • Videos
    • Events
    • PostgreSQL
      • PostgreSQL – History
      • Who uses PostgreSQL?
      • PostgreSQL FAQ
      • PostgreSQL vs MySQL
      • The Business Case for PostgreSQL
      • Security Information
      • Documentation
  • About Us
    • About 2ndQuadrant
    • 2ndQuadrant’s Passion for PostgreSQL
    • News
    • Careers
    • Team Profile
  • Blog
  • Menu Menu
You are here: Home1 / Blog2 / Gabriele's PlanetPostgreSQL3 / How monitoring of WAL archiving improves with PostgreSQL 9.4 and pg_st...
Gabriele Bartolini

How monitoring of WAL archiving improves with PostgreSQL 9.4 and pg_stat_archiver

January 23, 2015/1 Comment/in Gabriele's PlanetPostgreSQL /by Gabriele Bartolini

PostgreSQL 9.4 introduces a new statistic in the catalogue, called pg_stat_archiver.
Thanks to the SQL language it is now possible, in an instant, to check the state of the archiving process of transactional logs (WALs), crucial component of a PostgreSQL disaster recovery system.

PostgreSQL and Barman

Introduction and reasons

The need for the pg_stat_archiver view comes from the last few years of experience with Barman as a solution for disaster recovery of PostgreSQL databases in business continuity environments.

In particular, some of the needs and recurring questions that DBAs, system administrators, CTOs and CIOs repeatedly – and legimitately – share with us, are:

  • how much disk space will I need?
  • how can I keep the whole backup process, including continuous archiving, under control?

The starting point can only be a business requirement and it is defined through the concept of retention policy.
Usually, a company defines a disaster recovery plan within a business continuity plan, where it is clearly defined the period of retention of backup data. In the same documents we find both the recovery point objective (RPO) and the recovery time objective (RTO) definitions, the two key metrics that respectively measure the amount of data that a business can afford to lose and the maximum allowed time to recover from a disaster.
Monitoring and studying the past behaviour of a database are important elements for correctly sizing the storage requirements of a PostgreSQL backup solution.

For example, a company may decide to retain data of a PostgreSQL database for a month, in order to reconstruct the state of the database in a consistent manner at any point in time from the first available backup to the last available WAL file (through the rock solid Point-in-Time-Recovery technology of PostgreSQL, introduced 10 years ago).

The required size is given not only by the number of periodic full backups (for example, one a week), but also the number of WAL files stored in the Barman archive, each containing all the transactions that have been properly executed in the database, in a differential manner.

One of the needed metrics is therefore the number of archived WAL files per second, through which it is possible to estimate the number of WALs generated in a week and to predict disk usage requirements.

Before PostgreSQL 9.4, unless you:

  1. used tools for sampling and trending (e.g. Munin), or
  2. examined the timestamp of each WAL file, taking advantage of a high value of wal_keep_segments, or
  3. delegated this information to a custom script invoked by archive_command,

it was not possible to get any of the following pieces of information from the database server: number of archived WAL files, timestamp and name of the last archived WAL file, number of failures, timestamp and WAL name of the last failure, … and so on.

For this reason, last year I decided to write a small patch for Postgres which then became part of the core for version 9.4. This patch adds a real-time statistic in the PostgreSQL catalogue, called pg_stat_archiver.

Statistics overview

The pg_stat_archiver catalogue view in PostgreSQL 9.4 makes available to Barman users – and from a more general point of view to everyone using classic continuous backup with WAL file shipping – the following fields:

  • archived_count: number of WAL files successfully archived;
  • last_archived_wal: name of the last successfully archived WAL file;
  • last_archived_time: timestamp of the last successfully archived WAL file;
  • failed_count: number of failed WAL archiving attempts;
  • last_failed_wal: WAL name of the last archiving failure;
  • last_failed_time: timestamp of the last archiving failure;
  • stats_reset : timestamp of the last reset of statistics.

Here is an example taken from a local database server (with very low workload):

postgres=# SELECT * FROM pg_stat_archiver;
-[ RECORD 1 ]------+-----------------------------------------
archived_count     | 17
last_archived_wal  | 00000001000000000000000B.00000028.backup
last_archived_time | 2014-12-23 08:40:17.858291+01
failed_count       | 13
last_failed_wal    | 000000010000000000000001
last_failed_time   | 2014-12-04 13:09:07.348307+01
stats_reset        | 2014-12-03 16:52:21.755025+01

Basically, once continuous archiving is activated through archive_mode, PostgreSQL becomes responsible that, for each WAL file that is generated, the archive_command program is successfully executed, re-trying indefinitely in case of error (disk space permitting).

Unlike previous versions, PostgreSQL 9.4 is now able to collect some information regarding the two main events of the archiving process: success and failure. In particular, for both operations, the catalogue view reports:

  • count (since cluster initialisation or the last time statistics were reset);
  • WAL name of the last operation;
  • time of the last operation.

Moreover, you can reset the statistics of the archiver with the following SQL statement:

-- Requires superuser privileges
SELECT pg_stat_reset_shared('archiver');

Integration with Barman

From version 1.4 onwards, Barman automatically takes advantage of the pg_stat_archiver view for PostgreSQL 9.4 databases, transparently reporting information regarding the archiver process in some common informative commands such as status and show-server.

Furthermore, using the power of the SQL language, reliability of the check command has been improved so that an archiver problem is detected directly from the source.

Let’s walk through the query that has been introduced in Barman:

SELECT *,
    current_setting('archive_mode')::BOOLEAN
        AND (last_failed_wal IS NULL
            OR last_failed_wal <= last_archived_wal)
        AS is_archiving,
    CAST (archived_count AS NUMERIC)
        / EXTRACT (EPOCH FROM age(now(), stats_reset))
        AS current_archived_wals_per_second
FROM pg_stat_archiver

In addition to retrieving all columns of the pg_stat_archiver view, the query calculates two fields, directly from the source:

  • is_archiving: is the process of WAL archiving in progress or not?
  • current_archived_wals_per_second: frequency of archived WALs per second.

The is_archiving field must be TRUE, as the process of archiving is necessary for Barman to work correctly. Therefore, archive_mode must be active and the value of the last failed WAL must be either undefined (NULL) or not higher than the last properly archived WAL. This control is now part of the barman check command for Postgres 9.4 servers (and subsequent versions).

The second field, on the other hand, returns a very interesting statistic on the workload produced by the Postgres server. This metric allows operation managers to estimate the disk space that is required to store days, weeks and months of WAL files (even compressed), thus responding to one of the major initial questions.

How to check the operating status of archiving

Thanks to pg_stat_archiver, checking the status of WAL continuous archiving comes down to the execution of a single SQL query.

As a result, you can use the above query, already used by Barman, to verify that archiving is correctly working and integrate it in the probes and plugins used by your company’s alerting system.

Important Those using Barman with a PostgreSQL 9.4 server that have already integrated barman check within Nagios or Icinga will transparently enjoy this feature.

Conclusions

In its simplicity, the pg_stat_archiver view is a very important tool for those who consider disaster recovery a critical component – not a peripheral one – of a PostgreSQL database in a business continuity system.

Although PostgreSQL allows you to perform backups using streaming replication, continuous archiving of WAL files through shipping is a very safe and reliable fallback method (and still the only one so far supported by Barman). Therefore, proper monitoring of this component significantly increases the robustness of the entire Postgres database solution.

Finally, being able to access some statistics about the number of archived WAL files by a PostgreSQL server in a given period through a simple SQL query is an important step towards the analysis of transactional workload and the prediction of disk usage for a backup solution.

Tags: 9.4, Barman, Business Continuity, continuous archiving, continuous backup, disaster recovery, icinga, monitoring, nagios, pg_stat_archiver, pgbarman, pitr, point-in-time-recovery, postgres, PostgreSQL, retention policies, retention policy, RPO, RTO, wal, xlog
Share this entry
  • Share on Facebook
  • Share on Twitter
  • Share on WhatsApp
  • Share on LinkedIn
1 reply

Trackbacks & Pingbacks

  1. What’s new about Barman 1.4.0? | 2ndQuadrant says:
    March 9, 2015 at 10:31 am

    […] the new features  introduced by PostgreSQL 9.4 is the view pg_stat_archiver that provides useful information regarding the operating status of the WAL storage process. Thanks […]

    Reply

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Search

Get in touch with us!

Recent Posts

  • Random Data December 3, 2020
  • Webinar: COMMIT Without Fear – The Beauty of CAMO [Follow Up] November 13, 2020
  • Full-text search since PostgreSQL 8.3 November 5, 2020
  • Random numbers November 3, 2020
  • Webinar: Best Practices for Bulk Data Loading in PostgreSQL [Follow Up] November 2, 2020

Featured External Blogs

Tomas Vondra's Blog

Our Bloggers

  • Simon Riggs
  • Alvaro Herrera
  • Andrew Dunstan
  • Craig Ringer
  • Francesco Canovai
  • Gabriele Bartolini
  • Giulio Calacoci
  • Ian Barwick
  • Marco Nenciarini
  • Mark Wong
  • Pavan Deolasee
  • Petr Jelinek
  • Shaun Thomas
  • Tomas Vondra
  • Umair Shahid

PostgreSQL Cloud

2QLovesPG 2UDA 9.6 backup Barman BDR Business Continuity community conference database DBA development devops disaster recovery greenplum Hot Standby JSON JSONB logical replication monitoring OmniDB open source Orange performance PG12 pgbarman pglogical PG Phriday postgres Postgres-BDR postgres-xl PostgreSQL PostgreSQL 9.6 PostgreSQL10 PostgreSQL11 PostgreSQL 11 PostgreSQL 11 New Features postgresql repmgr Recovery replication security sql wal webinar webinars

Support & Services

24/7 Production Support

Developer Support

Remote DBA for PostgreSQL

PostgreSQL Database Monitoring

PostgreSQL Health Check

PostgreSQL Performance Tuning

Database Security Audit

Upgrade PostgreSQL

PostgreSQL Migration Assessment

Migrate from Oracle to PostgreSQL

Products

HA Postgres Clusters

Postgres-BDR®

2ndQPostgres

pglogical

repmgr

Barman

Postgres Cloud Manager

SQL Firewall

Postgres-XL

OmniDB

Postgres Installer

2UDA

Postgres Learning Center

Introducing Postgres

Blog

Webinars

Books

Videos

Training

Case Studies

Events

About Us

About 2ndQuadrant

What does 2ndQuadrant Mean?

News

Careers 

Team Profile

© 2ndQuadrant Ltd. All rights reserved. | Privacy Policy
  • Twitter
  • LinkedIn
  • Facebook
  • Youtube
  • Mail
The CHECK clause for updatable views Incremental backup with Barman 1.4.0
Scroll to top
×