2ndQuadrant is now part of EDB

Bringing together some of the world's top PostgreSQL experts.

2ndQuadrant | PostgreSQL
Mission Critical Databases
  • Contact us
  • EN
    • FR
    • IT
    • ES
    • DE
    • PT
  • Support & Services
  • Products
  • Downloads
    • Installers
      • Postgres Installer
      • 2UDA – Unified Data Analytics
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
  • Postgres Learning Center
    • Webinars
      • Upcoming Webinars
      • Webinar Library
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Blog
    • Training
      • Course Catalogue
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
    • Books
      • PostgreSQL 11 Administration Cookbook
      • PostgreSQL 10 Administration Cookbook
      • PostgreSQL High Availability Cookbook – 2nd Edition
      • PostgreSQL 9 Administration Cookbook – 3rd Edition
      • PostgreSQL Server Programming Cookbook – 2nd Edition
      • PostgreSQL 9 Cookbook – Chinese Edition
    • Videos
    • Events
    • PostgreSQL
      • PostgreSQL – History
      • Who uses PostgreSQL?
      • PostgreSQL FAQ
      • PostgreSQL vs MySQL
      • The Business Case for PostgreSQL
      • Security Information
      • Documentation
  • About Us
    • About 2ndQuadrant
    • 2ndQuadrant’s Passion for PostgreSQL
    • News
    • Careers
    • Team Profile
  • Blog
  • Menu Menu
You are here: Home1 / Blog2 / Gabriele's PlanetPostgreSQL3 / Backup manifests and pg_verifybackup in PostgreSQL 13
Backup manifests and pg_verifybackup in PostgreSQL 13
Gabriele Bartolini

Backup manifests and pg_verifybackup in PostgreSQL 13

May 4, 2020/0 Comments/in Gabriele's PlanetPostgreSQL /by Gabriele Bartolini

PostgreSQL 13 introduces two useful features to enhance automated validation of physical backups: backup manifests and a new tool called pg_verifybackup. In this short article I attempt to provide an overview of them.

An elephant walking. Photogravure after Eadweard Muybridge, Wellcome V0048775

As you probably know if you are familiar with PostgreSQL and the PostgreSQL Community, I have been promoting business continuity and disaster recovery awareness in organisations for many years, including the importance of taking backups as well as systematically testing and measuring the recovery process.

About 10 years ago, I also came up with the idea to write a backup tool for PostgreSQL databases called Barman, with my teammates at 2ndQuadrant. It is probably one of the most used backup tools in the PostgreSQL panorama.

However, one of the features that Barman still lacks is the possibility to verify the validity of a backup without having to restore it first – which, as I said, should be done systematically and automatically (given that Barman allows you to do it through recovery hook scripts).
In any case, PostgreSQL itself is taking an important step towards this direction. The next major release at the time of writing, PostgreSQL 13, expected for the end of 2020, will include two important new features:

  • backup manifests, listing the content taken by a physical backup using pg_basebackup
  • pg_verifybackup, a tool to verify that the content of a backup matches the given manifest

Backup manifests

When taking a full backup with pg_basebackup a backup manifest is automatically created, by default. If you are using the plain format (“-F p”), a file called backup_manifest will be included in the backup of the PGDATA. This applies also if you have tablespaces, which need to be properly remapped to be backed up. If you are using the tar format (“-F t”), the backup_manifest file will be placed in the destination directory, alongside each tar file produced by pg_basebackup (one for the PGDATA, one for the WALs, if requested, and one for each tablespace if applicable).

What is the content of the backup_manifest file? It is nothing but a JSON object, with the following keys:

  • PostgreSQL-Backup-Manifest-Version: the version of the manifest, currently 1
  • Files: the list of files included in the backup and, for each of them, the relative path from PGDATA and important metadata such as size, time of last modification and checksum
  • WAL-Ranges: information such as timeline, LSN of backup start and LSN of backup end
  • Manifest-Checksum: the checksum of the manifest file

pg_basebackup comes with several options to control the manifest, including disabling it. You can decide to encode filenames in the manifest, or to change the checksum algorithm (by default CRC32).

You can also decide to take a standalone backup (which includes all WAL files from the start to the end of the backup) or just the base backup as WALs are archived separately. The latter is important in a future integration with backup tools like Barman that have an independent management policy of WAL files in which they can be compressed, encrypted or archived in remote locations such as object stores in private/public clouds.

I tried taking just a base backup, without WALs with the following command:

pg_basebackup -X none -D ~/backups/$(date '+%s')

Note: For the sake of simplicity, I am using my home directory to store these examples of backups. Then rely on $(date '+%s') to create different directories per backup.

The content of the backed up directory is:

total 184
drwx------ 26 gabriele staff 832 May 2 01:12 .
drwx------ 3 gabriele staff 96 May 2 01:12 ..
-rw------- 1 gabriele staff 3 May 2 01:12 PG_VERSION
-rw------- 1 gabriele staff 227 May 2 01:12 backup_label
-rw------- 1 gabriele staff 135125 May 2 01:12 backup_manifest
drwx------ 5 gabriele staff 160 May 2 01:12 base
drwx------ 59 gabriele staff 1888 May 2 01:12 global
drwx------ 2 gabriele staff 64 May 2 01:12 pg_commit_ts
drwx------ 2 gabriele staff 64 May 2 01:12 pg_dynshmem
-rw------- 1 gabriele staff 4513 May 2 01:12 pg_hba.conf
-rw------- 1 gabriele staff 1636 May 2 01:12 pg_ident.conf
drwx------ 5 gabriele staff 160 May 2 01:12 pg_logical
drwx------ 4 gabriele staff 128 May 2 01:12 pg_multixact
drwx------ 2 gabriele staff 64 May 2 01:12 pg_notify
drwx------ 2 gabriele staff 64 May 2 01:12 pg_replslot
drwx------ 2 gabriele staff 64 May 2 01:12 pg_serial
drwx------ 2 gabriele staff 64 May 2 01:12 pg_snapshots
drwx------ 2 gabriele staff 64 May 2 01:12 pg_stat
drwx------ 2 gabriele staff 64 May 2 01:12 pg_stat_tmp
drwx------ 2 gabriele staff 64 May 2 01:12 pg_subtrans
drwx------ 2 gabriele staff 64 May 2 01:12 pg_tblspc
drwx------ 2 gabriele staff 64 May 2 01:12 pg_twophase
drwx------ 3 gabriele staff 96 May 2 01:12 pg_wal
drwx------ 3 gabriele staff 96 May 2 01:12 pg_xact
-rw------- 1 gabriele staff 88 May 2 01:12 postgresql.auto.conf
-rw------- 1 gabriele staff 28106 May 2 01:12 postgresql.conf

As you can see, there is a new file called backup_manifest containing the summary of the backup in JSON format:

{ "PostgreSQL-Backup-Manifest-Version": 1,
"Files": [
{ "Path": "backup_label", "Size": 227, "Last-Modified": "2020-05-01 23:12:15 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "14aa4bcb" },
{ "Path": "pg_multixact/members/0000", "Size": 8192, "Last-Modified": "2020-05-01 10:02:38 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "23464490" },
{ "Path": "pg_multixact/offsets/0000", "Size": 8192, "Last-Modified": "2020-05-01 10:05:40 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "23464490" },
{ "Path": "PG_VERSION", "Size": 3, "Last-Modified": "2020-05-01 10:02:38 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "b825884b" },
…
],
"WAL-Ranges": [
{ "Timeline": 1, "Start-LSN": "0/21000028", "End-LSN": "0/21000100" }
],
"Manifest-Checksum": "fae6b7aa9eaab0a29474c7281a533ec2154f0f9fb8fd1e14b879c31f22bd62eb"}

Verify the backup with pg_verifybackup

I now try to run pg_verifybackup with default options:

freddie:backups gabriele$ pg_verifybackup ~/backups/1588374735/
pg_waldump: fatal: could not find any WAL file
pg_verifybackup: error: WAL parsing failed for timeline 1

As you can see, pg_verifybackup expects to find also the WAL files from the start to the end of the backup, which I deliberately did not include in the backup. I can skip verification of the WAL files with the "-n” option:

freddie:backups gabriele$ pg_verifybackup -n ~/backups/1588374735/
backup successfully verified

I now take a standalone backup that includes all WAL files from start to end of the backup in the pg_wal directory inside PGDATA:

pg_basebackup -D ~/backups/$(date '+%s')

I want to simulate a corruption of a WAL file, so I open one of them and change a byte. Then run pg_verifybackup:

pg_waldump: fatal: error in WAL record at 0/2A0011D0: incorrect resource manager data checksum in record at 0/2A001218
pg_verifybackup: error: WAL parsing failed for timeline 1

As you can see, pg_verifybackup relies on pg_waldump to verify that the stream of WAL files on the same timeline is present in the backup.

What’s next …

I am extremely grateful to the fellow PostgreSQL developers that wrote support for backup manifests and pg_verifybackup. It is now time to support this in Barman.

Those that use Barman with pg_basebackup will automatically have a backup manifest in their backups. In the next versions of Barman, we need to make sure that rsync-generated backups for PostgreSQL 13 (and later) will produce 100% compatible manifests.

Then we need to integrate pg_verifybackup in Barman, with "-n" mode as WAL files are archived in a compressed way.

One idea could be to add a "fetch-wal" option to pg_verifybackup that invokes an external command, similar to restore_command and requests WAL files directly to "barman-wal-restore" or  to the "barman get-wal" command. Something to bring up for PostgreSQL 14.

Image: “An elephant walking. Photogravure after Eadweard Muybridge”, 1887.
Tags: backup, backup manifests, Barman, pg_basebackup, pg_verifybackup, pg13, pgbarman, postgres, PostgreSQL, postgresql 13, validation, verification
Share this entry
  • Share on Facebook
  • Share on Twitter
  • Share on WhatsApp
  • Share on LinkedIn
0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Search

Get in touch with us!

Recent Posts

  • Random Data December 3, 2020
  • Webinar: COMMIT Without Fear – The Beauty of CAMO [Follow Up] November 13, 2020
  • Full-text search since PostgreSQL 8.3 November 5, 2020
  • Random numbers November 3, 2020
  • Webinar: Best Practices for Bulk Data Loading in PostgreSQL [Follow Up] November 2, 2020

Featured External Blogs

Tomas Vondra's Blog

Our Bloggers

  • Simon Riggs
  • Alvaro Herrera
  • Andrew Dunstan
  • Craig Ringer
  • Francesco Canovai
  • Gabriele Bartolini
  • Giulio Calacoci
  • Ian Barwick
  • Marco Nenciarini
  • Mark Wong
  • Pavan Deolasee
  • Petr Jelinek
  • Shaun Thomas
  • Tomas Vondra
  • Umair Shahid

PostgreSQL Cloud

2QLovesPG 2UDA 9.6 backup Barman BDR Business Continuity community conference database DBA development devops disaster recovery greenplum Hot Standby JSON JSONB logical replication monitoring OmniDB open source Orange performance PG12 pgbarman pglogical PG Phriday postgres Postgres-BDR postgres-xl PostgreSQL PostgreSQL 9.6 PostgreSQL10 PostgreSQL11 PostgreSQL 11 PostgreSQL 11 New Features postgresql repmgr Recovery replication security sql wal webinar webinars

Support & Services

24/7 Production Support

Developer Support

Remote DBA for PostgreSQL

PostgreSQL Database Monitoring

PostgreSQL Health Check

PostgreSQL Performance Tuning

Database Security Audit

Upgrade PostgreSQL

PostgreSQL Migration Assessment

Migrate from Oracle to PostgreSQL

Products

HA Postgres Clusters

Postgres-BDR®

2ndQPostgres

pglogical

repmgr

Barman

Postgres Cloud Manager

SQL Firewall

Postgres-XL

OmniDB

Postgres Installer

2UDA

Postgres Learning Center

Introducing Postgres

Blog

Webinars

Books

Videos

Training

Case Studies

Events

About Us

About 2ndQuadrant

What does 2ndQuadrant Mean?

News

Careers 

Team Profile

© 2ndQuadrant Ltd. All rights reserved. | Privacy Policy
  • Twitter
  • LinkedIn
  • Facebook
  • Youtube
  • Mail
Webinar: KNN Indexing in PostgreSQL [Follow Up] Webinar: KNN Indexing in PostgreSQL [Follow Up] Security and Compliance with PostgreSQL [Webinar] Security and Compliance with PostgreSQL [Webinar]
Scroll to top
×