What’s new about Barman 1.4.0?

March 9, 2015 / in Barman, Giulio's PlanetPostgreSQL / by Giulio Calacoci

The 1.4.0 version of Barman adds new features, such as incremental backup and automatic integration with pg_stat_archiver, that aim to simplify the life of DBAs and system administrators.

Barman 1.4.0: the most important changes

The latest release introduces a new backup mode: the incremental backup. This mode allows unmodified files to be reused between one periodic backup and the next, drastically reducing execution time, bandwidth usage and the disk space taken up. Another new feature is the integration of Barman with the pg_stat_archiver view, available from PostgreSQL 9.4, which collects information on the performance of the WAL archiving process and makes it possible to monitor its status. Management of WAL files has been improved, and the calculation of storage statistics has been streamlined and optimised. The logic for removing obsolete WAL files has also been refined, performing different actions for exclusive and concurrent backups. Error messages have been made clearer and more legible where possible. Finally, we have invested in the robustness of the code: with the 1.4.0 release we have approximately 200 unit tests that are run for every patch.

Incremental backup

Let’s explore the main innovation of this release: the incremental backup.

Definition and basic theory

To understand the logic behind the incremental backup, let’s consider two complete, consecutive backups. In the time between the completion of the first backup and the completion of the next, not all of the files contained in the PGDATA directory are modified. A number of files are identical in the older and the newer backup and are therefore redundant: they require time and bandwidth to be transferred over the network and take up unnecessary disk space once copied. If we compare the files of the older backup with the files we are about to copy from the remote server, we can distinguish the files that have been modified from those that have remained unchanged. With the incremental backup it thus becomes possible to eliminate this redundancy by copying only the modified files.
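
To make the idea concrete, the same principle can be sketched with plain rsync and its --link-dest option: files that are unchanged with respect to a previous local copy are not transferred again and are hard-linked instead of duplicated. The host and paths below are purely illustrative, and this is not necessarily the exact command Barman runs internally.

# Sketch only: hard-link based deduplication with plain rsync.
# /backup/previous holds the last full backup; unchanged files are
# hard-linked into /backup/new instead of being copied again.
rsync -a --link-dest=/backup/previous \
    postgres@dbserver:/var/lib/pgsql/9.4/data/ \
    /backup/new/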

Implementation and tangible benefits

We developed this feature by setting ourselves three objectives:

  • reduction of backup execution time;
  • reduction of bandwidth usage;
  • reduction of space taken up, accomplished by eliminating redundancies (deduplication).

To achieve this, we exploited rsync’s ability to compare a list of files received from a remote server with the contents of a local directory, identifying which files had been modified. We therefore added a new server/global configuration option called reuse_backup, which selects the type of backup to be performed. Let’s look at the three possible values of reuse_backup and their effects (a configuration sketch follows the list):

  • off: default value, classic backup;
  • copy: identifies the files modified on the remote server, using the last backup performed as a basis. Only the files that have changed are transferred over the network, reducing the execution time of the backup and saving bandwidth. At the end of the transfer, the unmodified files are copied from the previous backup, thus creating a full backup;
  • link: identifies and copies the modified files, exactly like the copy option. At the end of the transfer, however, the unmodified files are reused as hard links to the previous backup instead of being copied. This optimisation of the disk space occupied by the backup removes any redundancy (deduplication).
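
As an illustration, here is a hypothetical barman.conf fragment enabling link mode for a server named main; this is only a sketch, and the section names, paths and connection settings are placeholders rather than a recommended setup:

[barman]
# Global options (illustrative paths and values)
barman_home = /var/lib/barman
reuse_backup = link

[main]
# Per-server settings (placeholder host and credentials)
description = "Main PostgreSQL 9.4 server"
ssh_command = ssh postgres@dbserver
conninfo = host=dbserver user=postgres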

It is also possible to use the option --reuse-backup [{copy, link, off}] from the command line to change the default behaviour for an individual backup. For example:


> barman backup --reuse-backup link main

…will force reuse of the previous backup using hard links, regardless of the value set in the configuration file.

I will now use Navionics as a case study: one of our customers, the sponsor of this release and, as we shall see, a company that gained significant advantages from incremental backup. Navionics has very large databases (one of the largest is approximately 13 TiB). Before the introduction of incremental backup, taking into account the characteristics of the server and network:

  • approximately 52 hours would have been needed to complete a backup;
  • 13 TiB of data would actually have been copied through the network;
  • 13 TiB would actually have been taken up on the disk.

Using the reuse_backup=link option in the latest version of Barman and running barman show-backup on a just-completed backup, this is what Navionics sees:

Base backup information:
  Disk usage           : 13.2 TiB (13.2 TiB with WALs)
  Incremental size     : 5.0 TiB (-62.01%)

Moreover, the backup execution time drops significantly from 52 hours to approximately 17 hours. The advantages are obvious:

  • the execution time decreases by approximately 68%;
  • only 5.0 TiB of data is copied instead of 13 TiB (-62%);
  • the disk space taken up is 5.0 TiB instead of 13 TiB (-62%).
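
(For reference: going from 52 hours to roughly 17 hours is a reduction of about two thirds, and going from 13.2 TiB to 5.0 TiB corresponds to (13.2 − 5.0) / 13.2 ≈ 62%, the figure reported by barman show-backup above.)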

pg_stat_archiver: integration into Barman 1.4.0

Among the new features introduced by PostgreSQL 9.4 is the pg_stat_archiver view, which provides useful information on the operating status of the WAL archiving process. Thanks to these statistics, it is also possible to make predictions about the space that a new backup will occupy. Users of Barman 1.4.0 and PostgreSQL 9.4 will notice a number of new fields in the output of the following commands (a direct query on the view is sketched after the list):

  • barman check:
    • the Boolean field is_archiving that indicates the status of the archiving process.
  • barman status:
    • last_archived_time: the time at which the last WAL file was archived;
    • failed_count: the number of failed WAL archiving attempts;
    • server_archived_wals_per_hour: the archiving rate, in WALs per hour.
  • barman show-server adds to the set of server statistics all fields that make up the view pg_stat_archiver.
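
For completeness, the same underlying statistics can be read directly on the PostgreSQL 9.4 server; the query below simply selects the columns exposed by the pg_stat_archiver view:

-- Inspect WAL archiving statistics directly on PostgreSQL 9.4
SELECT archived_count,
       last_archived_wal,
       last_archived_time,
       failed_count,
       last_failed_wal,
       last_failed_time,
       stats_reset
  FROM pg_stat_archiver;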

Conclusions

The incremental backup, the main feature of this release, is undoubtedly a very useful tool for everyone, saving time and space even on modest-size databases. It is almost indispensable for users who manage very large databases (VLDBs) or databases containing a large number of read-only tables, providing a significant improvement in terms of space, time and bandwidth used. The integration with pg_stat_archiver on PostgreSQL 9.4 improves the ability to monitor the status of servers, and thus the health and resilience of infrastructures that choose Barman as a disaster recovery solution for PostgreSQL databases.

Tags: 1.4.0, backup, backup reuse, Barman, Business Continuity, data deduplication, database, disaster recovery, incremental backup, major release, open source, pg_stat_archiver, pgbarman, postgres, PostgreSQL, PostgreSQL 9.4, Recovery, reuse_backup
2 replies
  1. olivier Bernhard says:
    March 11, 2015 at 4:44 pm

    What happens if, in the meantime, a file that was identified as not being modified is actually modified before the incremental backup has been completed? I guess that in any case you’ll have to apply WALs, isn’t it?

    • Giulio Calacoci says:
      March 31, 2015 at 11:46 am

      Barman uses rsync’s checksum-based copy to evaluate every file that has been modified after the start time of the backup used as a reference.
      This way, we are sure to copy every file that has been modified.
