Management of the WAL archive in Barman
Barman, backup and recovery manager for PostgreSQL, is designed to manage the archive of WAL files separately from periodical backups (in Postgres terms, base backups).
You can see this archive as a “continuous” stream of files from the first available backup to the last shipped file (backup available history for a server).
In this article you will see how Barman manages WAL compression and archival, and how each WAL file is associated with a base backup in the history. This association lets database administrators immediately see, through the catalogue, the exact physical size of a periodical backup (WAL files included).
A quick recap:
- Backup is performed on a per server/instance basis (which means that single database backup is not supported)
- Backup servers in Barman are defined in the configuration file (each server has a section in the INI file, as described in the documentation)
The WAL archiver of the Postgres server asynchronously sends each newly generated WAL file, which contains binary transaction information, to the backup server. WAL files must be deposited in the incoming_wals_directory for that server, as shown by the “barman show-server” command (by default, the convention is to place these files in the SERVER_BACKUP_DIR/incoming directory).
Usually the best approach, especially in a local network, is to let Barman compress these 16MB files on the backup server end (you can use the “compression” option for this purpose). This operation is performed together with archival by the “barman cron” command, which goes through every file in the incoming directory, compresses it (where applicable) and permanently moves it into the WAL archive for that server. Bear in mind that this process is asynchronous and does not delay the normal archiving procedures of the server.
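The archival step described above can be pictured with a minimal Python sketch. This is not Barman's actual code: the function name `archive_incoming_wals`, its parameters, the use of gzip and the choice to keep the original file name are all assumptions made for illustration.

```python
import gzip
import shutil
from pathlib import Path


def archive_incoming_wals(incoming_dir, archive_dir, compress=True):
    """Move every file found in incoming_dir into archive_dir.

    Hypothetical helper, not part of Barman. When compress is True the
    content is gzip-compressed on the way; the file name is kept as-is.
    Returns the list of archived paths, in name order.
    """
    incoming = Path(incoming_dir)
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    archived = []
    # Process the files in name order so the archive grows predictably.
    for wal in sorted(p for p in incoming.iterdir() if p.is_file()):
        target = archive / wal.name
        if compress:
            # Write the compressed copy first, then drop the original.
            with wal.open("rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            wal.unlink()
        else:
            shutil.move(str(wal), str(target))
        archived.append(target)
    return archived
```

Because the function only touches files that have already landed in the incoming directory, it can run from a scheduler at any time without involving the Postgres server, which is the point of keeping archival asynchronous.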
By design, Barman automatically associates every WAL file with the closest previous base backup available in the server’s backup history.
This definition implies that, if no base backup has been taken yet, the cron command discards the incoming WAL files instead of adding them to the server’s archive.
The same definition however allows Barman to give DBAs more accurate information about the overall size of a periodical backup, including the size of the base backup (with tablespace data) and the number and size of the compressed WAL files.
The following screenshot for the “barman show-server” command gives you an idea (the test example does not take advantage of WAL compression: note the 48 MB occupied by the three 16 MB WAL files).
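The association rule and the size accounting can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Barman's implementation: the functions `assign_wals` and `backup_total_size` are hypothetical, and each base backup is identified here by the name of the first WAL file it needs, so that plain string comparison gives the ordering.

```python
from bisect import bisect_right


def assign_wals(backup_starts, wals):
    """Associate each WAL file with the closest previous base backup.

    backup_starts: sorted list of WAL names, one per base backup
                   (assumption: a backup is identified by its first WAL).
    wals: iterable of (wal_name, size_in_bytes) tuples.
    Returns (assigned, discarded): a dict mapping each backup to its WAL
    entries, and the names of WALs that precede every backup.
    """
    assigned = {start: [] for start in backup_starts}
    discarded = []
    for name, size in wals:
        # Index of the last backup whose start is <= this WAL.
        idx = bisect_right(backup_starts, name) - 1
        if idx < 0:
            # No base backup exists yet for this WAL: it is discarded.
            discarded.append(name)
        else:
            assigned[backup_starts[idx]].append((name, size))
    return assigned, discarded


def backup_total_size(base_backup_size, wal_entries):
    """Size of a periodical backup: base backup plus its WAL files."""
    return base_backup_size + sum(size for _, size in wal_entries)
```

With three uncompressed 16 MB WAL files assigned to a backup, `backup_total_size` adds 48 MB to the size of the base backup, which matches the figure mentioned above.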
Another command that affects the WAL archive is the “backup delete” command.
If the first available backup is deleted, all the WAL files associated with it are removed from the archive as well.
Things get more interesting if you delete an intermediate backup. Consider this scenario: your server’s backup history shows three backups, taken at three subsequent times (t1, t2 and t3 with t3 > t2 > t1). If you get rid of the “t2” backup, the WAL files that were originally associated with it are not removed; instead, they are immediately reassigned to the previous available backup (“t1” in this example).
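The two deletion cases can be modelled with a small Python sketch. The function `delete_backup` and the dictionary layout are hypothetical and only mirror the behaviour described in this article, not Barman's internals; the sketch relies on Python dictionaries preserving insertion order (Python 3.7 or later).

```python
def delete_backup(wals_by_backup, backup_id):
    """Model the effect of deleting a backup on the WAL archive.

    wals_by_backup: dict mapping backup ids, inserted in chronological
                    order, to the list of WAL names associated with each.
    Returns (remaining, removed_wals). The input dict is left untouched.
    """
    order = list(wals_by_backup)
    position = order.index(backup_id)
    orphaned = list(wals_by_backup[backup_id])
    remaining = {
        backup: list(names)
        for backup, names in wals_by_backup.items()
        if backup != backup_id
    }
    if position == 0:
        # First available backup: its WAL files leave the archive too.
        return remaining, orphaned
    # Intermediate or last backup: hand the WAL files over to the
    # previous available backup instead of removing them.
    previous = order[position - 1]
    remaining[previous].extend(orphaned)
    return remaining, []


history = {"t1": ["01", "02"], "t2": ["03"], "t3": ["04"]}
after_t2, removed_t2 = delete_backup(history, "t2")
after_t1, removed_t1 = delete_backup(history, "t1")
```

In this example `after_t2` keeps WAL “03” under “t1” and nothing is removed, while deleting “t1” removes WALs “01” and “02” from the archive.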
Among other things, I will be covering this topic as well at PostgreSQL Sessions in Paris on October 4th and Postgres Open in Chicago on September 19th. Hopefully I will be speaking about this at the next PostgreSQL European Conference in Prague, where my proposed talk on Disaster recovery with Barman is in the reserve list (if not, I will be there anyway, ready to share ideas about Barman and its future with you).
You can download Barman 1.0 from SourceForge.net, or you can get more information by visiting the website or joining our new #barman IRC channel on freenode. Ciao!
Nice tool that fits my needs perfectly. Thanks for this nice article as well as the wonderful PDF documentation (available on the Barman website). I have just two concerns:
-> What if I start a backup and an hour later decide to kill it? Do I need to clean up the “bad” backup myself, or can Barman handle it?
-> “INCOMING_WAL_DIRECTORY”: can its location be configured? Currently my server archives its WALs to a specific location on a backup machine. Can I have Barman use this location?
Thanks to the entire development team behind this amazing tool!
Hello,
first of all, thank you for the kind words, which I happily extend to the rest of the team.
As far as your question about canceling a backup is concerned, by default Barman does not delete an incomplete backup. However, it is labelled as ‘FAILED’. You can then remove it at a later time and get rid of the partial files. Bear in mind that we have just committed to the HEAD of the project’s Git repository support for hook scripts that are invoked before and after a backup. You can write your own script that checks, after a backup, whether everything went fine and, if not, deletes the backup. In the future we might even add a configuration option for this.
As far as the incoming_wals_directory option is concerned, you are completely free to change it. You can do it on a per-server basis. We designed Barman to follow the “convention over configuration” paradigm, which does not force you to specify every single parameter, while leaving you the possibility to do so. You just need to set that option to that directory in your server configuration section.
Cheers,
Gabriele
Many thanks for this very useful article. I have the following problem and I do not know whether Barman handles it: when Barman takes a backup on my home server it leaves a baseline, but the old WAL files are not cleaned up and accumulate until the disk is full. Does Barman clean up the old WAL files on my main server, or should I run pg_archivecleanup manually? Thank you