Barman Cloud – Part 1: WAL Archive
How many current Barman users have thought about saving backups in a remote destination in the cloud? How many have thought about taking that backup directly from the PostgreSQL server itself?
Well, since Barman 2.10 this is now possible!
Let’s discover that together in the following articles.
The following two articles are meant to be a practical introduction to the new barman-cloud-backup tools added in Barman 2.10. The first part will cover the barman-cloud-wal-archive command, while the second one will cover cloud backups.
Readers need a basic knowledge of PostgreSQL WAL archiving and backup methods, and Barman. It is also recommended that you are aware of cloud technologies for storage solutions like Amazon S3.
Barman has acted as a remote WAL archive for many years, and the Barman CLI package has been designed to extend archiving reliability and robustness on the PostgreSQL side. In fact, barman-cli provides scripts like barman-wal-restore, which allows a standby node to smartly and safely restore WAL files from a Barman archive through the restore_command parameter in the postgresql.auto.conf file (or the recovery.conf file before PostgreSQL 12), and barman-wal-archive, which archives WAL files from a master node to Barman through the archive_command parameter.
Cloud WAL Archive
Thanks to users’ feedback, the Barman developers have introduced two new tools in version 2.10:
Version 2.11 will include two additional tools for recovery, called
This post is entirely dedicated to barman-cloud-wal-archive, which can store WAL files in the cloud, enabling multi-tier archiving with Barman and extending the retention policy for backups.
barman-cloud-wal-archive can be used as a hook script, by configuring the pre_archive_retry_script parameter in Barman, to copy WAL files to the configured cloud storage. This increases the redundancy of the archive and makes it possible to choose a longer retention policy than the Barman one.
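The hook setup described above could be sketched as a small wrapper script. This is only an illustration under stated assumptions: it relies on Barman exporting the BARMAN_SERVER and BARMAN_FILE environment variables to WAL archive hook scripts, and the function name and the ARCHIVE_BIN, CLOUD_PROFILE and CLOUD_URL variables are hypothetical names introduced here, with defaults taken from the profile and bucket used later in this article.

```shell
# Sketch of a wrapper for pre_archive_retry_script (not an official script).
# Assumption: Barman exports BARMAN_SERVER and BARMAN_FILE to the hook.
# ARCHIVE_BIN, CLOUD_PROFILE and CLOUD_URL are hypothetical knobs.
ARCHIVE_BIN=${ARCHIVE_BIN:-barman-cloud-wal-archive}
CLOUD_PROFILE=${CLOUD_PROFILE:-barman-cloud}
CLOUD_URL=${CLOUD_URL:-s3://barman-s3-test/}

cloud_archive_hook() {
    # Refuse to run outside of a Barman hook context
    if [ -z "${BARMAN_SERVER:-}" ] || [ -z "${BARMAN_FILE:-}" ]; then
        echo "BARMAN_SERVER and BARMAN_FILE must be set" >&2
        return 1
    fi
    # Upload the WAL file; the exit status of the upload is returned as-is,
    # so a failed upload is reported back to Barman as a hook failure
    "$ARCHIVE_BIN" -P "$CLOUD_PROFILE" "$CLOUD_URL" "$BARMAN_SERVER" "$BARMAN_FILE"
}
```

Saved as an executable file that ends with a call to cloud_archive_hook, the script could then be referenced from the pre_archive_retry_script parameter of the Barman server configuration. Setting ARCHIVE_BIN=echo is a cheap way to see the command that would be run without touching the cloud.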
That’s not all!
barman-cloud-wal-archive can replace the barman-wal-archive command in the archive_command parameter, to archive WAL files directly in the cloud instead of copying them to the Barman server. In this way, even a PostgreSQL cluster that does not have a separate dedicated backup server can rely on a remote storage service to archive WAL files.
How does it work?
The following instructions show how to install and configure barman-cloud-wal-archive as the archive_command in PostgreSQL.
First, decide where to archive WAL files. In this article we will use Amazon S3 which, at the time of writing, is the only supported technology. Other technologies that expose an S3-like API (Google Cloud, DigitalOcean, Microsoft Azure, etc.) may work through the boto3 library, but they have not been tested yet.
- barman-cli 2.10 (or higher)
- Amazon AWS account
- S3 bucket
- A PostgreSQL instance
In this article we will test Barman CLI in a virtual machine with Debian Buster and PostgreSQL 12 which is already up and running.
- Install the 2ndQuadrant Public repository
- Install the barman-cli package
    # apt update
    # apt install barman-cli
- Install awscli
    # apt install awscli
Configuration and setup
Let’s read the manual:
    $ man barman-cloud-wal-archive
    [...]
    SYNOPSIS
           barman-cloud-wal-archive [OPTIONS] DESTINATION_URL SERVER_NAME WAL_PATH
    [...]
    POSITIONAL ARGUMENTS
           DESTINATION_URL
                  URL of the cloud destination, such as a bucket in AWS S3.
                  For example: s3://BUCKET_NAME/path/to/folder (where
                  BUCKET_NAME is the bucket you have created in AWS).
           SERVER_NAME
                  the name of the server as configured in Barman.
           WAL_PATH
                  the value of the `%p' keyword (according to
                  `archive_command').
    [...]
So, to use it properly we just need to configure the AWS credentials with the awscli tool as the postgres user, copying the Access Key and Secret Key previously created in the IAM section of the AWS console:
    $ aws configure --profile barman-cloud
    AWS Access Key ID [None]: AKI*****************
    AWS Secret Access Key [None]: ****************************************
    Default region name [None]: eu-west-1
    Default output format [None]: json
Make sure an S3 bucket is available on AWS. I chose to call it barman-s3-test to make it clear.
We should now be able to test the command:

    $ barman-cloud-wal-archive -t -P barman-cloud s3://barman-s3-test/ pg12 /var/lib/postgresql/12/main/pg_wal/000000010000000000000001
    $ echo $?
    0
The exit status confirms that the command succeeded. We can now add the following line at the bottom of the PostgreSQL configuration file and restart the instance:
archive_mode = on
    # systemctl restart postgresql@12-main
Since our data will be copied to remote storage, outside our control, it's important that we store it compressed and encrypted. The barman-cloud-wal-archive command supports two different methods for compression:
    $ barman-cloud-wal-archive --help
    [...]
      -z, --gzip            gzip-compress the WAL while uploading to the cloud
      -j, --bzip2           bzip2-compress the WAL while uploading to the cloud
      -e ENCRYPTION, --encryption ENCRYPTION
                            Enable server-side encryption for the transfer.
                            Allowed values: 'AES256', 'aws:kms'
    [...]
The encryption option only tells the S3 bucket which method to use to store the data encrypted. Data stored this way is encrypted at rest, and access to it remains governed by the permissions on the bucket. Barman cloud does not encrypt any object before sending it to S3; it just asks the bucket to store objects encrypted, provided S3 has been properly configured. However, all connections to S3 are securely established via https.
Let’s add the following line at the bottom of the PostgreSQL configuration file:
archive_command = 'barman-cloud-wal-archive -P barman-cloud -e AES256 -j s3://barman-s3-test/ pg12 %p'
This time, just a reload of the configuration is enough to apply the new changes:
    $ psql -c "SELECT pg_reload_conf()"
To test whether the new archive_command is working, PostgreSQL has to produce WAL files to be archived, so we need to generate some traffic with the help of pgbench:
    $ createdb pg_bench_db
    $ pgbench -i -s10 pg_bench_db
    [some irrelevant output here]
    $ pgbench -c 10 -j 2 -T 30 pg_bench_db
    starting vacuum...end.
    transaction type: <builtin: TPC-B (sort of)>
    scaling factor: 10
    query mode: simple
    number of clients: 10
    number of threads: 2
    duration: 30 s
    number of transactions actually processed: 84501
    latency average = 3.552 ms
    tps = 2815.224687 (including connections establishing)
    tps = 2815.427535 (excluding connections establishing)
At this point we should see WAL files archived in the S3 bucket. Let’s check it, building the target path with the server name and the WAL destination directory:
    $ aws s3 --profile barman-cloud ls s3://barman-s3-test/pg12/wals/
                               PRE 0000000100000000/
Let’s have a look inside the 0000000100000000 directory:
    $ aws s3 --profile barman-cloud ls s3://barman-s3-test/pg12/wals/0000000100000000/
    2020-01-08 08:20:54    1624168 000000010000000000000001.bz2
    2020-01-08 08:21:00     293422 000000010000000000000002.bz2
    2020-01-08 08:21:06     301934 000000010000000000000003.bz2
    2020-01-08 08:21:11     295648 000000010000000000000004.bz2
    2020-01-08 08:21:16     293675 000000010000000000000005.bz2
    2020-01-08 08:21:21     299348 000000010000000000000006.bz2
    2020-01-08 08:21:27     551249 000000010000000000000007.bz2
    2020-01-08 08:21:33     976523 000000010000000000000008.bz2
    2020-01-08 08:21:37    4542104 000000010000000000000009.bz2
    2020-01-08 08:21:46    5052693 00000001000000000000000A.bz2
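The layout seen in the two listings above can be summarised with a small helper. This is a sketch inferred only from that output, not from the Barman source code: the key appears to be made of the server name, a wals directory, a sub-directory named after the first 16 characters of the WAL file name (the timeline and log number), and the WAL name plus a compression suffix. The function name wal_object_key is hypothetical.

```shell
# Build the object key suggested by the listings:
#   SERVER_NAME/wals/<first 16 characters of the WAL name>/<WAL name>[.suffix]
wal_object_key() {
    server=$1
    wal_name=$2
    suffix=${3:-}
    # The first 16 hex digits are the timeline (8) followed by the log number (8)
    prefix=$(printf '%s' "$wal_name" | cut -c1-16)
    key="$server/wals/$prefix/$wal_name"
    # Append the compression suffix only when one is given
    if [ -n "$suffix" ]; then
        key="$key.$suffix"
    fi
    printf '%s\n' "$key"
}
```

For example, `wal_object_key pg12 00000001000000000000000A bz2` prints `pg12/wals/0000000100000000/00000001000000000000000A.bz2`, which matches the last object in the listing.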
WAL files are being compressed before being uploaded to the S3 bucket and are stored encrypted, saving us space (and money) and increasing the security level of our data.
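To get a rough idea of the savings, the object sizes in the listing can be compared with the size of an uncompressed WAL segment. The sketch below assumes the PostgreSQL default segment size of 16 MiB (16777216 bytes), which the article does not state explicitly; the function name compressed_pct is hypothetical.

```shell
# Assumption: WAL segments use the PostgreSQL default size of 16 MiB
WAL_SEGMENT_BYTES=16777216

# Print the compressed size as a whole percentage of one WAL segment
# (integer division, so the result is rounded down)
compressed_pct() {
    echo $(( $1 * 100 / WAL_SEGMENT_BYTES ))
}

compressed_pct 293422    # prints 1
compressed_pct 5052693   # prints 30
```

Under that assumption, the objects in the listing take between roughly 1% and 30% of the space of the original segments.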
The barman-cloud-wal-archive command is something users have long been waiting for.
If you’re one of those who have used pre_archive_retry_script to implement a custom script for uploading WAL files to an S3 bucket, then this is a better replacement, because it is developed and maintained by the Barman developers, and it is tested and delivered by the 2ndQuadrant Continuous Delivery system.
In case you haven’t thought about it yet, this opens up new retention options: by setting a longer retention policy in the S3 bucket’s configuration, WAL files can be kept in the cloud for longer than the local Barman retention policy allows, while saving space on local storage.
Alternatively, it can be used as we did in this article, to archive WAL files directly from the PostgreSQL server. Although this removes an intermediate step, the RPO increases compared with the streaming method, because PostgreSQL archives a WAL file only after it has been closed. Therefore, if something goes wrong on the PostgreSQL node, we could lose some changes. When possible, we recommend combining this method with streaming to a Barman server in order to achieve RPO=0 (with synchronous streaming).
Now that we have a continuous archiving system in place, we can take our first cloud backup.
See you in the second part of the article.
Is there any plan to support GCP cloud buckets?
where is part two?
What do you put in the barman.conf for these variables?
streaming_wals_directory, basebackups_directory, incoming_wals_directory and wals_directory
Hi Jonathan, it’s really a great article, and I have one doubt about the statement below. Can you please help me understand the new 2.11 feature for WAL encryption?
Barman cloud does not encrypt any object before sending it to S3, it just asks the bucket to store them encrypted if S3 has been properly configured. However, any connections to S3 are securely established via https.
Does it mean there is no way to have Barman encrypt the data before sending it to S3? Thanks
One question: when will the other cloud technologies be supported?