

What does pg_start_backup() do?

January 23, 2017 / 5 Comments / in 2ndQuadrant, Simon's PlanetPostgreSQL / by Simon Riggs

Reading mailing lists can damage your health, as I recently discovered on the PostgreSQL Performance list where backup was being discussed.

First off, don’t rely on blogs for critical pieces of information. Read the docs, because they are accurate, fully reviewed and well maintained.

I should add that I was the initial author of them as well, so maybe it’s OK to carry on reading…

pg_start_backup() is a function we execute to start a base backup. It was part of the original API for physical backup introduced in PostgreSQL 8.0. It has now been mostly superseded by the replication command BASE_BACKUP, which is most frequently executed by the pg_basebackup utility.

So what does a base backup actually do? Well, first we execute a checkpoint so that as many changed data blocks are on disk as possible. Next we force full page writes to occur, even if full_page_writes = off, because we need to see the whole page for any changes. Lastly, we record the starting point of the backup. That’s all.

Base backup does NOT prevent writes to the data directory. It’s designed to be “fully online”, so it doesn’t take locks on objects and doesn’t interfere with the operation of the database, apart from some details if you try to shut it down while taking a backup.

pg_stop_backup() is the end marker for that backup.

The key point is that the base backup is NOT a consistent copy of the database. You might have copied every file, but each file was copied at a different time. So it’s wrong, until you recover the database with the WAL changes that occurred between the start backup and the stop backup.
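To make that concrete, here is a toy model in Python. It is not how PostgreSQL is implemented; the two "pages", the `transfer` function and the list standing in for WAL are all made up for illustration. It only shows the shape of the problem: a copy taken while writes continue mixes two points in time, and replaying the logged changes between the start and stop markers repairs it.

```python
# Toy "database": two pages whose values must always sum to 100.
pages = {"a": 50, "b": 50}
wal = []  # stand-in for WAL: full-page images, in the order they were written


def transfer(amount):
    """Move `amount` from page a to page b, logging full page images first."""
    new_a, new_b = pages["a"] - amount, pages["b"] + amount
    wal.append({"a": new_a, "b": new_b})  # log before touching the data pages
    pages["a"], pages["b"] = new_a, new_b


def replay(copy, records):
    """Apply logged page images to a copy, in order, as recovery would."""
    for record in records:
        copy.update(record)
    return copy


start = len(wal)             # "start backup": remember the log position
backup = {"a": pages["a"]}   # copy page a ...
transfer(10)                 # ... a write lands in the middle of the copy ...
backup["b"] = pages["b"]     # ... then copy page b
stop = len(wal)              # "stop backup": the end marker

print(sum(backup.values()))  # 110: the raw copy breaks the invariant
restored = replay(dict(backup), wal[start:stop])
print(sum(restored.values()), restored == pages)  # 100 True
```

The raw copy is unusable on its own. The changes logged between the two markers have to travel with it.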

Which is why you’ll want to use a command like this:

pg_basebackup --xlog-method=stream

or use a utility that does everything for you, like Barman.
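If you script this yourself, a small wrapper along the following lines may help. It is a sketch, not a recommended tool: the function names and the target directory are hypothetical. The options it emits are real pg_basebackup options: `-D` for the target directory, `--slot` for a replication slot, and `--xlog-method=stream`, which was renamed to `--wal-method=stream` in PostgreSQL 10.

```python
import subprocess


def basebackup_command(target_dir, slot=None, legacy_option=False):
    """Build a pg_basebackup argument list that streams WAL with the copy.

    legacy_option=True uses --xlog-method (the PostgreSQL 9.x spelling);
    otherwise --wal-method (PostgreSQL 10 and later) is used.
    """
    wal_flag = "--xlog-method=stream" if legacy_option else "--wal-method=stream"
    cmd = ["pg_basebackup", "-D", target_dir, wal_flag]
    if slot is not None:
        cmd += ["--slot", slot]  # keep needed WAL on the server via a slot
    return cmd


def run_backup(cmd):
    """Run the backup; needs a reachable server and replication access."""
    subprocess.run(cmd, check=True)


# Hypothetical target directory, for illustration only.
cmd = basebackup_command("/var/backups/pg/base", legacy_option=True)
print(" ".join(cmd))
```

Building the argument list separately from running it makes the command easy to log and check before anything touches the server.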

5 replies
  1. Adam Scott says:
    January 24, 2017 at 12:51 am

    Great reminder on pg_start_backup()!

    I’ve never tried a restore without the WAL files. I’m guessing there would be a complaint of missing WAL files. Looking through xlog.c (line 7196), I’m guessing you will see a message along the lines of: “WAL ends before end of online backup”.

    So when you perform your scheduled test recovery and see that message, you know you aren’t getting consistent backups.

  2. Tushar says:
    February 1, 2017 at 8:11 pm

    To-the-point explanation. Thanks.

  3. EBB PostgreSQL says:
    January 23, 2019 at 3:28 pm

    Can you give an example of the issue if --xlog-method is not set to stream?

  4. Francis Demierre says:
    September 23, 2019 at 4:52 pm

    Great stuff…. thanks.

    Although you said:
    Base backup does NOT prevent writes to the data directory. It’s designed to be “fully online”, so it doesn’t take locks on objects and doesn’t interfere with the operation of the database, apart from some details if you try to shut it down while taking a backup.

    I just have two questions (my observations make me wonder…).

    1) Does PostgreSQL continue to do regular checkpoints between pg_start_backup() and pg_stop_backup()?
    2) Does it continue to move WAL files from pg_xlog/pg_wal to the archive directory using the configured archive_command?

    Thanks for a reply.
    Best Regards
    Francis

    • craig.ringer says:
      November 4, 2019 at 1:04 pm

      (1) Yes, PostgreSQL continues to perform checkpoints during base backups. A base backup doesn’t guarantee that you’ll see a consistent copy of the data as of the time the base backup started. It promises that you’ll get a consistent view of the data as it was after the backup finishes and the required WAL segments are applied during recovery. So PostgreSQL is free to delete files and so on; if it’s deleting them, then they are no longer needed to create a consistent copy of the end-of-backup state.

      (2) Yes, PostgreSQL continues to archive WAL when archive mode is enabled. It also continues to service streaming replication clients etc.

      I strongly suggest that you use pg_basebackup -X stream to have pg_basebackup copy WAL from the server at the same time as the base backup. If you’re concerned that the server may remove WAL too fast, have pg_basebackup use a streaming replication slot to ensure the needed WAL is retained. See the pg_basebackup documentation for details.
