PostgreSQL uses Multi-Version Concurrency Control (MVCC) to handle concurrent clients: each transaction works against a snapshot of the data, so readers do not block writers and writers do not block readers. This lets the server sustain a larger transaction load and gives developers a rich set of tools for accessing data concurrently.
To give a deeper understanding of MVCC and vacuum basics in PostgreSQL, and of the pros and cons that come with them, 2ndQuadrant hosted a live webinar, MVCC and Vacuum Basics in PostgreSQL.
The webinar was presented by Martín Marqués (Deputy Head of Support at 2ndQuadrant), who covered the following topics:
- Overview of MVCC
- What the “xmin” and “xmax” system columns store
- Using “VACUUM” for clean-up
- “autovacuum” for automated clean-up
- Visibility of rows, frozen rows and “VACUUM FREEZE”
Those who weren’t able to attend the live webinar can now view the recording here.
Due to limited time, some of the questions were not answered during the live webinar, so answers by the host are provided below:
Question: Is a deleted row the same as a frozen row?
Answer: No. A deleted row is one that is no longer visible to new sessions. A frozen row is one that is visible to all sessions.
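The difference can be illustrated with a toy model of the visibility check. This is only a sketch under simplifying assumptions, not PostgreSQL's actual algorithm: it ignores in-progress transaction lists, transaction ID wraparound and hint bits, and it models "frozen" as a plain boolean. The names `RowVersion`, `is_visible`, `snapshot_xid` and `committed` are made up for the illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class RowVersion:
    xmin: int                   # ID of the transaction that inserted this row version
    xmax: Optional[int] = None  # ID of the transaction that deleted it, if any
    frozen: bool = False        # toy flag: the insertion counts as visible to every snapshot


def is_visible(row: RowVersion, snapshot_xid: int, committed: Set[int]) -> bool:
    """Toy rule: a snapshot sees every committed transaction with an ID below snapshot_xid."""
    def seen(xid: int) -> bool:
        return xid in committed and xid < snapshot_xid

    inserted = row.frozen or seen(row.xmin)
    deleted = row.xmax is not None and seen(row.xmax)
    return inserted and not deleted


committed = {100, 205}                          # transactions 100 and 205 have committed
deleted_row = RowVersion(xmin=100, xmax=205)    # inserted by 100, deleted by 205
frozen_row = RowVersion(xmin=100, frozen=True)  # inserted by 100, later frozen by vacuum

print(is_visible(deleted_row, 200, committed))  # session whose snapshot predates the delete
print(is_visible(deleted_row, 300, committed))  # new session, started after the delete
print(is_visible(frozen_row, 50, committed))    # frozen row checked against an old snapshot
```

In this model the deleted row is still seen by the session that started before the delete but not by the new one, while the frozen row is seen regardless of the snapshot.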
Question: We have a small table (fixed 500 rows) and 50 updates per second. What is your recommendation/parameters for vacuuming this table?
Answer: That really depends, but you might want to set autovacuum_vacuum_scale_factor to zero for this table and give autovacuum_vacuum_threshold a fixed value larger than 50, so that the table gets vacuumed every few minutes. Make the vacuum very aggressive so it finishes quickly.
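To see what these two parameters do, here is a small worked calculation. It relies on the formula given in the PostgreSQL documentation (a table is vacuumed once its dead tuples exceed threshold + scale factor × number of rows) and on the default values 50 and 0.2. The fixed threshold of 10,000 is only an assumed example value, not a recommendation from the webinar, and the helper name `vacuum_trigger_point` is made up.

```python
def vacuum_trigger_point(threshold: int, scale_factor: float, reltuples: int) -> float:
    """Dead tuples a table must accumulate before autovacuum will vacuum it."""
    return threshold + scale_factor * reltuples


ROWS = 500             # table size from the question
UPDATES_PER_SEC = 50   # each update leaves one dead row version behind

# PostgreSQL defaults: autovacuum_vacuum_threshold = 50, autovacuum_vacuum_scale_factor = 0.2
default_trigger = vacuum_trigger_point(50, 0.2, ROWS)

# Assumed per-table setting: scale factor 0 and a fixed threshold of 10,000
tuned_trigger = vacuum_trigger_point(10_000, 0.0, ROWS)

# Print the trigger point and how many seconds of updates it takes to reach it
print(default_trigger, default_trigger / UPDATES_PER_SEC)
print(tuned_trigger, tuned_trigger / UPDATES_PER_SEC)
```

With the defaults the table qualifies for a vacuum after about 3 seconds of updates; with the assumed fixed threshold it qualifies after about 200 seconds. Autovacuum only checks tables once per autovacuum_naptime (one minute by default), so the real interval cannot be shorter than that.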
Question: We have a table of 1 billion rows and an hourly batch process inserting 5 million rows. What is your recommendation/parameters for vacuuming this table?
Answer: If the table only grows through inserts, what is needed is to keep the planner statistics up to date. Tune analyze so it runs more often.
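A rough calculation shows why the defaults are too lax for a table of this size. It relies on the documented formula (autoanalyze runs once the rows changed since the last analyze exceed threshold + scale factor × number of rows) and the default values 50 and 0.1. The scale factor of 0.001 is an assumed example value, and the helper name `analyze_trigger_point` is made up.

```python
def analyze_trigger_point(threshold: int, scale_factor: float, reltuples: int) -> float:
    """Changed rows a table must accumulate before autovacuum will analyze it."""
    return threshold + scale_factor * reltuples


ROWS = 1_000_000_000          # table size from the question
INSERTS_PER_HOUR = 5_000_000  # size of the hourly batch

# PostgreSQL defaults: autovacuum_analyze_threshold = 50, autovacuum_analyze_scale_factor = 0.1
default_trigger = analyze_trigger_point(50, 0.1, ROWS)

# Assumed per-table setting: a much smaller scale factor
tuned_trigger = analyze_trigger_point(50, 0.001, ROWS)

# Print the trigger point and how many hours of inserts it takes to reach it
print(default_trigger, default_trigger / INSERTS_PER_HOUR)
print(tuned_trigger, tuned_trigger / INSERTS_PER_HOUR)
```

With the defaults roughly 100 million rows, or about 20 hourly batches, must be inserted before the statistics are refreshed. With the assumed scale factor of 0.001, a single batch is already enough to trigger an analyze.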
Question: Regarding freeze, how can we check the progress of a running auto vacuum/vacuum freeze?
Answer: If you are on a recent release of Postgres (9.6 or later), you can follow the progress of the vacuum in the pg_stat_progress_vacuum view.
Question: Does vacuum also need to be performed on the system catalog?
Answer: Yes, catalog tables are just like user tables and get vacuumed. They are normally small, so vacuum finishes quickly.
Question: For the autovacuum daemon, is it better to have more vacuum workers (autovacuum_max_workers) with less memory (maintenance_work_mem) or fewer workers with more memory allocated to them?
Answer: It’s advisable not to increase autovacuum_max_workers to a value larger than 6. It’s better to have fewer workers that run quickly (a low autovacuum_vacuum_cost_delay) so that each worker can move on to another table sooner.
Question: Regarding the second example you showed (xmin 3004 and 3008): when you run the update query “update pp set ts=…where id%2=0” inside a transaction, shouldn’t ids 2 and 4 have the same ts? Since they are updated in the same transaction, the ts values should be the same, but the slide shows different values: 14:18:55 18724-03 and 14:18:55 187276-03.
Answer: That is because I didn’t use now(), which returns the time at the beginning of the transaction, but instead used clock_timestamp(), which returns the actual current time and so changes during execution.
Question: Does freezing tuples help with performance?
Answer: Freezing by itself will not affect performance, but the vacuum that freezes tuples will also clean up dead tuples, which does help.
To be the first to know about upcoming PostgreSQL webinars by 2ndQuadrant, visit our Webinars page.
For any questions, comments, or feedback, please visit our website or send us an email.