2ndQuadrant is now part of EDB

Bringing together some of the world's top PostgreSQL experts.

2ndQuadrant | PostgreSQL
Mission Critical Databases
  • Contact us
  • EN
    • FR
    • IT
    • ES
    • DE
    • PT
  • Support & Services
  • Products
  • Downloads
    • Installers
      • Postgres Installer
      • 2UDA – Unified Data Analytics
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
  • Postgres Learning Center
    • Webinars
      • Upcoming Webinars
      • Webinar Library
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Blog
    • Training
      • Course Catalogue
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
    • Books
      • PostgreSQL 11 Administration Cookbook
      • PostgreSQL 10 Administration Cookbook
      • PostgreSQL High Availability Cookbook – 2nd Edition
      • PostgreSQL 9 Administration Cookbook – 3rd Edition
      • PostgreSQL Server Programming Cookbook – 2nd Edition
      • PostgreSQL 9 Cookbook – Chinese Edition
    • Videos
    • Events
    • PostgreSQL
      • PostgreSQL – History
      • Who uses PostgreSQL?
      • PostgreSQL FAQ
      • PostgreSQL vs MySQL
      • The Business Case for PostgreSQL
      • Security Information
      • Documentation
  • About Us
    • About 2ndQuadrant
    • 2ndQuadrant’s Passion for PostgreSQL
    • News
    • Careers
    • Team Profile
  • Blog
  • Menu Menu
You are here: Home1 / Blog2 / John’s PlanetPostgreSQL3 / Optimizing storage of small tables in PostgreSQL 12
John Naylor

Optimizing storage of small tables in PostgreSQL 12

April 10, 2019/0 Comments/in John’s PlanetPostgreSQL /by John Naylor

The problem

If your database has a large number of small tables, you likely have a lot of wasted space. To demonstrate this, let’s create a table with a single record:

create table foo (str text);
insert into foo values ('a');
VACUUM foo;

Now let’s find out the path of the file containing our data, relative to the data directory:

select pg_relation_filepath('foo');
 pg_relation_filepath
----------------------
 base/16384/16405
(1 row)

(For more information on PostgreSQL files and directories, see Craig Ringer’s article on the topic.)

Notice what happens when we drop to the system command line within the data directory and list the files, adding a wild card to the path above:

$ ls base/16384/16405*

base/16384/16405
base/16384/16405_fsm
base/16384/16405_vm

The file named 16405 is the heap of our table, but what about the others? These are auxiliary files, called “relation forks” that contain additional information to help PostgreSQL access and maintain the table.

  • 16405_fsm is the free space map. Its job is to know which pages in the table have space available for inserting records.
  • 16405_vm is the visibility map. Its job is to know which heap pages may need VACUUM-ing or freezing, and also which heap pages must be visited during index-only scans.

There are other files associated with this table, but to find them we have to use a query:

select
pg_relation_filepath(c.reltoastrelid) as toast_table_path,
pg_relation_filepath(i.indexrelid) as toast_index_path
from pg_class c
left outer join pg_index i on c.reltoastrelid=i.indrelid
where c.relname = 'foo';

 toast_table_path | toast_index_path
------------------+------------------
 base/16384/16408 | base/16384/16410
(1 row)

This gives us the path of the toast table for our table, as well as that of the toast table’s index. If we insert any records with large enough strings, they will be compressed and stored here.

How much disk space do we need to store our one record?

\x
select
pg_relation_size(c.oid, 'main') as heap_size,
pg_relation_size(c.oid, 'fsm') as fsm_size,
pg_relation_size(c.oid, 'vm') as vm_size,
pg_relation_size(c.reltoastrelid) as toast_table_size,
pg_relation_size(i.indexrelid) as toast_index_size
from pg_class c
left outer join pg_index i on c.reltoastrelid=i.indrelid
where c.relname = 'foo';

-[ RECORD 1 ]----+------
heap_size        | 8192
fsm_size         | 24576
vm_size          | 8192
toast_table_size | 0
toast_index_size | 8192

To store a record with one string value, we need five files totalling 48kB of space! You may be thinking “So what? My tables are many gigabytes in size — why should I worry about a few kB?”. Well, consider the situation where an application uses schemas for multi-tenancy. Such a case could have thousands of schemas, each with hundreds of small tables, and the amount of space taken up by auxiliary files quickly becomes very high.

Is there anything we can do to reduce the amount of space consumed?

For starters, we can get rid of the toast table and its toast index by using a constraint on the length of the column:

create table foo_no_toast (str varchar(500));
insert into foo_no_toast values ('a');
VACUUM foo_no_toast;

select
pg_relation_size(c.oid, 'main') as heap_size,
pg_relation_size(c.oid, 'fsm') as fsm_size,
pg_relation_size(c.oid, 'vm') as vm_size,
pg_relation_size(c.reltoastrelid) as toast_table_size,
pg_relation_size(i.indexrelid) as toast_index_size
from pg_class c
left outer join pg_index i on c.reltoastrelid=i.indrelid
where c.relname = 'foo_no_toast';

-[ RECORD 1 ]----+------
heap_size        | 8192
fsm_size         | 24576
vm_size          | 8192
toast_table_size |
toast_index_size |

So now we have 3 files totalling 40 kB. This works because internally Postgres has a threshold tuple size beyond which it has to store the tuple in the toast table. In our new table here, Postgres knows this table can’t have any tuples exceeding this threshold, so it doesn’t bother creating the toast table.

This isn’t a big improvement, however. For one, we only saved a little bit of space. For another, it’s difficult to ensure the constraint on the column type is large enough to allow any string your application might want to insert, yet small enough to prevent the creation of the toast table. I obtained 500 in my example with a little experimentation, but the result will vary depending on the number of columns, the database encoding, and possibly on the major version of PostgreSQL.

Wouldn’t it be nice if we could get rid of the free space map? After all, it’s 3x bigger than the heap it’s keeping track of! It turns out there will soon be an easy way to do that.

Note: A previous version of this article mentioned that in PG12 small tables would not have a free space map. This feature has been reverted last-minute due to architectural concerns. I will update this article if it reappears in a future release.

Tags: Optimization storage, PG12, Postgres12, PostgreSQL 12
Share this entry
  • Share on Facebook
  • Share on Twitter
  • Share on WhatsApp
  • Share on LinkedIn
0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Search

Get in touch with us!

Recent Posts

  • Random Data December 3, 2020
  • Webinar: COMMIT Without Fear – The Beauty of CAMO [Follow Up] November 13, 2020
  • Full-text search since PostgreSQL 8.3 November 5, 2020
  • Random numbers November 3, 2020
  • Webinar: Best Practices for Bulk Data Loading in PostgreSQL [Follow Up] November 2, 2020

Featured External Blogs

Tomas Vondra's Blog

Our Bloggers

  • Simon Riggs
  • Alvaro Herrera
  • Andrew Dunstan
  • Craig Ringer
  • Francesco Canovai
  • Gabriele Bartolini
  • Giulio Calacoci
  • Ian Barwick
  • Marco Nenciarini
  • Mark Wong
  • Pavan Deolasee
  • Petr Jelinek
  • Shaun Thomas
  • Tomas Vondra
  • Umair Shahid

PostgreSQL Cloud

2QLovesPG 2UDA 9.6 backup Barman BDR Business Continuity community conference database DBA development devops disaster recovery greenplum Hot Standby JSON JSONB logical replication monitoring OmniDB open source Orange performance PG12 pgbarman pglogical PG Phriday postgres Postgres-BDR postgres-xl PostgreSQL PostgreSQL 9.6 PostgreSQL10 PostgreSQL11 PostgreSQL 11 PostgreSQL 11 New Features postgresql repmgr Recovery replication security sql wal webinar webinars

Support & Services

24/7 Production Support

Developer Support

Remote DBA for PostgreSQL

PostgreSQL Database Monitoring

PostgreSQL Health Check

PostgreSQL Performance Tuning

Database Security Audit

Upgrade PostgreSQL

PostgreSQL Migration Assessment

Migrate from Oracle to PostgreSQL

Products

HA Postgres Clusters

Postgres-BDR®

2ndQPostgres

pglogical

repmgr

Barman

Postgres Cloud Manager

SQL Firewall

Postgres-XL

OmniDB

Postgres Installer

2UDA

Postgres Learning Center

Introducing Postgres

Blog

Webinars

Books

Videos

Training

Case Studies

Events

About Us

About 2ndQuadrant

What does 2ndQuadrant Mean?

News

Careers 

Team Profile

© 2ndQuadrant Ltd. All rights reserved. | Privacy Policy
  • Twitter
  • LinkedIn
  • Facebook
  • Youtube
  • Mail
PostgreSQL with passphrase-protected SSL keys under systemd PG Phriday: Around the World in Two Billion Transactions
Scroll to top
×