
Tuning Linux for low PostgreSQL latency

January 28, 2011 / 2 Comments / in Greg's PlanetPostgreSQL, United States News / by 2ndQuadrant Press

One of the ugly parts of Linux with PostgreSQL is that the OS will happily cache up to around 5% of memory before getting aggressive about writing it out. I've just updated a long list of pgbench runs showing how badly that can turn out, even on a server with a modest 16GB of RAM. Note that I am intentionally trying to introduce the bad situation here, so this is not typical performance. The workload that pgbench generates is not representative of any real-world workload; it's as write-intensive as it's possible to be.
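
To put numbers on that for a particular server, the writeback thresholds can be read straight out of /proc. The following sketch is my own illustration rather than part of the benchmark kit; it reports the background and foreground dirty-page limits, using whichever of the ratio or byte-based settings the kernel is actually honoring, along with how much dirty data is sitting in the page cache right now:

#!/usr/bin/env python3
"""Report the kernel's dirty-page writeback limits and current dirty memory.

Read-only sketch for Linux; it only looks at /proc, so no root is required.
"""

def read_int(path):
    with open(path) as f:
        return int(f.read().strip())

def meminfo_kb(field):
    # /proc/meminfo lines look like "MemTotal:       16318516 kB"
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)

mem_total = meminfo_kb("MemTotal") * 1024    # bytes of RAM
dirty_now = meminfo_kb("Dirty") * 1024       # bytes currently dirty in the page cache

for name in ("dirty_background", "dirty"):
    by = read_int(f"/proc/sys/vm/{name}_bytes")
    ratio = read_int(f"/proc/sys/vm/{name}_ratio")
    if by:   # the *_bytes setting overrides *_ratio when it is non-zero
        limit, source = by, "set in bytes"
    else:
        limit, source = mem_total * ratio // 100, f"{ratio}% of RAM"
    print(f"vm.{name}: {limit / 2**20:.0f} MiB ({source})")

print(f"currently dirty: {dirty_now / 2**20:.0f} MiB")

On a 16GB server still using the ratio-based defaults, even a small percentage works out to hundreds of megabytes of dirty data the kernel is allowed to sit on.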

Check out test set 5, which is running a stock development version of PostgreSQL 9.1. Some of the pauses where the database is unresponsive during checkpoints, as shown by the max_latency figure there (which is in milliseconds), regularly exceed 40 seconds. And at high client counts you can see the database completely stalled for more than 80 seconds.

It used to be possible to improve this by tuning Linux's dirty_ratio and dirty_background_ratio parameters. Nowadays system RAM is so large that even the minimum settings possible there cache way too much. A better interface was introduced in kernel 2.6.29: dirty_bytes and dirty_background_bytes let you set these limits in bytes instead, which allows much smaller values.

You can see what happens when you tune those down by looking at test set #7. (Note that this also includes a work-in-progress PostgreSQL patch to fix some other bad behavior in this area, introduced in set #6.) I set the new dirty_* parameters to 64MB/128MB, trying to stay below the 256MB battery-backed cache in the RAID card. The parts that improved are quite obvious. Maximum latency on the smaller tests now drops to <10 seconds, and the worst one shown is just over 20 seconds. Both of these are about 1/4 of the worst-case latency shown before I tweaked these parameters.
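
For reference, the change itself is just two writes under /proc/sys/vm, or the matching vm.* lines in /etc/sysctl.conf if you want it to survive a reboot. The sketch below is my own; it assumes the 64MB figure went to dirty_background_bytes and 128MB to dirty_bytes, which matches the usual background/foreground relationship but isn't spelled out above, and it has to run as root:

#!/usr/bin/env python3
"""Apply byte-based dirty writeback limits like the ones used for test set #7.

Assumes 64MB background / 128MB foreground; must run as root. Writing the
*_bytes files makes the kernel ignore the corresponding *_ratio settings.
"""

SETTINGS = {
    "dirty_background_bytes": 64 * 1024 * 1024,   # background writeback starts here
    "dirty_bytes": 128 * 1024 * 1024,             # hard limit where writers stall
}

for name, value in SETTINGS.items():
    path = f"/proc/sys/vm/{name}"
    with open(path, "w") as f:
        f.write(str(value))
    print(f"{path} = {value}")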

There is a significant drop in transactions/second, too, but not necessarily an unacceptable one. Across the whole set the averages were scale=500, tps=690 and scale=1000, tps=349 before; now they are scale=500, tps=611 and scale=1000, tps=301. Batching work up into larger chunks always gives higher transaction rates at the cost of worse latency. That would be a completely reasonable trade-off in a lot of situations: 1/4 the worst-case latency for a 10-15% drop in TPS.
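
Working those averages through confirms the 10-15% figure:

# Throughput cost of the smaller dirty_* limits, from the averages quoted above
before = {500: 690, 1000: 349}   # scale -> average TPS with the default caching
after  = {500: 611, 1000: 301}   # scale -> average TPS with dirty_* at 64MB/128MB

for scale in before:
    drop = 100 * (before[scale] - after[scale]) / before[scale]
    print(f"scale={scale}: {drop:.1f}% drop in TPS")
# scale=500: 11.4% drop in TPS
# scale=1000: 13.8% drop in TPS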

Unfortunately I discovered a bigger downside here, too. In between each pgbench test, some basic table cleanup is done to try and make the tests more repeatable; part of that is a VACUUM of the database. I noticed that this test series took far longer to finish than any previous one. The figure worth comparing is the minimum time seen to clean the database up. That varies a bit from test to test, but at a given database size the fastest such cleanup is a pretty stable number across a larger set of runs.
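
If you want to watch that cleanup cost on your own system, timing the VACUUM step is straightforward. This sketch assumes the psycopg2 driver and a database named pgbench, neither of which is a detail of the harness described here; VACUUM cannot run inside a transaction block, hence the autocommit connection:

#!/usr/bin/env python3
"""Time the between-test cleanup step: a VACUUM of the pgbench database.

The psycopg2 driver and the 'pgbench' database name are assumptions for
illustration, not details taken from the benchmark harness above.
"""
import time

import psycopg2

conn = psycopg2.connect(dbname="pgbench")
conn.autocommit = True               # VACUUM refuses to run inside a transaction block

start = time.time()
with conn.cursor() as cur:
    cur.execute("VACUUM")
print(f"cleanup took {time.time() - start:.1f} seconds")

conn.close()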

Here are those figures from the last 3 test sets:


 set | scale |  min_setup_time  
-----+-------+------------------
   5 |   500 | 00:03:41.220735
   5 |  1000 | 00:06:47.818328
   6 |   500 | 00:03:33.139611
   6 |  1000 | 00:06:41.154474
   7 |   500 | 00:06:06.56808
   7 |  1000 | 00:10:14.010876

You can see that test sets 5 and 6 both took around 3.5 minutes to clean up a scale=500 database, and 6.75 minutes for scale=1000. Dialing down the amount of cached memory Linux was keeping around increased those times to about 6 minutes and 10 minutes instead. That's 71% and 48% longer, respectively, than those operations took with a large amount of write caching managed by the kernel.

Now that is a much harder price to pay for decreased latency.  Looks like some of the server-side work planned to try and reduce these checkpoint spikes is still completely relevant even on a kernel that has these knobs available.

Tags: PostgreSQL
2 replies
  1. xaprb says:
     January 29, 2011 at 11:28 am

    Great job benchmarking and proving that the checkpoint spike problem isn’t solved. I am proposing a Checkpoint Blues birds-of-a-feather session at the MySQL conference, and I hope we can learn about how a lot of different transactional systems do checkpoints, and maybe get some ideas from them.

  2. Pallab says:
     December 20, 2015 at 9:53 pm

    Awesome! Hope this trend of performance improvements continues! The problem with Linux is that it's trying to be all things to all people, on dozens of architectures, from phones to desktops to servers to mainframes to TOP500 supercomputers. By focusing on mid-range x64 servers, DragonFly can achieve better performance than Linux in that segment, and then suddenly its usage numbers will skyrocket! Perhaps additional performance benefits can be attained by standardizing a server stack of copyfree (permissively licensed, like BSD) components that are tuned to work together: DragonFly/HAMMER2, LLVM/Clang, PostgreSQL, nginx, etc.

