2ndQuadrant is now part of EDB

Bringing together some of the world's top PostgreSQL experts.

2ndQuadrant | PostgreSQL
Mission Critical Databases
  • Contact us
  • EN
    • FR
    • IT
    • ES
    • DE
    • PT
  • Support & Services
  • Products
  • Downloads
    • Installers
      • Postgres Installer
      • 2UDA – Unified Data Analytics
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
  • Postgres Learning Center
    • Webinars
      • Upcoming Webinars
      • Webinar Library
    • Whitepapers
      • Business Case for PostgreSQL Support
      • Security Best Practices for PostgreSQL
    • Blog
    • Training
      • Course Catalogue
    • Case Studies
      • Performance Tuning
        • BenchPrep
        • tastyworks
      • Distributed Clusters
        • ClickUp
        • European Space Agency (ESA)
        • Telefónica del Sur
        • Animal Logic
      • Database Administration
        • Agilis Systems
      • Professional Training
        • Met Office
        • London & Partners
      • Database Upgrades
        • Alfred Wegener Institute (AWI)
      • Database Migration
        • International Game Technology (IGT)
        • Healthcare Software Solutions (HSS)
        • Navionics
    • Books
      • PostgreSQL 11 Administration Cookbook
      • PostgreSQL 10 Administration Cookbook
      • PostgreSQL High Availability Cookbook – 2nd Edition
      • PostgreSQL 9 Administration Cookbook – 3rd Edition
      • PostgreSQL Server Programming Cookbook – 2nd Edition
      • PostgreSQL 9 Cookbook – Chinese Edition
    • Videos
    • Events
    • PostgreSQL
      • PostgreSQL – History
      • Who uses PostgreSQL?
      • PostgreSQL FAQ
      • PostgreSQL vs MySQL
      • The Business Case for PostgreSQL
      • Security Information
      • Documentation
  • About Us
    • About 2ndQuadrant
    • 2ndQuadrant’s Passion for PostgreSQL
    • News
    • Careers
    • Team Profile
  • Blog
  • Menu Menu
You are here: Home1 / Blog2 / 2ndQuadrant3 / PostgreSQL vs. Linux kernel versions
Tomas Vondra

PostgreSQL vs. Linux kernel versions

August 8, 2016/14 Comments/in 2ndQuadrant, Featured, PostgreSQL, Tomas' PlanetPostgreSQL /by Tomas Vondra

I’ve published multiple benchmarks comparing different PostgreSQL versions, as for example the performance archaeology talk (evaluating PostgreSQL 7.4 up to 9.4), and all those benchmark assumed fixed environment (hardware, kernel, …). Which is fine in many cases (e.g. when evaluating performance impact of a patch), but on production those things do change over time – you get hardware upgrades and from time to time you get an update with a new kernel version.

For hardware upgrades (better storage, more RAM, faster CPUs, …), the impact is usually fairly easy to predict, and moreover people generally realize they need to assess the impact by analyzing the bottlenecks on production and perhaps even testing the new hardware first.

But for what about kernel updates? Sadly we usually don’t do much benchmarking in this area. The assumption is mostly that new kernels are better than older ones (faster, more efficient, scale to more CPU cores). But is it really true? And how big is the difference? For example what if you upgrade a kernel from 3.0 to 4.7 – will that affect the performance, and if yes, will the performance improve or not?

From time to time we get reports about serious regressions with a particular kernel version, or sudden improvement between kernel versions. So clearly, kernel versions may affects performance.

I’m aware of a single PostgreSQL benchmark comparing different kernel versions, made in 2014 by Sergey Konoplev in response to recommendations to avoid 3.0 – 3.8 kernels. But that benchmark is fairly old (the last kernel version available ~18 months ago was 3.13, while nowadays we have 3.19 and 4.6), so I’ve decided to run some benchmarks with current kernels (and PostgreSQL 9.6beta1).

PostgreSQL vs. kernel versions

But first, let me discuss some significant differences between policies governing commits in the two projects. In PostgreSQL we have the concept of major and minor versions – major versions (e.g. 9.5) are released roughly once a year, and include various new features. Minor versions (e.g. 9.5.2) only include bugfixes, and are released about every three months (or more frequently, when a serious bug is discovered). So there should be no major performance or behavior changes between minor versions, which makes it fairly safe to deploy minor versions without extensive testing.

With kernel versions, the situation is much less clear. Linux kernel also has branches (e.g. 2.6, 3.0 or 4.7), those are by no means equal to “major versions” from PostgreSQL, as they continue to receive new features and not just bugfixes. I’m not claiming that the PostgreSQL versioning policy is somehow automatically superior, but the consequence is that updating between minor kernel versions may easily significantly affect performance or even introduce bugs (e.g. 3.18.37 suffers by OOM issues due to a such non-bugfix commit).

Of course, distributions realize these risks, and often lock the kernel version and do further testing to weed out new bugs. This post however uses vanilla longterm kernels, as available on www.kernel.org.

Benchmark

There are many benchmarks we might use – this post presents a suite of pgbench tests, i.e. a fairly simple OLTP (TPC-B-like) benchmark. I plan to do additional tests with other benchmark types (particularly DWH/DSS-oriented), and I’ll present them on this blog in the future.

Now, back to the pgbench – when I say “collection of tests” I mean combinations of

  • read-only vs. read-write
  • data set size – active set does (not) fit into shared buffers / RAM
  • client count – single client vs. many clients (locking/scheduling)

The values obviously depend on the hardware used, so let’s see what hardware this round of benchmarks was running on:

  • CPU: Intel i5-2500k @ 3.3 GHz (3.7 GHz turbo)
  • RAM: 8GB (DDR3 @ 1333 MHz)
  • storage: 6x Intel SSD DC S3700 in RAID-10 (Linux sw raid)
  • filesystem: ext4 with default I/O scheduler (cfq)

So it’s the same machine I’ve used for a number of previous benchmarks – a fairly small machine, not exactly the newest CPU etc. but I believe it’s still a reasonable “small” system.

The benchmark parameters are:

  • data set scales: 30, 300 and 1500 (so roughly 450MB, 4.5GB and 22.5GB)
  • client counts: 1, 4, 16 (the machine has 4 cores)

For each combination there were 3 read-only runs (15-minute each) and 3 read-write runs (30-minute each). The actual script driving the benchmark is available here (along with results and other useful data).

Note: If you have significantly different hardware (e.g. rotational drives), you may see very different results. If you have a system that you’d like to test, let me know and I’ll help you with that (assuming I’ll be allowed to publish the results).

Kernel versions

Regarding kernel versions, I’ve tested the latest versions in all longterm branches since 2.6.x (2.6.39, 3.0.101, 3.2.81, 3.4.112, 3.10.102, 3.12.61, 3.14.73, 3.16.36, 3.18.38, 4.1.29, 4.4.16, 4.6.5 and 4.7). There’s still a lot of systems running on 2.6.x kernels, so it’s useful to know how much performance you might gain (or lose) by upgrading to a newer kernel. But I’ve been compiling all the kernels on my own (i.e. using vanilla kernels, no distribution-specific patches), and the config files are in the git repository.

Results

As usual, all the data is available on bitbucket, including

  • kernel .config file
  • benchmark script (run-pgbench.sh)
  • PostgreSQL config (with some basic tuning for the hardware)
  • PostgreSQL logs
  • various system logs (dmesg, sysctl, mount, …)

The following charts show the average tps for each benchmarked case – the results for the three runs are fairly consistent, with ~2% difference between min and max in most cases.

read-only

For the smallest data set, there’s a clear performance drop between 3.4 and 3.10 for all client counts. The results for 16 clients (4x the number of cores) however more than recovers in 3.12.

pgbench-read-only-small

For the medium data set (fits into RAM but not into shared buffers), we can see the same drop between 3.4 and 3.10 but not the recovery in 3.12.

pgbench-read-only-medium

For large data sets (exceeding RAM, so heavily I/O-bound), the results are very different – I’m not sure what happened between 3.10 and 3.12, but the performance improvement (particularly for higher client counts) is quite astonishing.

pgbench-read-only-large

read-write

For the read-write workload, the results are fairly similar. For the small and medium data sets we can observe the same ~10% drop between 3.4 and 3.10, but sadly no recovery in 3.12.

pgbench-read-write-small

pgbench-read-write-medium

For the large data set (again, significantly I/O bound) we can see similar improvement in 3.12 (not as significant as for the read-only workload, but still significant):

pgbench-read-write-large

Summary

I don’t dare to draw conclusions from a single benchmark on a single machine, but I think it’s safe to say:

  • The overall performance is fairly stable, but we can see some significant performance changes (in both directions).
  • With data sets that fit into memory (either into shared_buffers or at least into RAM) we see a measurable performance drop between 3.4 and 3.10. On read-only test this partially recovers in 3.12 (but only for many clients).
  • With data sets exceeding memory, and thus primarily I/O-bound, we don’t see any such performance drops but instead a significant improvement in 3.12.

As for the reasons why those sudden changes happen, I’m not quite sure. There are many possibly-relevant commits between the versions, but I’m not sure how to identify the correct one without extensive (and time consuming) testing. If you have other ideas (e.g. are aware of such commits), let me know.

Tags: kernel, Linux, performance, pgbench, postgres, read-only, read-write, ssd, version
Share this entry
  • Share on Facebook
  • Share on Twitter
  • Share on WhatsApp
  • Share on LinkedIn
14 replies
  1. vdp
    vdp says:
    August 8, 2016 at 4:41 pm

    s/PostgreSQL kernel also has branches/The Linux kernel also has branches/

    Reply
    • Tomas Vondra
      Tomas Vondra says:
      August 8, 2016 at 7:21 pm

      Thanks, fixed.

      Reply
  2. jstultz
    jstultz says:
    August 9, 2016 at 7:34 am

    Git bisection will narrow down a regression between any two commits fairly fast.

    Reply
    • Tomas Vondra
      Tomas Vondra says:
      August 9, 2016 at 2:05 pm

      Well, I’ve never bisected kernel, but I’d expect that to be a bit more complicated thanks to the reboots. Or is there a better way?

      Reply
  3. TheBB
    TheBB says:
    August 9, 2016 at 8:15 am

    Seems like both the large and medium dataset charts are titled “Medium”, in both parts.

    Reply
    • Tomas Vondra
      Tomas Vondra says:
      August 9, 2016 at 2:04 pm

      D’oh! Fixed, but it’ll take a few minutes until the images expire from caches/CDN.

      Reply
  4. Sergey
    Sergey says:
    August 9, 2016 at 10:22 am

    What glibc version do you use?

    Reply
    • Tomas Vondra
      Tomas Vondra says:
      August 9, 2016 at 2:00 pm

      sys-libs/glibc-2.22-r4:2.2::gentoo

      Reply
  5. mbanck
    mbanck says:
    August 9, 2016 at 12:39 pm

    Even more helpful would be also including vendor kernels, like the SLES10, SLES11, RHEL6, RHEL7 kernels, which are heavily patched, but based off older Versions.

    I guess that is difficult to automate/getting your hands on though?

    Reply
    • Tomas Vondra
      Tomas Vondra says:
      August 9, 2016 at 1:59 pm

      I agree. Sadly I don’t have enough machines to do all the tests I’d like :-/

      But if you have a suitable machine, running the benchmarks is fairly simple (a single shell script will do the trick).

      Reply
  6. Ivory Lockhart
    Ivory Lockhart says:
    August 24, 2016 at 8:12 am

    That last two kernels huge reads behaviour difference looks pretty interesting, doesn’t it? But that’s still not it.

    Reply
    • Tomas Vondra
      Tomas Vondra says:
      August 24, 2016 at 9:27 am

      I don’t follow. Where do you see huge difference in read behavior for the last two kernels? I don’t see anything like that on any of the charts.

      Reply
  7. Laurence "GreenReaper" Parry
    Laurence "GreenReaper" Parry says:
    August 30, 2016 at 9:07 am

    I guess this post was triggered by the mention of changes due to PostgreSQL in 4.7…
    https://kernelnewbies.org/Linux_4.7#head-f53f61733cb5cd6ca19e85721ee0fac195b40d61

    Perhaps this commit in Linux 3.11 is relevant to the speedup?
    http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d3922a777f9b4c4df898d326fa940f239af4f9b6

    It is mentioned that the ext4 extent cache shrink mechanism, “introduced in kernel 3.9”, had a scalability problem:
    https://kernelnewbies.org/Linux_3.11#head-c2b933384f7efdc5b1a26936a92ad8ff2c4413e7

    Extent tree leaf block caching in 3.12 strikes me as another possibility:
    http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=107a7bd31ac003e42c0f966aa8e5b26947de6024

    But as you say, hard to tell for sure; it could easily be something in memory management or the CPU scheduling.

    Reply
    • Tomas Vondra
      Tomas Vondra says:
      September 1, 2016 at 10:32 pm

      Thanks for the ideas which patches might be causing the speedup between 3.10 and 3.12! Will look into that.

      BTW no, the benchmarking was not triggered by the 4.7 changes (I wasn’t aware of them back in May when starting with this, as there was no 4.7 kernel back then).

      Reply

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Search

Get in touch with us!

Recent Posts

  • Random Data December 3, 2020
  • Webinar: COMMIT Without Fear – The Beauty of CAMO [Follow Up] November 13, 2020
  • Full-text search since PostgreSQL 8.3 November 5, 2020
  • Random numbers November 3, 2020
  • Webinar: Best Practices for Bulk Data Loading in PostgreSQL [Follow Up] November 2, 2020

Featured External Blogs

Tomas Vondra's Blog

Our Bloggers

  • Simon Riggs
  • Alvaro Herrera
  • Andrew Dunstan
  • Craig Ringer
  • Francesco Canovai
  • Gabriele Bartolini
  • Giulio Calacoci
  • Ian Barwick
  • Marco Nenciarini
  • Mark Wong
  • Pavan Deolasee
  • Petr Jelinek
  • Shaun Thomas
  • Tomas Vondra
  • Umair Shahid

PostgreSQL Cloud

2QLovesPG 2UDA 9.6 backup Barman BDR Business Continuity community conference database DBA development devops disaster recovery greenplum Hot Standby JSON JSONB logical replication monitoring OmniDB open source Orange performance PG12 pgbarman pglogical PG Phriday postgres Postgres-BDR postgres-xl PostgreSQL PostgreSQL 9.6 PostgreSQL10 PostgreSQL11 PostgreSQL 11 PostgreSQL 11 New Features postgresql repmgr Recovery replication security sql wal webinar webinars

Support & Services

24/7 Production Support

Developer Support

Remote DBA for PostgreSQL

PostgreSQL Database Monitoring

PostgreSQL Health Check

PostgreSQL Performance Tuning

Database Security Audit

Upgrade PostgreSQL

PostgreSQL Migration Assessment

Migrate from Oracle to PostgreSQL

Products

HA Postgres Clusters

Postgres-BDR®

2ndQPostgres

pglogical

repmgr

Barman

Postgres Cloud Manager

SQL Firewall

Postgres-XL

OmniDB

Postgres Installer

2UDA

Postgres Learning Center

Introducing Postgres

Blog

Webinars

Books

Videos

Training

Case Studies

Events

About Us

About 2ndQuadrant

What does 2ndQuadrant Mean?

News

Careers 

Team Profile

© 2ndQuadrant Ltd. All rights reserved. | Privacy Policy
  • Twitter
  • LinkedIn
  • Facebook
  • Youtube
  • Mail
PostgreSQL Solutions Roadmap BDR 1.0
Scroll to top
×