Testing new PostgreSQL versions without messing up your existing install
People are often hesitant to test out a new PostgreSQL release because they’re concerned it’ll break their current working installation.
This is a perfectly valid concern, but it’s easily resolved with a few simple protective measures:
- Build PostgreSQL from source as an unprivileged user
- Install your PostgreSQL build within that user’s home directory
- Run PostgreSQL as that user, not postgres
- Run on a non-default port by setting the PGPORT env var
If you take these steps the install its self cannot interfere at all. Starting and running the new PostgreSQL can only interfere by using too many resources (shared memory, file descriptors, RAM, CPU, etc) and you can just stop the new version if it causes any issues. The shared memory improvements in 9.3 make the shared memory issues largely go away, too.
If performance impact is a concern, set a ulimit to stop the new version using too many resources and run it nice‘d and ionice‘d, but I’m not going to get into that in this article, that’s a whole separate topic.
Let’s say you want to test a patch, either for your own use or when you’re doing review work for a commitfest. You’re testing on Linux, OS X, or BSD; Windows isn’t covered here. You’re comfortable with the command line but you might not have much or any prior development/compiling experience. You’re familiar with common PostgreSQL tools like psql, pg_dump and pg_restore.
Here’s how to get started:
1. Install git, gcc, and other tools.
You’ll need a compiler and some libraries to compile PostgreSQL. This is the only step that affects the software running on the rest of your system, and it only installs a few extra tools. It’s possible that it might update existing libraries, but only to the same versions an aptitude upgrade or yum update would, i.e. security and bug-fix releases.
On Debian/Ubuntu, uncomment any deb-src lines in your sources.list then:
sudo aptitude update sudo aptitude install git build-essential gdb linux-tools patch sudo aptitude build-dep postgresql
Fedora / Red Hat
On Fedora / Red Hat:
sudo yum install git yum-utils gdb perf patch sudo yum groupinstall "Development Tools" sudo yum-builddep postgresql
The last step might fail with an error about missing sources; it seems to be common for Red Hat based distros not to configure any sources in yum, even disabled by default. If that’s the case for you, download the rpm spec file from the PGDG repository for your distro and version. The Pg version doesn’t matter much, since you’re only using it to get dependencies. For example, I used the spec file from http://svn.pgrpms.org/browser/rpm/redhat/9.2/postgresql/F-19/postgresql-9.2.spec even though I’ll be building git master (9.4). Once it’s downloaded, tell yum-builddep to install the requirements:
sudo yum-builddep /path/to/postgresql-9.2.spec
Mac OS X users will need to install XCode. To install the required libraries for PostgreSQL you should use a ports-like tool like MacPorts or homebrew.
Comments from Mac users who can provide more specific instructions would be welcomed. Many of the guides and tutorials on the ‘net appear to be outdated or assume you want to use the PostgreSQL sources provided by MacPorts, Homebrew, etc, rather than your own.
Use the ports system to ensure you have gcc and the other dependencies for PostgreSQL. If in doubt look at the portfile for PostgreSQL to see what you need.
Comments from BSD users who can provide more specific instructions would be welcomed.
Compiling PostgreSQL on Windows is totally different. I wrote about that separately but it’s not for the faint of heart. This guide does not apply to Windows, it will only confuse you.
Get the PostgreSQL sources
From this point on you should be working with your normal user account or a separate account created for the purpose. Do not use sudo or run as root.
I recommend grabbing PostgreSQL from git rather than getting a source tarball. It’ll save you time and hassle down the track.
Get the main PostgreSQL git repo first:
git clone git://git.postgresql.org/git/postgresql.git
This will take a while, but you’ll only need to do it once. Future updates can be done with a simple “git pull”.
Now check out the PostgreSQL release branch you want to work with. These follow a strict naming scheme, where REL9_1_STABLE is the 9.1.x branch, REL9_2_STABLE is the 9.2.x branch, etc. Individual releases are tags like REL9_2_1. The current development tip is the default branch, called master. If you’re testing a patch the email the patch came with generally includes information about what revision it applies on top of.
Sometimes you’ll want to test git master or the tip of a stable branch, say when you’re testing out a bug fix. In this case you can skip the next bit and move straight on to compiling and installing.
Get the patch you want to test
Changes to PostgreSQL are typically distributed as patches on the mailing list. Sometimes you’ll find that there’s a git branch published for the patch, but I’ll assume that there isn’t, or that you don’t want to learn git for the purpose. (If you do, start with the git book).
You’ll generally find the patch as an attachment to a mailing list post, or included in-line. Let’s pretend you want to apply this trivial patch, which should apply cleanly to most releases.
That patch is an attachment so save it somewhere, say $HOME/Downloads/parse_bool_with_len.patch. (If a patch is in-line in an email you need to copy and paste it into a text file instead).
Apply the patch to the PostgreSQL sources
Now I want to apply the patch to REL9_2_STABLE. To do that I cd into the postgresql git working tree from the previous step and run:
git checkout REL9_2_STABLE # Update to the latest content of the branch git pull # and apply the patch patch -p1 < ~/Downloads/parse_bool_with_len.patch
The patch should apply cleanly:
$ patch -p1 < ~/Downloads/parse_bool_with_len.patch patching file src/backend/utils/adt/bool.c $
If it reports an error, it’s possible you’ll need to add -l (ignore whitespace) or if it’s a patch created bit git format-patch try using git am -3 to apply it.. I won’t go into resolving conflicts further here, it’s a whole separate topic, we’ll just presume the patch applies cleanly like it should.
If patch reports:
Reversed (or previously applied) patch detected! Assume -R? [n]
then it’s quite likely the patch has been applied to this release already.
Your sources are patched, so you’re ready to compile. I’m going to assume you’re going to install to $HOME/postgresql-test . (In practice I tend to use $HOME/pg/92-parse_bool_with_len or similar, i.e. name the install dir after the patch).
You’re already cd‘d to the postgresql source dir, so:
./configure --prefix=$HOME/postgresql-test make clean make
For information about other options to configure see configure --help and the PostgreSQL documentation.
If you want to run the regression tests you can do so at this point.
PGPORT=5444 make check
Now install the PostgreSQL build to your home dir. Do not use sudo, it is not required:
Congratulations, you compiled and installed PostgreSQL.
You’ll usually want to create a new blank database cluster and start PostgreSQL on it. This is quite trivial:
# Choose an unused port on your machine export PGPORT=5444 export PATH=$HOME/postgresql-test/bin:$PATH initdb -D $HOME/postgresql-test-data pg_ctl -D $HOME/postgresql-test-data -l $HOME/postgresql-test-data.log start
(See the initdb and pg_ctl documentation for details on command line options, etc. The -l flag tells pg_ctl to save the PostgreSQL logs to $HOME/postgresql-test-data.log.)
You can connect to the new server with psql. For other tools you’ll have to specify the port 5444. You’ll notice something odd, though: connecting without specifying a username or database fails with:
$ psql psql: FATAL: database "myusername" does not exist
That’s because by default initdb sets the superuser name to the user you run initdb as, instead of postgres. You can either run initdb with the -U flag to specify a different superuser name, or just explicitly connect to the default postgres database by name:
$ psql postgres psql (9.4devel, server 9.4devel) postgres#
I recommend using a different superuser name to what you have on your live data. It’ll help stop those embarrassing “oops, wrong server” mistakes.
Copying an existing PostgreSQL cluter
Sometimes – like when you’re testing a bugfix or minor point release – you want to copy an existing PostgreSQL database.
The easiest way to do that is pg_dump and pg_restore. This is just like any other dump and restore except that you specify the port of the new server to pg_restore when restoring the database.
Alternately, you can use pg_basebackup with --xlog-method=stream to copy the on-disk format of the existing cluster while it’s running, then use pg_ctl to start the new binaries against the copy. This won’t affect the original data at all. It only works if you’re running the same major release, eg your main DB is on 9.2.1 and you’re testing REL9_2_STABLE. It’s useful to do this when testing bug fixes.
Making sure you don’t connect to the wrong server
You’ve installed a custom built and possibly patched PostgreSQL in your home directory. From here, you can mess with it all you like, safe in the knowledge you can’t break anything important so long as you make sure you connect to the correct database server.
Be careful to use the correct port and check what you’ve connected to or – better – make sure you create a different user on the development copy that doesn’t exist on production so you can’t get the two muddled.
I like to create a separate user account on my laptop that I log into with sudo -i pgdev. This account has a .bash_profile with a PGPORT set and a different shell prompt to indicate I’m in dev mode, a .psqlrc that sets a different psql prompt, etc.
The best isolation is of course a virtual machine running on a different host. With things like EC2 that’s quick and easy these days, so it’s well worth considering especially for short lived test machines where the rather ordinary price/performance ratio isn’t much of a concern.
Where to go from here
Now that you’ve compiled, patched and installed your own copy of PostgreSQL you can get onto the bug testing, benchmarking, patch review, or whatever else bought you here in the first place.
Once you’re done with that, consider taking a look at some of the following materials:
- The PostgreSQL documentation should be your first reference for any problems.
- Reviewing a patch
- The developer pages on postgresql.org
- The Developer FAQ
- The mailing lists, particularly pgsql-hackers
- The Commitfest app
- The git book. Git is a core tool for PostgreSQL development now, and it’s very useful to become familiar with it.
Have fun. It’s not as hard as it looks.
Nice reading for apprentice hackers. make installcheck runs regression tests on a given installation, and you can install contrib modules as well by doing “cd contrib/; make install”.
Thank you so much for the through guide – I do appreciate it.
I am used to setting the default port in the configure line, using –with-pgport=SOMENUMBER
This embeds that port number in the server and all client utilities, so that you don’t have to remember to set the PGPORT environment variable all the time. It’s pretty handy for testing even multiple builds simultaneously … and you don’t have to remember the number, just ensure it’s unique among your builds.
peg does all of the checkout/build steps here: https://github.com/gregs1104/peg and it works on Linux and Mac OS X.
Maybe you don’t know about PGVM .
It’s a nice tool do manage multiple versions of PostgreSQL that can be used for testing patchs too.
I wasn’t familiar with that, and it’s a rather handy looking tool.
To what degree does it support working with feature branches and patches?
Installing the flex package needs to be added for the Debian steps at least, otherwise make complains about not being able to build bootscanner.c.
Thanks, good catch.
It’s probably just as sensible to use a Docker container or something these days.