Studying Stored Procs in Postgres 11

With Postgres 11 looming on the near horizon, it’s only appropriate to check out a recent beta and kick the tires a few times. Whether it’s improvements in parallelism, partitions, stored procedures, JIT functionality, or any number of elements in the release page, there’s a lot to investigate.

It just so happens that I ran across a fortuitous event on Twitter when deciding on an appropriate topic. Behold!

Wasting XIDs has never been this fun

Wait! No! That’s not what stored procedures are for!

I felt so good like anything was possible

When confronted with such a blatant corruption of such a cool new feature, it’s only natural to question the wisdom of doing so. It is, after all, not a great idea to programatically consume transaction IDs. I said as much and moved on with life, certain the worst was over.

Then this happened.

Magnus helpfully recommends proceeding

Now, Magnus is infamous for two things: his technical acumen, and giddy malevolence. His advocacy of a stored procedure named “waste_xid” only proved nobody anywhere should ever run this anywhere, lest they immolate whatever system hosted the database instance.

But hey, VMs are cheap; let’s break things. How else can we learn the limits of our new toys, but by virtually atomizing them?

I hit cruise control and rubbed my eyes

Before we saddle our poor innocent Postgres 11 installation with an inadvisable stored procedure designed specifically to underhandedly reap its transaction lifespan, we should probably make the routine as evil as possible.

One thing stands out immediately: calling EXECUTE is unnecessary overhead. According to the information function documentation, txid_current will assign a new transaction ID if there isn’t one already. Since the stored procedure is constantly committing, that’s extremely handy. And since this is Pl/pgSQL, we can use direct assignment instead.

Our new procedure looks something like this:

CREATE OR REPLACE PROCEDURE waste_xid(cnt int)
AS $$
DECLARE
    i INT;
    x BIGINT;
BEGIN
    FOR i in 1..cnt LOOP
        x := txid_current();
        COMMIT;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

Great! We can now waste XIDs about 4x faster than before! On this particular test VM, the maximum amount of wasted XIDs per second was about 125k. At that rate, we could blow through a billion in roughly two hours.

Since quite a few installations use the default autovacuum settings, that means we could trigger a lot of vacuums at the 200M transaction mark in about half an hour. We didn’t need that disk throughput anyway.

Workin’ on a mystery, goin’ wherever it leads

Regardless, what happens if we run it?

CALL waste_xid(1000000);

CALL

Well, that was anticlimactic. How about the logs? We set our min_log_duration_statement to 100ms for debugging purposes, so we should definitely see a long call there.

2018-10-03 20:53:02.505 UTC [11334] postgres@test LOG:  duration: 7949.540 ms  statement: call waste_xid(1000000);

Here’s something that’s actually more interesting than it might appear at first glance. Stored procedures, it would seem, are atomic. We tested explicitly by adding a long pg_sleep call in the loop, and none of those entries were logged.

Since procedures can COMMIT transactions and thus could contain semantically unrelated unrelated activity to the originating transaction, can a procedure essentially derail an existing transaction? Let’s see:

BEGIN;

CALL waste_xid(1000000);

ERROR:  invalid transaction termination
CONTEXT:  PL/pgSQL function waste_xid(integer) line 8 at COMMIT

Nope! Apparently CALL is only valid within its own context; no transactions for you. This could introduce interesting behavior in application layers that implicitly start a transaction before every command. The resulting error is a bit ambiguous however; why is the transaction termination invalid? Can’t it just say we can’t close the current transaction from within a procedure?

So stored procedures are atomic, and can do whatever they want in their own transactional context. What else are they hiding?

I put the pedal down to make some time

It seems there are special rules to how transactions are handled in stored procedures. While the discussion was hot, a couple more functional variants popped up. Here’s an interesting one:

CREATE OR REPLACE PROCEDURE check_now()
AS $$
DECLARE
    i int;
BEGIN
    FOR i in 1..5 LOOP
        RAISE NOTICE 'It is now: %', now();
        PERFORM txid_current();
        COMMIT;
        PERFORM pg_sleep(0.1);
    END LOOP;
END;
$$
LANGUAGE plpgsql;

What are we testing for here? Well, one fundamental attribute of the now() function and several other related DATETIME routines, is that they are tied to the current transaction. So in theory, the value being returned should change.

Let’s try it:

CALL check_now();

NOTICE:  It is now: 2018-10-03 21:27:07.764822+00
NOTICE:  It is now: 2018-10-03 21:27:07.764822+00
NOTICE:  It is now: 2018-10-03 21:27:07.764822+00
NOTICE:  It is now: 2018-10-03 21:27:07.764822+00
NOTICE:  It is now: 2018-10-03 21:27:07.764822+00

Uh oh. What does this actually mean? Is now() being tethered to the initial CALL instead of the current transaction? Or are PL/pgSQL transactions not real transactions? If they do operate in a different context than the outer transaction, how else does that context diverge? Or is this a bug?

Postgres 11 is still in beta, after all.

It was always cold, no sunshine

Upon adopting the PROCEDURE nomenclature, we’re also presented with a rather unique historical quirk. Consider the previous syntax for creating a trigger. More than a few triggers have been declared this way:

CREATE TRIGGER t_my_trigger
BEFORE INSERT OR UPDATE ON mytable
   FOR EACH ROW
       EXECUTE PROCEDURE my_function();

Ever prescient, the devs saw this coming and added a disclaimer to the Postgres 11 docs:

In the syntax of CREATE TRIGGER, the keywords FUNCTION and PROCEDURE are equivalent, but the referenced function must in any case be a function, not a procedure. The use of the keyword PROCEDURE here is historical and deprecated.

So in case you’d mentally linked the historical—and up until Postgres 11, current—syntax with the new stored procedure functionality, don’t. Procedures really did change more of the Postgres internals than a cursory examination could reveal.

There’s something good waitin’ down this road

Now that we’ve kicked the proverbial tires and explored a few of the neat quirks stored procedures offer, what does it all mean?

First and foremost, the ability to COMMIT or ROLLBACK within a function call allows both safeguarding activity batches or particularly critical data. No longer will one function call represent an all-or-nothing proposition. That alone is a massive enhancement to Postgres, one which users of other database engines such as Oracle have been clamoring for over the last 20 years.

More subtly, it presents a new security concern that administrators should consider. The ability to manipulate transactions is a powerful lever ripe for abuse. At minimum, DBAs should consider revoking USAGE on any languages from users that shouldn’t be granted that capability.

REVOKE USAGE ON LANGUAGE plpgsql FROM PUBLIC;

Users will still be able to invoke previously defined functions or procedures, but won’t be able to create new ones. In fact, this should probably be done on all current systems as a matter of course.

Judiciously wielded, Postgres can continue runnin’ down the dream of becoming the best and most feature-laden database. We just need to pay attention to where we’re going.

Bruce Momjian says:

October 12, 2018 at 8:13 pm

Uh, Sean, did you see the thread about advancing transaction_timestamp during commits within procedures?

https://www.postgresql.org/message-id/[email protected]

The behavior will change in PG 11 final.

Shaun Thomas says:
October 19, 2018 at 3:06 pm

Doh!

Thanks Bruce. 🙂

Pius says:

October 17, 2018 at 7:38 am

Thanks Shaun, that’s a very useful analysis of the not so salient aspects of the new procedure feature.

Michael says:

October 18, 2018 at 1:08 am

You crack me up! Always funny, but informative, stuff from you. You go DUDE!

Jorge says:

November 5, 2018 at 9:12 pm

First of all thanks for this post. It’s the best on this topic I’ve read so far. I have a question for you: is there a way to execute a procedure with transaction control using PSQL? According to the documentation, even with AUTOCOMMIT set to off, PSQL issues an implicit BEGIN just before any command that is not already in a transaction block and is not itself a BEGIN or other transaction-control command, nor a command that cannot be executed inside a transaction block such as VACUUM (unfortunately CALL is not treated in the same way as VACCUM). And, as you explained, the invalid transaction termination error happens because it is not possible to close the current transaction (initiated by PSQL) from within the procedure.

craig.ringer says:
November 13, 2018 at 2:21 am

In PostgreSQL 10 and older, all “procedures” are functions, and run in implicit transaction blocks.

In PostgreSQL 11, true procedures were added. They can do transaction management. See https://blog.2ndquadrant.com/postgresql-11-server-side-procedures-part-2/

2ndQuadrant is now part of EDB

PG Phriday: Studying Stored Procedures in Postgres 11

Studying Stored Procs in Postgres 11

I felt so good like anything was possible

I hit cruise control and rubbed my eyes

Workin’ on a mystery, goin’ wherever it leads

I put the pedal down to make some time

It was always cold, no sunshine

There’s something good waitin’ down this road

Leave a Reply

Leave a Reply Cancel reply

Support & Services

Products

Postgres Learning Center

About Us