PostgreSQL 10 identity columns explained
For PostgreSQL 10, I have worked on a feature called “identity columns”. Depesz already wrote a blog post about it and showed that it works pretty much like serial columns:
CREATE TABLE test_old (
id serial PRIMARY KEY,
payload text
);
INSERT INTO test_old (payload) VALUES ('a'), ('b'), ('c') RETURNING *;
and
CREATE TABLE test_new (
id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
payload text
);
INSERT INTO test_new (payload) VALUES ('a'), ('b'), ('c') RETURNING *;
do pretty much the same thing, except that the new way is more verbose. 😉
So why bother?
Compatibility
The new syntax conforms to the SQL standard. Creating auto-incrementing columns has been a notorious area of incompatibility between different SQL implementations. Some have lately been adopting the standard SQL syntax, however. So now you can move code around between, for example, PostgreSQL, DB2, and Oracle without any change (in this area).
Permissions
A general problem with the old way is that the system doesn’t actually remember that the user typed serial. It expands this at parse time into something like
CREATE SEQUENCE test_old_id_seq;
CREATE TABLE test_old (
id int NOT NULL PRIMARY KEY,
payload text
);
ALTER TABLE test_old
ALTER COLUMN id SET DEFAULT nextval('test_old_id_seq');
ALTER SEQUENCE test_old_id_seq OWNED BY test_old.id;
The OWNED BY in the last command is an attempt to remember something about the serialness, but it is still insufficient in some cases.
The new way creates the sequence as a proper internal dependency of the table, so that various weird implementation details of the serial pseudotype are not exposed.
One common problem is that permissions for the sequence created by a serial column need to be managed separately:
CREATE USER foo;
GRANT INSERT ON test_old TO foo;
GRANT INSERT ON test_new TO foo;
SET SESSION AUTHORIZATION foo;
INSERT INTO test_old (payload) VALUES ('d');
ERROR: permission denied for sequence test_old_id_seq
INSERT INTO test_new (payload) VALUES ('d');
-- OK
You can fix the error by also running
GRANT USAGE ON SEQUENCE test_old_id_seq;
If you have deployment scripts, this is annoying and problematic, because the name of the sequence is automatically generated. Here, of course, it appears in the error message, and it is easy to guess, but sometimes a slightly different name is chosen, and then your deployment scripts will fail.
Managing sequences
You also need to know the name of the sequence if you want to make some changes to the sequence:
ALTER SEQUENCE test_old_id_seq RESTART WITH 1000;
With an identity column, you don’t need to know the name of the sequence:
ALTER TABLE test_new ALTER COLUMN id RESTART WITH 1000;
Schema management
Since serial is not a real type, it can only be used in certain circumstances. You can specify serial as a column type when creating a table or when adding a column. But dropping serialness from an existing column or adding it to an existing column is not straightforward.
To drop serialness, you can drop the sequence (again, after ascertaining the name) with the CASCADE option, which cascades to remove the default value of the associated column:
DROP SEQUENCE test_old_id_seq CASCADE;
If you instead drop the default value like
ALTER TABLE test_old ALTER COLUMN id DROP DEFAULT;
it will drop the default but leave the sequence in place.
If you want to take an existing integer column and turn it into a serial column, there is no single command to do that. You will have to manually assemble the CREATE SEQUENCE and ALTER TABLE ... SET DEFAULT commands shown earlier.
Dropping the identity property of an existing column is easy:
ALTER TABLE test_new ALTER COLUMN id DROP IDENTITY;
You cannot accidentally make a mistake and drop a default, because there is none. But just in case, you get a nice error message:
=> ALTER TABLE test_new ALTER COLUMN id DROP DEFAULT; ERROR: column "id" of relation "test_new" is an identity column HINT: Use ALTER TABLE ... ALTER COLUMN ... DROP IDENTITY instead.
You can also turn an existing integer column into an identity column with one command:
ALTER TABLE test_new
ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY;
Copying table structures
If you use the CREATE TABLE / LIKE functionality to copy the structure of a table, serial columns pose a problem:
CREATE TABLE test_old2 (LIKE test_old INCLUDING ALL);
INSERT INTO test_old2 (payload) VALUES ('e') RETURNING *;
id | payload
----+---------
4 | e
Note that even though the new table is a separate table, it keeps using the old sequence.
This gets even more confusing when you want to drop the first table:
=> DROP TABLE test_old; ERROR: cannot drop table test_old because other objects depend on it DETAIL: default for table test_old2 column id depends on sequence test_old_id_seq HINT: Use DROP ... CASCADE to drop the dependent objects too.
(You can use CASCADE as hinted, or drop test_old2 first. The latter works without CASCADE because the sequence is still linked to the first table.)
When you copy a table with an identity column in this way, you get a new sequence:
CREATE TABLE test_new2 (LIKE test_new INCLUDING ALL);
INSERT INTO test_new2 (payload) VALUES ('e') RETURNING *;
id | payload
----+---------
1 | e
Upgrading
Perhaps you are convinced and you want to “upgrade” all your messy serial columns to this new identity column thing. (Note that you don’t have to “upgrade”. You can keep using serial columns the same way as before.) Here is a PL/pgSQL function that you can use:
CREATE OR REPLACE FUNCTION upgrade_serial_to_identity(tbl regclass, col name)
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
colnum smallint;
seqid oid;
count int;
BEGIN
-- find column number
SELECT attnum INTO colnum FROM pg_attribute WHERE attrelid = tbl AND attname = col;
IF NOT FOUND THEN
RAISE EXCEPTION 'column does not exist';
END IF;
-- find sequence
SELECT INTO seqid objid
FROM pg_depend
WHERE (refclassid, refobjid, refobjsubid) = ('pg_class'::regclass, tbl, colnum)
AND classid = 'pg_class'::regclass AND objsubid = 0
AND deptype = 'a';
GET DIAGNOSTICS count = ROW_COUNT;
IF count < 1 THEN
RAISE EXCEPTION 'no linked sequence found';
ELSIF count > 1 THEN
RAISE EXCEPTION 'more than one linked sequence found';
END IF;
-- drop the default
EXECUTE 'ALTER TABLE ' || tbl || ' ALTER COLUMN ' || quote_ident(col) || ' DROP DEFAULT';
-- change the dependency between column and sequence to internal
UPDATE pg_depend
SET deptype = 'i'
WHERE (classid, objid, objsubid) = ('pg_class'::regclass, seqid, 0)
AND deptype = 'a';
-- mark the column as identity column
UPDATE pg_attribute
SET attidentity = 'd'
WHERE attrelid = tbl
AND attname = col;
END;
$$;
Call it like this:
SELECT upgrade_serial_to_identity('test_old', 'id');
If you are using schemas, it would look like this:
SELECT upgrade_serial_to_identity('public.test_old', 'id');
Check your table definition before and after:
=> \d test_old
Table "public.test_old"
Column | Type | Collation | Nullable | Default
---------+---------+-----------+----------+--------------------------------------
id | integer | | not null | nextval('test_old_id_seq'::regclass)
payload | text | | |
=> \d test_old
Table "public.test_old"
Column | Type | Collation | Nullable | Default
---------+---------+-----------+----------+----------------------------------
id | integer | | not null | generated by default as identity
payload | text | | |
Further reading
More information can be found in the PostgreSQL documentation, starting from the CREATE TABLE reference page.




How about inherited tables?
The identity property is not inherited. For a serial column, the default expression is inherited but the sequence ownership is not (similar to the
LIKEcase).Wow, when you see all the weirdness of SERIAL in one place, it is obvious we needed something better.
Great news! Thanks for your work.
One question: Experimenting with the current beta of Postgres 10 shows that `OWNED BY NONE` is allowed as an option to GENERATED…AS IDENTITY. Does that make sense? Seems like the goal of an identity column is to hide the implementation details of the sequence object… so shouldn’t ownership be irrelevant, and therefore `OWNED BY …` and `OWNED BY` should both throw an error?
Example of code that succeeds in beta:
CREATE TABLE IF NOT EXISTS tbl_ (
id_ int GENERATED BY DEFAULT AS IDENTITY ( START WITH 200 MINVALUE 100 MAXVALUE 205 CYCLE INCREMENT BY 3 OWNED BY NONE ) PRIMARY KEY ,
name_ text
);
That appears to be a bug. I’ll look into it.
What is the relationship between GENERATED … AS IDENTITY and PRIMARY KEY? Do we still need to specify PRIMARY KEY to get an index, UNIQUE, and designation for relationships (with foreign keys and JOIN)?
Identity and primary key are separate. So you have to specify both if you want both. An identity column is, however, automatically NOT NULL. (That is per SQL standard.)
I suggest adding a mention of `GENERATED … AS IDENTITY` as a replacement for `SERIAL` in the “Serial Types” section 8.1.4 of the “Data Types” page. Seems a shame to let the naïve reader continue using `SERIAL` without knowing about identity column feature.
https://www.postgresql.org/docs/devel/static/datatype-numeric.html#datatype-serial
I posted this suggestion to the Postgres docs mailing list:
https://www.postgresql.org/message-id/7637CE20-AA33-48C4-AEBA-A0F591F87235%40pobox.com
I have just upgraded to this version and have been testing the IDENTITY column, and I have encountered a problem with pgAdmin. After I have created the table and view its CREATE TABLE script there are no references to the IDENTITY clause or to the related sequence. This means that I cannot copy that script into a file so that I can create the same table in a different environment.
Maybe try https://omnidb.org
To finish the story, the following expression:
currval(pg_get_serial_sequence(‘sometable’, ‘id’));
… can be used to get the newly generated IDENTITY value, just like before with the SERIAL pseudo type.
I have a question about “identity inserts.” If I’m using a SERIAL, I can make a call to SETVAL() like below after explicitly specifying the value for an id.
INSERT INTO test_old (payload) VALUES (‘a’), (‘b’), (‘c’) RETURNING *;
INSERT INTO test_old (id, payload) VALUES (5, ‘a’) RETURNING *;
SELECT SETVAL(‘test_old_id_seq’, 5);
INSERT INTO test_old (payload) VALUES (‘a’), (‘b’), (‘c’) RETURNING *;
What is the proper way of doing this using an IDENTITY column? Do I need to do an ALTER TABLE like the following?
ALTER TABLE test_new ALTER COLUMN id RESTART WITH 5;
I’m wondering if there is a performance hit for having to do an ALTER TABLE versus a SETVAL()?
I have code where a user is passing in data and depending on whether the id value is specified, I optionally call SETVAL() after doing the INSERT. I found that you have to do this. Otherwise, the next value is still whatever it was before. I haven’t noticed performance problems doing this. Having to do an ALTER TABLE is a little scary. Or, is there some other way using SETVAL()?
There is a difference. ALTER TABLE is transactional, setval is not. So it depends on what you want. See the ALTER SEQUENCE man page for details.
Never mind, I just noticed that it appears that it still created a separate sequence for the IDENTITY column. So, I can still do SETVAL() the way I was before.
I thanks for this great article.
I wanted to convert all my serial, but unfortunatly I’m using AWS RDS Postgres and there is no way to “UPDATE pg_depend” there.
Any work arround that you could think about ?
I’m surprised that UPDATE pg_depend is not allowed, but if so, then I don’t know of another way.