Up-to-date access to Postgres logs
Some of my Italian colleagues have made a nifty little gadget called redislog for pushing Postgres logs into Redis, the in-memory data store. From there the logs can be fed into things like Logstash. I thought it would be interesting instead to make them available as a Postgres table via the Redis Foreign Data Wrapper. That way we would have easy access to the running logs from Postgres with almost no effort. Here’s what I did.
First I built and installed redislog and redis_fdw. Then I added redislog to my server’s shared_preload_libraries, set log_min_duration_statement to 0, and restarted.
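In configuration terms that amounts to something like the following (a sketch via ALTER SYSTEM; redislog’s own settings for which Redis instance and key to push to are omitted here, since their parameter names are best taken from the redislog README):

alter system set shared_preload_libraries = 'redislog';
alter system set log_min_duration_statement = 0;
-- shared_preload_libraries only takes effect after a full server restart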
Then I created a database called logger and did this in it:
create extension redis_fdw;

create server localredis foreign data wrapper redis_fdw;

create foreign table redis_postgres_log(
    log_entry json
) server localredis
options (tabletype 'list', singleton_key 'postgres', database '0');

create type log_type as (
    user_name text,
    database_name text,
    process_id int,
    remote_host text,
    session_id text,
    session_line_num int,
    command_tag text,
    session_start_time timestamptz,
    virtual_transaction_id text,
    transaction_id text,
    error_severity text,
    sql_state_code text,
    detail_log text,
    detail text,
    hint text,
    internal_query text,
    internal_query_pos text,
    context text,
    query text,
    query_pos text,
    file_location text,
    application_name text,
    message text,
    "@timestamp" timestamptz
);

create view postgres_log as
select (x).*
from redis_postgres_log r,
     json_populate_record(NULL::log_type, r.log_entry) as x;

alter table postgres_log rename "@timestamp" to log_timestamp;
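The view relies on json_populate_record to turn each JSON log entry into a typed row. You can see that mapping in isolation by feeding it a hand-written entry (the values below are made up for illustration; real redislog entries carry the full set of fields):

select user_name, error_severity, message, "@timestamp"
from json_populate_record(NULL::log_type,
    '{"user_name": "andrew",
      "error_severity": "LOG",
      "message": "duration: 0.235 ms",
      "@timestamp": "2017-05-04 11:12:09.593-04"}');

Any fields absent from the JSON simply come back as NULL.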
After that I could select a random row from the view:
logger=# select * from public.postgres_log offset 10000 limit 1;
-[ RECORD 1 ]----------+-----------------------------------------------------------------------------------------------------
user_name              | andrew
database_name          | pgb3
process_id             | 27513
remote_host            | [local]
session_id             | 590b44c4.6b79
session_line_num       | 1936
command_tag            | UPDATE
session_start_time     | 2017-05-04 11:12:04-04
virtual_transaction_id | 3/2498
transaction_id         | 3216
error_severity         | LOG
sql_state_code         |
detail_log             |
detail                 |
hint                   |
internal_query         |
internal_query_pos     |
context                |
query                  |
query_pos              |
file_location          |
application_name       | pgbench
message                | duration: 0.235 ms statement: UPDATE pgbench_tellers SET tbalance = tbalance + -4736 WHERE tid = 8;
log_timestamp          | 2017-05-04 11:12:09.593-04
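And since it’s just a view, ordinary SQL works against the live log. Something along these lines (an untested sketch) would pull the most recent entries from the pgbench run above:

select log_timestamp, error_severity, message
from postgres_log
where application_name = 'pgbench'
order by log_timestamp desc
limit 10;

Bear in mind that a query like this almost certainly scans the whole Redis list, which is part of the scalability work mentioned below.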
There’s a bunch of work to do to make this more scalable, for example by partitioning the log. It might also turn out that we need to enhance redislog and/or redis_fdw to make this work better. But in principle this is a pretty nice result: a painless way of getting up-to-the-second log entries as a Postgres table.
Does this perform better than using file_fdw to suck in the CSV files directly from the log directory?
Probably not. But I was interested to see how well it might work anyway, since redislog is also useful for other purposes. Also, I have seen cases where extraneous matter seems to have got into CSV logs, making them unreadable by file_fdw. Redislog’s output is pretty much guaranteed to be nothing but JSON.
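For reference, the file_fdw alternative looks roughly like this; the column list follows the csvlog format documented for the Postgres version in use (this is the 9.6-era list), and the filename here is hypothetical:

create extension file_fdw;

create server pglog foreign data wrapper file_fdw;

create foreign table pglog_csv (
    log_time timestamp(3) with time zone,
    user_name text,
    database_name text,
    process_id integer,
    connection_from text,
    session_id text,
    session_line_num bigint,
    command_tag text,
    session_start_time timestamp with time zone,
    virtual_transaction_id text,
    transaction_id bigint,
    error_severity text,
    sql_state_code text,
    message text,
    detail text,
    hint text,
    internal_query text,
    internal_query_pos integer,
    context text,
    query text,
    query_pos integer,
    location text,
    application_name text
) server pglog
options (filename 'pg_log/postgresql.csv', format 'csv');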