Postgres Analysis
Postgres Queries
Here are some queries I use when I have to analyze issues or optimize PostgreSQL. I'll keep this post updated.
Investigating vacuum/analyze and indices
Last auto(vacuum|analyze)
select relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables order by last_autoanalyze;
Number of unvacuumed transaction in a database
If the number of unvacuumed transaction for a database reaches 2^31 - 1'000'000 , PostgreSQL sets the database to readonly. An offline vacuum is then required, which can block the database for hours or days. Check that the age of datfrozenxid stays under 2 billions.
SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY age(datfrozenxid) DESC LIMIT 20;
Check if stats are accurate for a table
If the following stats are very different, stats might not be up-to-date.
Ratio between Heap hits and Heap reads
Should ideally be at least at 99%. If not, it might indicate that the server's memory is too low (not enough cache). A too large shared_buffer can also be the source of a decrease of the ratio and lower performance! https://www.postgresql.org/message-id/5465402F.9030509@fuzzy.cz
SELECT sum(heap_blks_read) as heap_read, sum(heap_blks_hit) as heap_hit, sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio FROM pg_statio_user_tables;
Ratio between Index hits and Index reads
Should ideally be at least at 99%. If not, it might indicate that the server's memory is too low (not enough cache). A too large shared_buffer can also be the source of a decrease of the ratio and lower performance! https://www.postgresql.org/message-id/5465402F.9030509@fuzzy.cz
SELECT sum(idx_blks_read) as idx_read, sum(idx_blks_hit) as idx_hit, (sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit) as ratio FROM pg_statio_user_indexes;
Percentage of index usage by table
Table with more than 10000 rows should ideally have 99% index usage. http://www.craigkerstiens.com/2012/10/01/understanding-postgres-performance/
General query:
SELECT relname, 100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used, n_live_tup rows_in_table FROM pg_stat_user_tables WHERE seq_scan + idx_scan > 0 ORDER BY n_live_tup DESC;
Filter on < 99%
SELECT relname, 100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used, n_live_tup rows_in_table FROM pg_stat_user_tables WHERE seq_scan + idx_scan > 0 AND (100 * idx_scan / (seq_scan + idx_scan)) < 99 ORDER BY n_live_tup DESC;
Filter on < 99% and at least 500 seq_scan (arbitrary)
SELECT relname, 100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used, n_live_tup rows_in_table FROM pg_stat_user_tables WHERE seq_scan + idx_scan > 0 AND (100 * idx_scan / (seq_scan + idx_scan)) < 99 AND seq_scan > 500 ORDER BY n_live_tup DESC;
pg_stat_statements
Optimizing checkpoints
I refer to this excellent article: https://blog.2ndquadrant.com/basics-of-tuning-checkpoints/
Locks
The evident reference is https://wiki.postgresql.org/wiki/Lock_Monitoring. This page also has nice queries: https://wiki.postgresql.org/wiki/Lock_dependency_information
The first one I use when I suspect some locks is
SELECT blocked_locks.pid AS blocked_pid, blocked_activity.usename AS blocked_user, blocking_locks.pid AS blocking_pid, blocking_activity.usename AS blocking_user, blocked_activity.query AS blocked_statement, blocking_activity.query AS current_statement_in_blocking_process FROM pg_catalog.pg_locks blocked_locks JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid JOIN pg_catalog.pg_locks blocking_locks ON blocking_locks.locktype = blocked_locks.locktype AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid AND blocking_locks.pid != blocked_locks.pid JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid WHERE NOT blocked_locks.GRANTED;
From there, I'll very often investigate further with queries on pg_stat_activity.
SELECT * FROM pg_stat_activity WHERE pid = ?;
Waiting queries
Sometimes, you have a query that seems to hang but you don't see any locks. It might be waiting on something (IO for instance).
Before Postgresql 9.6, you can find these queries with:
SELECT pid, datname, usename, now() - query_start AS runtime, waiting, state, query FROM pg_stat_activity WHERE waiting = 'true';
But it becomes very interesting since 9.6 that displays also the reason of the wait:
SELECT pid, datname, usename, now() - query_start AS runtime, wait_event, wait_event_type, state, query FROM pg_stat_activity WHERE wait_event IS NOT NULL AND state = 'active';
Bookmarks about the waiting queries:
- https://www.postgresql.org/docs/10/static/monitoring-stats.html
- https://www.openscg.com/2016/11/is-my-query-stuck-postgresql-9-6/
- http://akorotkov.github.io/blog/2016/03/25/wait_monitoring_9_6/
- http://paquier.xyz/postgresql-2/postgres-9-6-feature-highlight-wait-events/
- https://www.credativ.com/credativ-blog/postgresql-96-feature-pgstatactivity-waitevent