Guewen Baconnier

(ノ◕ヮ◕)ノ*:・゚✧

Postgres Analysis

Postgres Queries

Here are some queries I use when I have to analyze issues or optimize PostgreSQL. I'll keep this post updated.

Investigating vacuum/analyze and indices

Last auto(vacuum|analyze)

select relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables order by last_autoanalyze;

Number of unvacuumed transaction in a database

If the number of unvacuumed transaction for a database reaches 2^31 - 1'000'000 , PostgreSQL sets the database to readonly. An offline vacuum is then required, which can block the database for hours or days. Check that the age of datfrozenxid stays under 2 billions.

SELECT datname, age(datfrozenxid) 
FROM pg_database ORDER 
BY age(datfrozenxid) DESC LIMIT 20;

Check if stats are accurate for a table

If the following stats are very different, stats might not be up-to-date.

  • Known stats for number of rows
    select n_live_tup from pg_stat_user_tables where relname = 'sale_order';
    
  • To compare with the real count
    select count(*) from sale_order;
    

Ratio between Heap hits and Heap reads

Should ideally be at least at 99%. If not, it might indicate that the server's memory is too low (not enough cache). A too large shared_buffer can also be the source of a decrease of the ratio and lower performance! https://www.postgresql.org/message-id/5465402F.9030509@fuzzy.cz

SELECT
  sum(heap_blks_read) as heap_read,
  sum(heap_blks_hit)  as heap_hit,
  sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM
  pg_statio_user_tables;

Ratio between Index hits and Index reads

Should ideally be at least at 99%. If not, it might indicate that the server's memory is too low (not enough cache). A too large shared_buffer can also be the source of a decrease of the ratio and lower performance! https://www.postgresql.org/message-id/5465402F.9030509@fuzzy.cz

SELECT
  sum(idx_blks_read) as idx_read,
  sum(idx_blks_hit)  as idx_hit,
  (sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit) as ratio
FROM
  pg_statio_user_indexes;

Percentage of index usage by table

Table with more than 10000 rows should ideally have 99% index usage. http://www.craigkerstiens.com/2012/10/01/understanding-postgres-performance/

General query:

SELECT
  relname,
  100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used,
  n_live_tup rows_in_table
FROM
  pg_stat_user_tables
WHERE
    seq_scan + idx_scan > 0
ORDER BY
  n_live_tup DESC;

Filter on < 99%

SELECT
  relname,
  100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used,
  n_live_tup rows_in_table
FROM
  pg_stat_user_tables
WHERE
    seq_scan + idx_scan > 0
    AND (100 * idx_scan / (seq_scan + idx_scan)) < 99
ORDER BY
  n_live_tup DESC;

Filter on < 99% and at least 500 seq_scan (arbitrary)

SELECT
  relname,
  100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used,
  n_live_tup rows_in_table
FROM
  pg_stat_user_tables
WHERE
    seq_scan + idx_scan > 0 
    AND (100 * idx_scan / (seq_scan + idx_scan)) < 99
    AND seq_scan > 500
ORDER BY
  n_live_tup DESC;

Optimizing checkpoints

Locks

The evident reference is https://wiki.postgresql.org/wiki/Lock_Monitoring. This page also has nice queries: https://wiki.postgresql.org/wiki/Lock_dependency_information

The first one I use when I suspect some locks is

SELECT blocked_locks.pid     AS blocked_pid,
       blocked_activity.usename  AS blocked_user,
       blocking_locks.pid     AS blocking_pid,
       blocking_activity.usename AS blocking_user,
       blocked_activity.query    AS blocked_statement,
       blocking_activity.query   AS current_statement_in_blocking_process
 FROM  pg_catalog.pg_locks         blocked_locks
  JOIN pg_catalog.pg_stat_activity blocked_activity  ON blocked_activity.pid = blocked_locks.pid
  JOIN pg_catalog.pg_locks         blocking_locks 
      ON blocking_locks.locktype = blocked_locks.locktype
      AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE
      AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
      AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
      AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
      AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
      AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
      AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
      AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
      AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
      AND blocking_locks.pid != blocked_locks.pid
  JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
 WHERE NOT blocked_locks.GRANTED;

From there, I'll very often investigate further with queries on pg_stat_activity.

SELECT * FROM pg_stat_activity WHERE pid = ?;

Waiting queries

Sometimes, you have a query that seems to hang but you don't see any locks. It might be waiting on something (IO for instance).

Before Postgresql 9.6, you can find these queries with:

SELECT pid,
datname,
usename,
now() - query_start AS runtime,
waiting,
state,
query
FROM pg_stat_activity
WHERE waiting = 'true'; 

But it becomes very interesting since 9.6 that displays also the reason of the wait:

SELECT 
pid,
datname,
usename,
now() - query_start AS runtime,
wait_event, 
wait_event_type,
state,
query
FROM pg_stat_activity
WHERE wait_event IS NOT NULL
AND state = 'active';

Bookmarks about the waiting queries:

Comments