Quick take
Install pg_stat_statements. Find your top 10 slowest queries. Fix the indexes. Tune autovacuum for big tables. Use a connection pooler. That covers 80% of Postgres performance work. Everything else is refinement.
I’ve tuned Postgres at startups with a single database and at enterprises with dozens of replicas. The pattern is always the same: someone reports “the database is slow,” and the first instinct is to increase instance size or tweak configuration parameters. Almost every time, the actual problem is a missing index or a query that shouldn’t exist in its current form.
The discipline is simple. Measure, change one thing, measure again. If you skip the measurement step, you’re guessing. Guessing is expensive.
Start with pg_stat_statements
This is the single most valuable tool for Postgres performance work. It tracks every query the database executes and accumulates statistics: call count, total time, mean time, rows returned.
```sql
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
```
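The extension records nothing until its library is loaded at server start, so set this first and restart before running `CREATE EXTENSION`:

```ini
# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
```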
```sql
-- Top 10 queries by total execution time
SELECT
    query,
    calls,
    round(total_exec_time::numeric, 2) AS total_ms,
    round(mean_exec_time::numeric, 2) AS avg_ms,
    rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```
Total time matters more than average time. A query that runs in 5ms but gets called 500,000 times per day consumes more resources than a query that takes 2 seconds but runs once an hour. Fix the high-total-time queries first.
Also check for long-running queries and idle transactions, which block vacuum and hold locks:
```sql
SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes'
  AND state != 'idle'
ORDER BY duration DESC;
```
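Once you've identified a runaway backend, you can stop it using the `pid` from that query (12345 is a placeholder):

```sql
-- Polite: cancels the current query, keeps the connection
SELECT pg_cancel_backend(12345);

-- Harder: terminates the whole connection
SELECT pg_terminate_backend(12345);
```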
EXPLAIN is your microscope
Once you know which queries are slow, EXPLAIN (ANALYZE, BUFFERS) shows you why. Don’t read the entire plan tree. Focus on four signals:
Sequential scan on a large table: Almost always means a missing or ineffective index. A sequential scan on a 100-row lookup table is fine. A sequential scan on a 50-million-row orders table isn’t.
Actual rows far from estimated rows: Postgres is making bad decisions because its statistics are stale or the data is heavily skewed. Run ANALYZE on the table and check again.
Nested loop with high row count on the inner side: Missing index on the join column. The planner is scanning the inner table for every row of the outer table.
Sort with “Sort Method: external merge” and a Disk figure: The sort spilled to disk because work_mem was too small for the operation. Either add an index that eliminates the sort, or increase work_mem (carefully – it’s per-operation, not per-connection).
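The stale-statistics case is cheap to rule out. For heavily skewed columns you can also collect a larger sample (table and column names here are illustrative):

```sql
ANALYZE orders;  -- refresh planner statistics for this table

-- For skewed data, keep a larger histogram on the problem column
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;
```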
```sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
  AND o.created_at > now() - interval '7 days'
ORDER BY o.created_at DESC;
```
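When a plan shows a disk spill, a quick experiment is to raise work_mem for just one transaction and re-run the EXPLAIN. SET LOCAL reverts at commit, so nothing leaks to other sessions (the 256MB value is only illustrative):

```sql
BEGIN;
SET LOCAL work_mem = '256MB';  -- scoped to this transaction only
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id FROM orders o ORDER BY o.created_at DESC;
COMMIT;
```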
Indexing: the right index, not more indexes
Most performance fixes come down to having the right index. Not more indexes. Every index you add slows down writes and makes vacuum work harder.
Composite indexes: Column order matters. Put the equality columns first, range columns last. For the query above, (status, created_at) lets Postgres seek to status = 'pending' and then scan the date range efficiently.
```sql
CREATE INDEX idx_orders_status_created ON orders (status, created_at DESC);
```
Partial indexes: If you only ever query pending orders, index only those. Smaller index, faster lookups, less write overhead.
```sql
CREATE INDEX idx_orders_pending ON orders (created_at DESC)
WHERE status = 'pending';
```
Covering indexes (Postgres 11+): Include columns the query needs so Postgres can answer from the index alone (an index-only scan) without touching the table.
```sql
CREATE INDEX idx_orders_covering ON orders (customer_id, status)
INCLUDE (total, created_at);
```
Unused indexes: They cost writes and vacuum time for zero benefit. Find them:
```sql
SELECT
    schemaname || '.' || relname AS table,
    indexrelname AS index,
    idx_scan AS scans,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;
```
If an index shows zero scans after weeks of production traffic, it’s a candidate for removal. Note that these counters are per-instance, so check your replicas too before dropping anything.
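When you do remove one, CONCURRENTLY avoids a lock that blocks writes (the index name below is a stand-in):

```sql
-- Note: DROP INDEX CONCURRENTLY cannot run inside a transaction block
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_old_unused;
```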
Configuration: what actually matters
I see teams spend hours tweaking parameters that move the needle by 1%. Focus on the ones that matter.
shared_buffers: Postgres’s own cache. Start at 25% of RAM. Going higher rarely helps because the OS page cache handles the rest.
effective_cache_size: Not an allocation – a planner hint. Tell it how much cache is available total (Postgres cache + OS cache). Typically 50-75% of RAM. This changes whether the planner favors index scans over sequential scans.
work_mem: Memory per sort/hash operation. The trap: this is per-operation, not per-connection, and a single query can have multiple sorts. If you set it to 256MB and have 100 concurrent connections each running a multi-sort query, you can run out of memory fast. Start at 4-16MB and increase carefully for specific workloads.
maintenance_work_mem: Used for vacuum and index creation. Higher is better here. 512MB to 1GB is reasonable for most production systems.
max_connections: Keep it low. 100-200 is plenty for most workloads. Use PgBouncer for connection pooling. Each connection costs ~10MB of memory and context-switching overhead. I’ve seen teams set this to 1000 and then wonder why their 64GB instance is swapping.
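Put together, a starting point for a mid-sized instance might look like this. Every value is illustrative (a 16 GB machine is assumed) and should be measured against your workload, not copied:

```ini
# postgresql.conf – illustrative values for a 16 GB instance
shared_buffers = 4GB            # ~25% of RAM
effective_cache_size = 12GB     # planner hint: total cache, ~75% of RAM
work_mem = 16MB                 # per sort/hash operation, not per connection
maintenance_work_mem = 1GB      # vacuum and index builds
max_connections = 200           # pool with PgBouncer rather than raising this
```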
Connection pooling: non-negotiable
PgBouncer in transaction pooling mode is the standard answer. The application thinks it has a dedicated connection. PgBouncer multiplexes to a smaller pool of real database connections.
This matters because most web application connections are idle most of the time. An application with 200 worker processes might only have 20 queries running at any given moment. PgBouncer lets you serve 200 workers with 30-50 actual database connections.
Session pooling is necessary if your application uses prepared statements, temp tables, or session-level settings (like SET search_path). Transaction pooling doesn’t preserve these across requests.
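A minimal PgBouncer setup in transaction mode might look like this; the database name, addresses, and pool sizes are all assumptions to adapt:

```ini
; pgbouncer.ini – minimal sketch
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 200    ; application-side connections
default_pool_size = 40   ; real connections to Postgres
```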
Vacuum: the silent killer
Postgres uses MVCC, which means every update and delete creates dead tuples. Vacuum cleans them up. If vacuum falls behind, tables bloat, indexes get slower, and eventually you hit transaction ID wraparound – which forces a stop-the-world vacuum.
Autovacuum works fine for most tables. But large, busy tables need tuned thresholds:
```sql
-- Check dead tuple counts
SELECT relname, n_live_tup, n_dead_tup,
       round(n_dead_tup::numeric / greatest(n_live_tup, 1) * 100, 2) AS dead_pct,
       last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;
```
For a table with 100 million rows, the default autovacuum threshold of autovacuum_vacuum_scale_factor = 0.2 means vacuum waits until there are 20 million dead tuples. That’s way too late. Set per-table overrides:
```sql
ALTER TABLE large_events SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_analyze_scale_factor = 0.005
);
```
The other vacuum killer is long-running transactions. A transaction that has been open for hours prevents vacuum from cleaning up any tuples visible to that transaction. Find and fix these aggressively.
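Finding them is the same pg_stat_activity pattern as before, keyed on transaction age rather than query age (the one-hour cutoff is arbitrary):

```sql
-- Transactions open for more than an hour pin dead tuples for vacuum
SELECT pid, now() - xact_start AS xact_age, state, left(query, 80) AS query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '1 hour'
ORDER BY xact_age DESC;
```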
The tuning loop
Performance tuning isn’t a project. It’s a practice.
- Check pg_stat_statements weekly. Know your top 10.
- EXPLAIN any query that shows up as a problem.
- Add or adjust indexes. Remove unused ones.
- Monitor vacuum health and dead tuple counts.
- Repeat.
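After a round of fixes, resetting the accumulated counters gives next week's review a clean baseline (requires appropriate privileges):

```sql
SELECT pg_stat_statements_reset();
```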
Small, measured changes. Every time. The teams that keep Postgres fast aren’t the ones with the cleverest configuration. They’re the ones who look at the numbers regularly and fix what the numbers show them.