Databases

Postgres for agents: eleven patterns we ship on every build

A routine VACUUM FULL left our chat agent quoting a deleted policy. The fix was small. The lesson was eleven Postgres patterns we now ship on every agent build.

Jacob Molkenboer· Founder · A Brand New Company· 31 May 2025· 9 min

Open oak card-catalogue drawer with cream index cards, one chartreuse tab raised, brass divider, ivory paper surface.

It is 16:47 on a Friday. Sales has reclaimed 38 GB on the production database by running VACUUM FULL against the embeddings table. Five minutes later, the support agent answers a paying customer with a chunk from a policy document we deleted two months ago. The retrieval is fast. The retrieval is wrong. Nobody can see why.

The chunk text rendered cleanly because the agent had no way of knowing the row underneath it no longer existed in our system of record. The HNSW index inside pgvector still pointed at tuple identifiers that VACUUM FULL had quietly remapped during the rewrite. The query planner trusted the index. The agent trusted the planner. The customer trusted the agent.

We spent the weekend writing down what we wished we had known on Friday afternoon. The post-mortem ran two hours. We logged the failure mode, the recovery, and the rebuild, then we did the harder thing: we listed every Postgres habit we had picked up across the agent projects we have shipped and ranked them by how quickly a database administrator could deploy them without a model-shaped panic. What follows is that list. Eleven patterns, ranked by what a DBA who has never opened a model dashboard can ship on a Tuesday afternoon.

What VACUUM FULL did to the vector index

A regular VACUUM in Postgres marks dead tuples as reusable space inside the same physical file. A VACUUM FULL is a different animal: it acquires an ACCESS EXCLUSIVE lock, rewrites the entire table to a new file, and rebuilds every index from scratch. For a B-tree on a normal column this is annoying but safe. For an HNSW index built on top of pgvector, the rebuild takes minutes, and the graph construction is non-deterministic on duplicates near the decision boundary.

That mismatch is enough for an agent retrieving the top three chunks to surface a result you genuinely no longer have. The pgvector project documents the rebuild semantics in passing, but most operations teams find out about it the way we did: through a customer ticket. You can argue (correctly) that this is a pgvector implementation detail more than a Postgres property. That is true. It is also irrelevant when a customer is on hold. The patterns in the rest of this guide assume that everything will eventually surprise you, and that the database is the only layer in your stack that holds its shape under prompt-shaped pressure.

Patterns one to five, deployable before lunch

1. A statement timeout on the agent role

Give the agent its own Postgres role and cap every statement at a number you can defend in a review. Two and a half seconds is our default. Anything that needs more time should be a job, not an interactive query.

CREATE ROLE agent_runner LOGIN PASSWORD :'pw';
ALTER ROLE agent_runner SET statement_timeout = '2500ms';
ALTER ROLE agent_runner SET idle_in_transaction_session_timeout = '5s';
ALTER ROLE agent_runner SET lock_timeout = '500ms';

This single change protects you from the failure mode where an agent loops a heavy query and runs up a bill nobody is watching for. An agent with no enforced ceiling will find the most expensive thing in your system and do it forever. The database is the right place to draw the line, because the database is the only thing the agent cannot lie its way around.

2. Idempotency keys on every write

Agents retry. Tool calls retry. Webhook deliveries retry. The cheapest defence is a unique index on a client-generated key and an ON CONFLICT DO NOTHING on the insert.

CREATE UNIQUE INDEX agent_runs_idem
  ON agent_runs (tenant_id, idempotency_key);

INSERT INTO agent_runs (tenant_id, idempotency_key, payload)
VALUES ($1, $2, $3)
ON CONFLICT (tenant_id, idempotency_key) DO NOTHING
RETURNING id;

If RETURNING is empty, the agent already wrote that row. Read it back and continue. You will be surprised how often this fires once you start counting, especially on Mondays when the model provider has had a wobble over the weekend.

3. SKIP LOCKED for the work queue

Most teams reach for Redis or SQS the moment an agent needs background work. You do not need to. A queue table with FOR UPDATE SKIP LOCKED handles thousands of jobs per second on a small Postgres instance, and you keep transactional consistency with the rest of your agent state.

SELECT id, payload
FROM agent_jobs
WHERE status = 'queued'
ORDER BY priority DESC, created_at
FOR UPDATE SKIP LOCKED
LIMIT 1;

The pattern is older than most agent stacks and almost embarrassingly reliable. If you outgrow it, you will know.

4. JSONB with jsonb_path_ops

Tool-call arguments and agent traces want to live in jsonb. The default GIN operator class indexes every value and is slower than it needs to be. For the queries an agent app actually runs (containment and key existence), jsonb_path_ops is roughly half the size and noticeably faster.

CREATE INDEX tool_calls_args_gin
  ON tool_calls USING gin (args jsonb_path_ops);

5. LISTEN and NOTIFY for cheap fan-out

When an agent finishes a long task, you usually need to wake up a websocket somewhere. NOTIFY is the lowest-effort tool in the box. Payload is capped at 8000 bytes and delivery is at-most-once, which is exactly the right contract for "go re-render the conversation."

NOTIFY agent_events, 'thread:7d3a:complete';

If you need exactly-once delivery, do not use NOTIFY; write to an outbox table and have the worker poll the outbox. Most agent UI updates do not need exactly-once. Most teams overengineer this.

Patterns six to eight, deployable before tomorrow's standup

6. Partition the run log by month

Agent run logs grow faster than any other table in the system. After three months, you cannot run a simple count without a sequential scan and a bad afternoon. Range-partition by created_at and let the planner prune.

CREATE TABLE agent_runs (
  id          bigserial,
  created_at  timestamptz NOT NULL,
  tenant_id   uuid NOT NULL,
  payload     jsonb NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE agent_runs_2026_06 PARTITION OF agent_runs
  FOR VALUES FROM ('2026-06-01') TO ('2026-07-01');

Pair this with a nightly job that drops partitions older than your retention window. Detach first, drop second; the official Postgres documentation explains why the order matters.

7. pg_stat_statements with a per-role budget

Once you turn on pg_stat_statements, your agent's worst query becomes obvious within a day. Add an internal dashboard that ranks queries by total_exec_time grouped by the role that ran them. The agent that calls SELECT * FROM messages WHERE thread_id = ? three hundred times per conversation will surface itself, and so will the one fetching the full embedding column when it only needs the metadata. The first time you sort by total_exec_time you will find a query you wrote ten minutes ago. Fix it before you fix anything else.

8. HNSW with a discipline around REINDEX

If you remember one thing from this post, remember this. Never run VACUUM FULL on a table with a pgvector index unless you are ready to REINDEX immediately after, and to verify the rebuild with a real query. Better still: do not run VACUUM FULL at all. For the rare cases you genuinely need to reclaim physical space, use pg_repack, which rebuilds tables without taking the access-exclusive lock.

CREATE INDEX CONCURRENTLY chunks_embedding_hnsw
  ON chunks USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

-- After any bulk delete or update:
REINDEX INDEX CONCURRENTLY chunks_embedding_hnsw;

Warning

A pgvector index pointing at stale tuple identifiers returns plausible answers. Plausible answers are worse than errors, because they reach customers before they reach you.

Patterns nine to eleven, worth a planning meeting

9. Tenant-scoped row-level security

Multi-tenant agent apps have a recurring nightmare: a prompt injection convinces the agent to read another tenant's data. Belt-and-braces means tenant isolation at the SQL layer as well as the application layer. Even if a model is talked into asking for the wrong rows, the database refuses.

ALTER TABLE messages ENABLE ROW LEVEL SECURITY;

CREATE POLICY messages_tenant ON messages
  USING (tenant_id = current_setting('app.tenant_id')::uuid);

Set the session variable at the start of every request. If the agent ever sends a query under the wrong tenant context, Postgres returns nothing, regardless of what the model thinks it is doing. Row-level security has a learning curve. The cost is mostly upfront, and the rewards are paid out in the meetings you do not have to attend after a leak.

10. Logical replication for the embedding worker

Re-embedding a thousand chunks every time someone edits a document is wasteful. Stream changes off the source table via a publication, and let a worker consume the change stream and update vectors out of band. The agent reads from the materialised table and never blocks on a model provider roundtrip.

The first time we shipped this, p95 retrieval latency fell from 740 ms to 90 ms during business hours, because the agent's read path no longer competed with the embedding worker for CPU. Use the pgoutput plugin and one publication per consumer. The wal_level needs to be logical, which is a postgresql.conf change and a restart, so schedule it during the lunchtime quiet window.

11. A canary chunk in every index

This is the pattern we owe to the Friday incident. Every embeddings table now contains a single deliberate chunk with a known phrase, for example "the canary phrase for index health is the colour copper". A cron job runs the canary query every minute and asserts that the top hit is, in fact, the canary. If it is not, page someone.

The canary works because it is a real row, indexed the same way as the rest of the corpus. A synthetic health-check that bypassed the index would not have caught our VACUUM FULL bug. It is the cheapest possible test that the index returns what the table contains. We did not have it. We do now.

The one thing to do today

Open psql, connect as the user your agent uses, and run SHOW statement_timeout. If the answer is 0, you are one runaway tool call away from a bill nobody asked for. Set it to two seconds. Restart the agent. The rest of the list can wait until Tuesday afternoon. You do not need to ship the eleven patterns in a quarter. You need to ship the first five this week. The rest become obvious once you stop firefighting and start sleeping.

When we rebuilt the retrieval layer for a Dutch e-commerce client last quarter, the canary pattern caught a misconfigured HNSW rebuild before a single customer saw a wrong answer. If any of this sounds familiar, our team builds AI agents on exactly this stack, and the runbook is already written.

Key takeaway

Ship the five quick Postgres patterns this week (timeout, idempotency, SKIP LOCKED, GIN, NOTIFY); the harder six get easier once you stop firefighting.

FAQ

Why is VACUUM FULL dangerous for pgvector indexes?

VACUUM FULL rewrites the table and rebuilds every index. HNSW graph construction is non-deterministic on near-duplicates, so the rebuilt index can rank stale or removed chunks ahead of live ones until you REINDEX and verify.

Can I use IVFFlat instead of HNSW to avoid this?

IVFFlat is cheaper to build but degrades on inserts and needs periodic rebuilding to stay accurate. HNSW is the better default for agent retrieval. The discipline around REINDEX still applies.

Do I need row-level security if my app already filters by tenant?

Yes. Application filters protect you from bugs you wrote. Row-level security protects you from prompt injections that talk your agent into writing different SQL. They cover different threats.

How small can the canary chunk be?

One row with a unique phrase and the same embedding model as the rest of the corpus. The point is that it travels the same code path as a real query, so any breakage in the index path catches it.

ai agentsragarchitecturetoolingoperations

Building something?

Start a project