
Series Summary: This is Part 1 of a multi-part series on PostgreSQL autovacuum failures.
PostgreSQL is known for its resilience and smart storage mechanisms, but even a mature system has edge cases. This post kicks off a multi-part series exploring a subtle but dangerous gotcha: under specific conditions, autovacuum can stop running entirely in a multi-database cluster, without any clear warning. Let’s look at how this happens and why it matters.
Understanding PostgreSQL VACUUM Basics
If you’ve worked with PostgreSQL for any length of time, you’re familiar with the VACUUM process – what it does and how essential it is for database health. Let’s quickly review its core responsibilities to set the stage:
- Reclaims storage: Scans tables to mark obsolete tuple (row) versions created by UPDATE and DELETE commands as reusable space.
- Maintains the Free Space map (FSM): This map helps PostgreSQL quickly locate pages with enough space for INSERT and UPDATE operations.
- Learn more in the PostgreSQL FSM documentation or review the backend FSM README.
- Prevents transaction ID (XID) wraparound: Updates the XID in tuple headers to a lower value to ensure visibility rules remain accurate.
- Trims bloated tables: Whenever possible, VACUUM removes empty data pages at the end of the table heap.
The Hidden Danger: Autovacuum Stops Working Across Databases
Let’s talk about the "gotcha."
PostgreSQL's autovacuum launcher operates at the cluster level, spawning autovacuum workers that target one database at a time. If a database has an open session with temporary tables that aren't cleaned up, especially if those tables live long enough to prevent XID advancement. It can block the entire autovacuum process for all other databases in the cluster.
Yes, really. PostgreSQL will stop processing autovacuum tasks for other databases—even when the emergency vacuum thresholds (e.g., for XID wraparound) are breached.
This behavior emerges under these specific conditions:
- You’re running multiple databases in a single PostgreSQL cluster.
- One or more of those databases use session-specific temporary tables.
- A session creates a temp table and leaves it open and uncommitted for a long time (often due to application logic or connection pooling issues).
When that happens, the autovacuum launcher prioritizes the database with the open temp table, often getting stuck trying to vacuum that database repeatedly, while neglecting all others. This silent failure can allow bloat to accumulate or XID limits to creep dangerously close to shutdown thresholds.
In Part 2, we’ll walk through a test scenario that reproduces this failure mode step-by-step. You’ll see how PostgreSQL can quickly enter an unhealthy state, even when autovacuum should be working.
Need Help?
Command Prompt is the world’s oldest dedicated Postgres services and consulting company, offering expert support for performance optimization and troubleshooting. Contact us today.
Can't wait for the full series? Receive a link to download the full series as a white paper by submitting your contact information below.
This whitepaper examines PostgreSQL autovacuum failures, focusing on temp tables and multi-database edge cases that cause bloat, slowdowns, or XID wraparound risks in complex environments.
Thank you for your interest. Schedule a call with our Founder: https://calendar.app.google/wXgXkHoiFxHwW7KA6