Overcome transaction ID (TXID) wraparound protection
This page describes what you can do when your database runs into transaction ID wraparound protection in PostgreSQL. The protection manifests as an ERROR message, as follows:
```
database is not accepting commands to avoid wraparound data loss in database dbname.
Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
```
Alternatively, a WARNING message as follows might appear:
```
database dbname must be vacuumed within 10985967 transactions.
To avoid a database shutdown, execute a database-wide VACUUM in that database.
```
Important: The error message hint "Stop the postmaster and vacuum that database in single-user mode." is outdated; it dates from before PostgreSQL 8.3, when it was actually true. In most cases, you don't have to switch to single-user mode. Instead, you can run the required VACUUM commands and tune VACUUM to run fast.
By default, once the database is in wraparound protection, you can't run any data definition language (DDL) or data manipulation language (DML) statements, but you can still run VACUUM. In addition, Cloud SQL for PostgreSQL offers an emergency maintenance mode that gives you 500,000 more transaction IDs to get out of the situation without restarting the server in single-user mode.
To activate maintenance mode, set cloudsql.enable_maintenance_mode to on in the database session before running any commands that require a transaction ID.
```sql
SET cloudsql.enable_maintenance_mode = 'on';
[ your SQL, DML or DDL commands here ]
```
This needs to be done in the session where you're performing the maintenance before you do anything requiring a transaction ID.
Overview of steps
- Find out which database and which tables are causing the wraparound.
- Check whether anything is holding back (AUTO)VACUUM (for example, a stuck transaction ID).
- Measure the speed of AUTOVACUUM. If it's slow, optionally try to speed it up.
- If needed, run a few more VACUUM commands manually.
- Investigate other ways to speed up the vacuum. Sometimes the fastest way is to drop the table or some indexes.
Many of the recommendations for flag values are purposefully not exact because they depend on many database parameters. Read the documents linked at the end of this page for a deeper analysis of this topic.
Find the database and table causing the wraparound
Finding the database
To find out which database or databases contain the tables that are causing the wraparound, run the following query:
```sql
SELECT datname,
       age(datfrozenxid),
       2^31 - 1000000 - age(datfrozenxid) AS remaining
FROM pg_database
ORDER BY 3;
```

The database with the remaining value closest to 0 is the one causing the problem.
Finding the table
Connect to that database and run the following query:
```sql
SELECT c.relnamespace::regnamespace AS schema_name,
       c.relname AS table_name,
       greatest(age(c.relfrozenxid), age(t.relfrozenxid)) AS age,
       2^31 - 1000000 - greatest(age(c.relfrozenxid), age(t.relfrozenxid)) AS remaining
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm')
ORDER BY 4;
```

This query returns the table or tables causing the problem.
For TEMPORARY tables
If the schema_name starts with pg_temp_, then the only way to resolve the problem is to drop the table, because PostgreSQL doesn't let you VACUUM temporary tables created in other sessions. Sometimes, if that session is open and accessible, you can vacuum the table there, but this is often not the case. Use the following SQL statements to drop the temp table:
```sql
SET cloudsql.enable_maintenance_mode = 'on'; /* get extra transaction ids */
DROP TABLE pg_temp_<N>.<tablename>;
```

If this was the only blocker, then in about a minute, autovacuum picks up this change and moves the datfrozenxid forward in pg_database. This resolves the wraparound protection read-only state.
Normal tables
For normal (that is, non-temporary) tables, continue with the next steps to see whether anything is blocking the clean-up, whether the VACUUM is running fast enough, and whether the most important table is being vacuumed.
Check for a stuck transaction ID
One possible reason why the system can run out of transaction IDs is that PostgreSQL can't freeze (that is, mark as visible to all transactions) any transaction IDs created after the oldest currently running transaction started. This is because of multiversion concurrency control (MVCC) rules. In extreme cases, such transactions can become so old that they make it impossible for VACUUM to clean up any old transactions for the entire 2 billion transaction ID wraparound limit and cause the whole system to stop accepting new DML. You typically also see warnings in the log file, saying WARNING: oldest xmin is far in the past.
You should move on to optimization only after the stuck transaction ID has been remediated.
Here are four potential reasons why there might be a stuck transaction ID, with information on how to mitigate each of them:
- Long running transactions: Identify them and cancel or terminate the backend to unblock the vacuum.
- Orphaned prepare transactions: Roll back these transactions.
- Abandoned replication slots: Drop the abandoned slots.
- Long-running transaction on a replica with hot_standby_feedback = on: Identify it and cancel or terminate the backend to unblock the vacuum.
For these scenarios, the following query returns the age of the oldest transaction and the number of transactions left until wraparound:
```sql
WITH q AS (
  SELECT
    (SELECT max(age(backend_xmin)) FROM pg_stat_activity WHERE state != 'idle') AS oldest_running_xact_age,
    (SELECT max(age(transaction)) FROM pg_prepared_xacts) AS oldest_prepared_xact_age,
    (SELECT max(greatest(age(catalog_xmin), age(xmin))) FROM pg_replication_slots) AS oldest_replication_slot_age,
    (SELECT max(age(backend_xmin)) FROM pg_stat_replication) AS oldest_replica_xact_age
)
SELECT *,
       2^31 - oldest_running_xact_age AS oldest_running_xact_left,
       2^31 - oldest_prepared_xact_age AS oldest_prepared_xact_left,
       2^31 - oldest_replication_slot_age AS oldest_replication_slot_left,
       2^31 - oldest_replica_xact_age AS oldest_replica_xact_left
FROM q;
```
This query might return *_left values that are close to, or less than, 1 million away from wraparound. One million is the wraparound protection limit at which PostgreSQL stops accepting new write commands. In this case, see either Remove VACUUM blockers or Tune VACUUM.
For example, the preceding query might return:
```
┌─[ RECORD 1 ]─────────────────┬────────────┐
│ oldest_running_xact_age      │ 2146483655 │
│ oldest_prepared_xact_age     │ 2146483655 │
│ oldest_replication_slot_age  │ ¤          │
│ oldest_replica_xact_age      │ ¤          │
│ oldest_running_xact_left     │ 999993     │
│ oldest_prepared_xact_left    │ 999993     │
│ oldest_replication_slot_left │ ¤          │
│ oldest_replica_xact_left     │ ¤          │
└──────────────────────────────┴────────────┘
```
Here, oldest_running_xact_left and oldest_prepared_xact_left are within the 1 million wraparound protection limit. In this case, you must first remove the blockers before VACUUM can proceed.
Remove VACUUM blockers
Long-running transactions
In the preceding query, if oldest_running_xact_age is equal to oldest_prepared_xact_age, then go to the Orphaned prepared transactions section, because the oldest running value also includes prepared transactions.
You might first need to run the following command as the postgres user:

```sql
GRANT pg_signal_backend TO postgres;
```
If the offending transaction belongs to any of the system users (starting with cloudsql...), you can't cancel it directly. You must restart the database to cancel it.
To identify a long-running query, and cancel or terminate it to unblock the vacuum, first select a few of the oldest queries. The LIMIT 10 line helps fit the result on the screen. You might need to repeat this after resolving the oldest running queries.
```sql
SELECT pid,
       age(backend_xid) AS age_in_xids,
       now() - xact_start AS xact_age,
       now() - query_start AS query_age,
       state,
       query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY 2 DESC
LIMIT 10;
```
If age_in_xids comes back as NULL, the transaction has not been allocated a permanent transaction ID and can be safely ignored.
Cancel the queries whose number of transaction IDs left until wraparound is approaching 1 million.
If state is active, then the query can be cancelled using SELECT pg_cancel_backend(pid);. Otherwise, you need to terminate the whole connection using SELECT pg_terminate_backend(pid);, where pid is the pid from the previous query.
Orphaned prepare transactions
Note: To commit a prepared transaction, you must be connected as the same user that originally executed the transaction. See the owner field in the following output. DB_NAME is the name of the database.

List all prepared transactions:
```
DB_NAME=> SELECT age(transaction), * FROM pg_prepared_xacts;

┌─[ RECORD 1 ]┬───────────────────────────────┐
│ age         │ 2146483656                    │
│ transaction │ 2455493932                    │
│ gid         │ trx_id_pin                    │
│ prepared    │ 2021-03-03 16:54:07.923158+00 │
│ owner       │ postgres                      │
│ database    │ DB_NAME                       │
└─────────────┴───────────────────────────────┘
```
Roll back the oldest orphaned prepared transactions by using the gid from the last query (in this case, trx_id_pin) as the transaction ID:
```sql
ROLLBACK PREPARED 'trx_id_pin';
```
Alternatively, commit it:
```sql
COMMIT PREPARED 'trx_id_pin';
```
See the SQL ROLLBACK PREPARED documentation for a full explanation.
Abandoned replication slots
If the replication slot is abandoned because the existing replica is stopped, paused, or has some other issue, you can delete the replica from gcloud or the Google Cloud console.
First, check that the replica is not disabled, as described in Managing read replicas. If the replica is disabled, enable it again. If the lag still stays high, delete the replica.
The replication slots are visible in the pg_replication_slots system view.
The following query fetches the relevant info:
```
SELECT *, age(xmin) AS age FROM pg_replication_slots;

┌─[ RECORD 1 ]────────┬─────────────────────────────────────────────────┐
│ slot_name           │ cloudsql_1_355b114b_9ff4_4bc3_ac88_6a80199bd738 │
│ plugin              │ ¤                                               │
│ slot_type           │ physical                                        │
│ datoid              │ ¤                                               │
│ database            │ ¤                                               │
│ active              │ t                                               │
│ active_pid          │ 1126                                            │
│ xmin                │ 2453745071                                      │
│ catalog_xmin        │ ¤                                               │
│ restart_lsn         │ C0/BEF7C2D0                                     │
│ confirmed_flush_lsn │ ¤                                               │
│ age                 │ 59                                              │
└─────────────────────┴─────────────────────────────────────────────────┘
```
In this example, the replication slot is healthy (age == 59). If the age were near 2 billion, you would want to delete the slot. There is no easy way to know which replica is which if the query returns multiple records, so check them all in case there is a long-running transaction on any replica.
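On a self-managed PostgreSQL server you could also drop an abandoned slot directly. The following is a sketch under that assumption (on Cloud SQL, the slots are managed for you, so delete the replica instead; the slot name is taken from the sample output above):

```sql
-- Assumption: self-managed PostgreSQL, not a Cloud SQL managed slot.
-- Dropping fails while the slot is active, so terminate its walsender first.
SELECT pg_terminate_backend(active_pid)
FROM pg_replication_slots
WHERE slot_name = 'cloudsql_1_355b114b_9ff4_4bc3_ac88_6a80199bd738'
  AND active;

SELECT pg_drop_replication_slot('cloudsql_1_355b114b_9ff4_4bc3_ac88_6a80199bd738');
```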
Long-running transactions on replicas
Check the replicas for the oldest running transaction with hot_standby_feedback set to on, and disable it on the replica.
The backend_xmin column in the pg_stat_replication view has the oldest TXID needed on the replica.
To move it forward, stop the query that holds it back on the replica. To discover which query is holding it back, use the query in Long-running transactions, but this time, run it on the replica.
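To see which replica connection holds back the oldest xmin, a query like the following can help (a sketch using standard pg_stat_replication columns):

```sql
-- Shows each streaming replica and how old its reported xmin is.
SELECT pid, client_addr, application_name,
       age(backend_xmin) AS xmin_age
FROM pg_stat_replication
ORDER BY xmin_age DESC NULLS LAST;
```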
Another option is to restart the replica.
Configure VACUUM
Set the following two flags:
- autovacuum_vacuum_cost_delay = 0
- autovacuum_work_mem = 1048576
The first flag disables disk throttling for vacuuming so that VACUUM can run at full speed. By default, autovacuum is throttled so that it doesn't use up all disk IO on the slowest servers.
The second flag, autovacuum_work_mem, decreases the number of index cleanup passes. If possible, it should be large enough to store the IDs of all dead rows in the table that VACUUM is going to clean up. When setting this value, consider that it is the maximum amount of local memory each running VACUUM can allocate. Make sure that you're not allowing more than is available, with some left in reserve. If you leave the database running in read-only mode, then also account for the local memory used by read-only queries.
On most systems, use the maximum value (1 GB, or 1048576 kB, as shown in the sample). This value fits up to about 178 million dead tuples; any more still causes multiple index scan passes.
These and other flags are explained in more detail in Optimizing, monitoring, and troubleshooting VACUUM operations in PostgreSQL.
After setting these flags, restart the database so that autovacuum starts with the new values.
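On Cloud SQL, these flags can also be set with gcloud. The following is a sketch, assuming an instance named INSTANCE_NAME; note that --database-flags replaces the full set of flags on the instance, so include any flags that are already configured:

```shell
gcloud sql instances patch INSTANCE_NAME \
  --database-flags=autovacuum_vacuum_cost_delay=0,autovacuum_work_mem=1048576
```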
You can use the pg_stat_progress_vacuum view to monitor the progress of autovacuum-started VACUUMs. This view shows VACUUMs running in all databases, but for tables (relations) in other databases, you can't look up the table name using the view column relid.
To identify the databases and tables that need vacuuming next, use the queries from Optimizing, monitoring, and troubleshooting VACUUM operations in PostgreSQL. If the server VM is powerful enough and has the bandwidth for more parallel VACUUM processes than autovacuum starts, you can start some manual vacuums.
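A manual, freeze-oriented vacuum of the most endangered table might look like the following (a sketch; the schema and table name are placeholders):

```sql
-- VERBOSE reports progress to the client; FREEZE aggressively freezes
-- old transaction IDs, which is what moves relfrozenxid forward.
VACUUM (FREEZE, VERBOSE) public.my_big_table;
```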
Check VACUUM speed
This section describes how to check VACUUM speed and how to accelerate it, if needed.
Check running autovacuums
All backends running VACUUM are visible in the system view pg_stat_progress_vacuum.
If the current phase is scanning heap, then you can monitor progress by watching changes in the column heap_blks_scanned. Unfortunately, there is no easy way to determine scan speed in other phases.
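For example, the following query (a sketch over standard pg_stat_progress_vacuum columns) reports each VACUUM's heap-scan progress as a percentage:

```sql
-- nullif avoids division by zero when heap_blks_total is not yet known.
SELECT pid, datname, phase,
       round(100.0 * heap_blks_scanned / nullif(heap_blks_total, 0), 1)
         AS heap_pct_scanned
FROM pg_stat_progress_vacuum;
```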
Estimate the VACUUM scan speed
To estimate the scan speed, you first need to store the base values and then calculate the change over time to estimate the completion time. First, save a snapshot of heap_blks_scanned together with a timestamp by using the following snapshot query:
```sql
SELECT set_config('save.ts', clock_timestamp()::text, false),
       set_config('save.heap_blks_scanned', heap_blks_scanned::text, false)
FROM pg_stat_progress_vacuum
WHERE datname = 'DB_NAME';
```
Since we can't save anything in tables that are already in wraparound, use set_config(flag, value) to set two user-defined flags, save.ts and save.heap_blks_scanned, to the current values from pg_stat_progress_vacuum.
In the next query, we use these two values as the comparison base to determine speed and estimate the completion time.
NOTE: WHERE datname = 'DB_NAME' restricts the investigation to one database at a time. This filter is enough if only one autovacuum is running in that database. If more than one row is returned per database, add extra filter conditions (for example, AND relid = ...) to select a single autovacuum row. The same applies to the next query.
Once you've saved the base values, you can run the following query:
```sql
WITH q AS (
  SELECT datname,
         phase,
         heap_blks_total,
         heap_blks_scanned,
         clock_timestamp() - current_setting('save.ts')::timestamp AS ts_delta,
         heap_blks_scanned - current_setting('save.heap_blks_scanned')::bigint AS scanned_delta
  FROM pg_stat_progress_vacuum
  WHERE datname = 'DB_NAME'
), q2 AS (
  SELECT *,
         scanned_delta / extract('epoch' FROM ts_delta) AS pages_per_second
  FROM q
)
SELECT *,
       (heap_blks_total - heap_blks_scanned) / pages_per_second AS remaining_time
FROM q2;
```
```
┌─[ RECORD 1 ]──────┬──────────────────┐
│ datname           │ DB_NAME          │
│ phase             │ scanning heap    │
│ heap_blks_total   │ 9497174          │
│ heap_blks_scanned │ 18016            │
│ ts_delta          │ 00:00:40.30126   │
│ scanned_delta     │ 11642            │
│ pages_per_second  │ 288.87434288655  │
│ remaining_time    │ 32814.1222418038 │
└───────────────────┴──────────────────┘
```
This query compares the current values to the saved base values and calculates pages_per_second and remaining_time (in seconds), which lets you decide whether VACUUM is running fast enough or whether to speed it up. The remaining_time value covers only the scanning heap phase. Other phases also take time, sometimes even more. You can read more on vacuuming and view blog posts discussing some of the complex aspects of vacuum.
Speed up VACUUM
The easiest and fastest way to make the VACUUM scan faster is to set autovacuum_vacuum_cost_delay = 0. This can be done from the Google Cloud console.
Unfortunately, an already running VACUUM does not pick up this value, and you might need to restart the database.
After a restart, you might see a result similar to the following:
```
┌─[ RECORD 1 ]──────┬──────────────────┐
│ datname           │ DB_NAME          │
│ phase             │ scanning heap    │
│ heap_blks_total   │ 9497174          │
│ heap_blks_scanned │ 222382           │
│ ts_delta          │ 00:00:21.422615  │
│ scanned_delta     │ 138235           │
│ pages_per_second  │ 6452.76031894332 │
│ remaining_time    │ 1437.33713040171 │
└───────────────────┴──────────────────┘
```
In this sample, the speed increased from <300 pages/sec to ~6500 pages/sec, andthe expected remaining time for the heap scanning phase decreased from 9 hoursto 23 minutes.
The scan speed of the other phases is not as easy to measure, but they shouldshow a similar speedup.
Also consider making autovacuum_work_mem as large as possible to avoid multiple passes over the indexes. An index pass happens each time the memory fills up with dead tuple pointers.
If the database is not otherwise in use, set autovacuum_work_mem to about 80% of the memory that remains free after allowing the required amount for shared_buffers. This is the upper limit for each of the autovacuum-started VACUUM processes. If you want to continue running read-only workloads, use less memory.
Other ways to improve speed
Avoid vacuuming indexes
For huge tables, VACUUM spends most of the time cleaning up indexes.
PostgreSQL 14 has special optimizations for avoiding index cleanup if the system is in danger of wraparound.
In PostgreSQL 12 and 13, you can manually run the following statement:
```sql
VACUUM (INDEX_CLEANUP OFF, TRUNCATE OFF) <tablename>;
```

In versions 11 and older, you can DROP the index before running the vacuum and recreate it later.
Dropping an index while an autovacuum is already running on that table requires cancelling the running vacuum and then immediately executing the DROP INDEX command before autovacuum manages to start vacuuming that table again.
First, run the following statement to find the PID of the autovacuum process you need to terminate:
```sql
SELECT pid, query
FROM pg_stat_activity
WHERE state != 'idle' AND query ilike '%vacuum%';
```

Then run the following statements to terminate the running vacuum and drop one or more indexes:
```sql
SET cloudsql.enable_maintenance_mode = 'on'; /* get extra transaction ids */
SELECT pg_terminate_backend(<pid>);
DROP INDEX <index1>;
DROP INDEX <index2>;
...
```

Drop the offending table
In some rare cases, you can drop the table. For example, if it's a table that's easy to restore from another source like a backup or other database.
You still need to use cloudsql.enable_maintenance_mode = 'on' and likely also terminate the VACUUM on that table, as shown in the previous section.
VACUUM FULL
In rare cases, it's faster to run VACUUM FULL FREEZE, usually when the table has only a small proportion of live tuples. This can be checked in the pg_stat_user_tables view (unless a crash has wiped out the statistics).
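The proportion of live tuples can be checked with a query like the following (a sketch over standard pg_stat_user_tables columns):

```sql
-- Tables with a low live_pct are candidates for VACUUM FULL FREEZE.
SELECT relname, n_live_tup, n_dead_tup,
       round(100.0 * n_live_tup / nullif(n_live_tup + n_dead_tup, 0), 1)
         AS live_pct
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```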
The VACUUM FULL command copies live tuples to a new file, so enough space has to be available for the new file and its indexes.
What's next
- Learn more about VACUUM for wraparound.
- Learn more about routine vacuuming.
- Learn more about automatic vacuuming.
- Learn more about Optimizing, monitoring, and troubleshooting VACUUM operations in PostgreSQL.
Last updated 2025-12-17 UTC.