pg_repack
pg_repack — utility andPostgres Pro Standard extension to reorganize tables
Synopsis
pg_repack
[option
...] [dbname
]
Description
pg_repack is aPostgres Pro Standard extension which lets you remove bloat from tables and indexes, and optionally restore the physical order of clustered indexes. UnlikeCLUSTER andVACUUM FULL
it works online, without holding an exclusive lock on the processed tables during processing.pg_repack is efficient to boot, with performance comparable to usingCLUSTER
directly.
pg_repack is a fork of the previoushttps://github.com/reorg/pg_reorg project.
You can choose one of the following methods to reorganize data:
Online
CLUSTER
(ordered by cluster index)Ordering by specified columns
Online
VACUUM FULL
(packing rows only)Rebuild or relocate only the indexes of a table
Note
Only superusers can use the utility.
Note
Target table must have a PRIMARY KEY, or at least a UNIQUE total index on a NOT NULL column.
Installation
On Linux systems,pg_repack is provided together withPostgres Pro as a separate pre-built package and requires thepostgrespro-std-11-server
package to be installed with all the dependencies. For the list of available packages and detailed installation instructions, seeChapter 16. On Windows systems,pg_repack is automatically installed as part ofPostgres Pro.
Once you havepg_repack installed, load thepg_repack
extension in the database you want to process, as follows:
$ psql -c "CREATE EXTENSION pg_repack" -dyour_database
You can later removepg_repack
from aPostgres Pro installation usingDROP EXTENSION pg_repack
.
If you are upgrading from a previous version ofpg_repack, just drop the old version from the database as explained above and install the new version.
Options
Reorganization Options
-a
--all
Attempt to repack all the databases of the cluster. Databases where the
pg_repack
extension is not installed will be skipped.-t
table
--table=
table
Reorganize the specified table(s) only. Multiple tables may be reorganized by writing multiple
-t
switches. By default, all eligible tables in the target databases are reorganized.-c
schema
--schema=
schema
Repack the tables in the specified schema(s) only. Multiple schemas may be repacked by writing multiple
-c
switches. Can be used in conjunction with--tablespace
to move tables to a different tablespace.-o
column
[, ...]--order-by=
column
[, ...]Perform an online
CLUSTER
ordered by the specified columns.-n
--no-order
Perform an online
VACUUM FULL
. Since version 1.2 this is the default for non-clustered tables.-N
--dry-run
Show what would be repacked and exit.
-j
num_jobs
--jobs=
num_jobs
Create the specified number of extra connections toPostgres Pro, and use these extra connections to parallelize the rebuild of indexes on each table. Parallel index builds are only supported for full-table repacks, not with
--index
or--only-indexes
options. If your server has extra cores and disk I/O available, this can be a useful way to speed uppg_repack.-s
tablespace
--tablespace=
tablespace
Move the repacked tables to the specified tablespace: essentially an online version of
ALTER TABLE ... SET TABLESPACE
. The tables' indexes are left in the original tablespace unless--moveidx
is specified too.-S
--moveidx
Also move the indexes of the repacked tables to the tablespace specified by the
--tablespace
option.-i
index
--index=
index
Repack the specified index(es) only. Multiple indexes may be repacked by writing multiple
-i
switches. May be used in conjunction with--tablespace
to move the index(es) to a different tablespace.-x
--only-indexes
Repack only the indexes of the specified table(s), which must be specified with the
--table
option.-T
secs
--wait-timeout=
secs
pg_repack needs to take an exclusive lock at the end of the reorganization. This setting controls how many secondspg_repack will wait to acquire this lock. If the lock cannot be taken after this duration,pg_repack will forcibly cancel the conflicting queries. If you are usingPostgres Pro orPostgreSQL version 8.4 or newer,pg_repack will fall back to using
pg_terminate_backend()
to disconnect any remaining backends after this timeout has passed twice. The default is 60 seconds.-Z
--no-analyze
Disable
ANALYZE
after a full-table reorganization. If not specified,ANALYZE
is executed after the reorganization.
Connection Options
[-d]
dbname
[--dbname=]
dbname
Specifies the name of the database to be reorganized. If this is not specified and
-a
(or--all
) is not used, the database name is read from the environment variablePGDATABASE
. If that is not set, the user name specified for the connection is used.-h
host
--host=
host
Specifies the host name of the machine on which the server is running. If the value begins with a slash, it is used as the directory for the Unix domain socket.
-p
port
--port=
port
Specifies the TCP port or local Unix domain socket file extension on which the server is listening for connections.
-U
username
--username=
username
User name to connect as.
-w
--no-password
Never issue a password prompt. If the server requires password authentication and a password is not available by other means such as a
.pgpass
file, the connection attempt will fail. This option can be useful in batch jobs and scripts where no user is present to enter a password.-W
--password
Forcepg_repack to prompt for a password before connecting to a database.
This option is never essential, sincepg_repack will automatically prompt for a password if the server demands password authentication. However,pg_repack will waste a connection attempt finding out that the server wants a password. In some cases it is worth typing
-W
to avoid the extra connection attempt.
Generic Options
-e
--echo
Echo the commands thatpg_repack generates and sends to the server.
-E
level
--elevel=
level
Choose the output message level from
DEBUG
,INFO
,NOTICE
,WARNING
,ERROR
,LOG
,FATAL
, andPANIC
. The default isINFO
.--help
Show help aboutpg_repack command line arguments, and exit.
-V
--version
Print thepg_repack version and exit.
Environment
PGDATABASE
PGHOST
PGPORT
PGUSER
Default connection parameters
This utility, like most otherPostgres Pro utilities, also uses the environment variables supported bylibpq (seeSection 32.14).
Examples
Perform an onlineCLUSTER
of all the clustered tables in the databasetest
, and perform an onlineVACUUM FULL
of all the non-clustered tables:
$ pg_repack test
Perform an onlineVACUUM FULL
on the tablesfoo
andbar
in the databasetest
(an eventual cluster index is ignored):
$ pg_repack --no-order --table foo --table bar test
Move all indexes of tablefoo
to tablespacetbs
:
$ pg_repack -d test --table foo --only-indexes --tablespace tbs
Move the specified index to tablespacetbs
:
$ pg_repack -d test --index idx --tablespace tbs
Diagnostics
Error messages are reported whenpg_repack fails. The following list shows the cause of errors.
You need to cleanup by hand after fatal errors. To cleanup, just removepg_repack
from the database and install it again.
ForPostgres Pro or PostgreSQL 9.1 and newer execute:
DROP EXTENSION pg_repack CASCADE
in the database where the error occurred, followed by
CREATE EXTENSION pg_repack
For previous versions load the script$SHAREDIR/contrib/uninstall_pg_repack.sql
into the database where the error occurred and then load$SHAREDIR/contrib/pg_repack.sql
again.
INFO: database "db" skipped:pg_repack VER is not installed in the database:pg_repack is not installed in the database when the --all option is specified.
Create thepg_repack
extension in the database.
ERROR: pg_repack VER is not installed in the database:pg_repack is not installed in the database specified by --dbname
Create thepg_repack
extension in the database.
ERROR: program 'pg_repack V1' does not match database library 'pg_repack V2':There is a mismatch between the pg_repack binary and the database library(.so or .dll).
The mismatch could be due to the wrong binary in thePATH
or the wrong database being addressed. Check the program directory and the database; if they are what expected you may need to repeatpg_repack
installation.
ERROR: extension 'pg_repack V1' required, found 'pg_repack V2':The SQL extension found in the database does not match the version required by the pg_repack program.
You should drop the extension from the database and reload it as described inthe section called “Installation”.
ERROR: relation "table" must have a primary key or not-null unique keys:The target table doesn't have a PRIMARY KEY or any UNIQUE constraints defined.
Define a PRIMARY KEY or a UNIQUE constraint on the table.
ERROR: query failed: ERROR: column "col" does not exist:The target table doesn't have columns specified by --order-by option.
Specify existing columns.
WARNING: the table "tbl" already has a trigger called z_repack_trigger:The trigger was probably installed during a previous attemptto run pg_repack on the table which was interruptedand for some reason failed to clean up the temporary objects.
You can remove all the temporary objects by dropping and re-creating the extension: seethe section called “Installation” for the details.
WARNING: trigger "trg" conflicting on table "tbl":The target table has a trigger whose name follows z_repack_triggerin alphabetical order.
Thez_repack_trigger
should be the last BEFORE trigger to fire. Please rename your trigger so that it sorts alphabetically beforepg_repack
one; you can use:
ALTER TRIGGERzzz_my_trigger
ONsometable
RENAME TOyyy_my_trigger
;
ERROR: Another pg_repack command may be running on the table. Please try again later.
There is a chance of deadlock when two concurrentpg_repack
commands are run on the same table. So, try to run the command after some time.
WARNING: Cannot create index "schema"."index_xxxxx", already existsDETAIL: An invalid index may have been left behind by a previouspg_repack on the table which was interrupted. Please use DROP INDEX"schema"."index_xxxxx" to remove this index and try again.
A temporary index apparently created bypg_repack has been left behind, and we do not want to risk dropping this index ourselves. If the index was in fact created by an oldpg_repack job which didn't get cleaned up, you should just useDROP INDEX
and try therepack
command again.
Restrictions
pg_repack comes with the following restrictions.
- Temporary tables
pg_repack cannot reorganize temporary tables.
- GiST indexes
pg_repack cannot cluster tables by
GiST
indexes.- DDL commands
You will not be able to perform DDL commands of the target table(s)except
VACUUM
orANALYZE
whilepg_repack is working.pg_repack will hold anACCESS SHARE
lock on the target table during a full-table repack, to enforce this restriction.If you are using version 1.1.8 or earlier, you must not attempt to perform any DDL commands on the target table(s) whilepg_repack is running. In many casespg_repack would fail and rollback correctly, but there were some cases in these earlier versions which could result in data corruption.