Movatterモバイル変換

25.1. SQL Dump
Prev	Up	Chapter 25. Backup and Restore	Home	Next

25.1. SQL Dump #

25.1.1. Restoring the Dump
25.1.2. Usingpg_dumpall
25.1.3. Handling Large Databases

The idea behind this dump method is to generate a file with SQL commands that, when fed back to the server, will recreate the database in the same state as it was at the time of the dump.PostgreSQL provides the utility programpg_dump for this purpose. The basic usage of this command is:

pg_dumpdbname >dumpfile

As you see,pg_dump writes its result to the standard output. We will see below how this can be useful. While the above command creates a text file,pg_dump can create files in other formats that allow for parallelism and more fine-grained control of object restoration.

pg_dump is a regularPostgreSQL client application (albeit a particularly clever one). This means that you can perform this backup procedure from any remote host that has access to the database. But remember thatpg_dump does not operate with special permissions. In particular, it must have read access to all tables that you want to back up, so in order to back up the entire database you almost always have to run it as a database superuser. (If you do not have sufficient privileges to back up the entire database, you can still back up portions of the database to which you do have access using options such as-nschema or-ttable.)

To specify which database serverpg_dump should contact, use the command line options-hhost and-pport. The default host is the local host or whatever yourPGHOST environment variable specifies. Similarly, the default port is indicated by thePGPORT environment variable or, failing that, by the compiled-in default. (Conveniently, the server will normally have the same compiled-in default.)

Like any otherPostgreSQL client application,pg_dump will by default connect with the database user name that is equal to the current operating system user name. To override this, either specify the-U option or set the environment variablePGUSER. Remember thatpg_dump connections are subject to the normal client authentication mechanisms (which are described inChapter 20).

An important advantage ofpg_dump over the other backup methods described later is thatpg_dump's output can generally be re-loaded into newer versions ofPostgreSQL, whereas file-level backups and continuous archiving are both extremely server-version-specific.pg_dump is also the only method that will work when transferring a database to a different machine architecture, such as going from a 32-bit to a 64-bit server.

Dumps created bypg_dump are internally consistent, meaning, the dump represents a snapshot of the database at the timepg_dump began running.pg_dump does not block other operations on the database while it is working. (Exceptions are those operations that need to operate with an exclusive lock, such as most forms ofALTER TABLE.)

25.1.1. Restoring the Dump#

Text files created bypg_dump are intended to be read by thepsql program using its default settings. The general command form to restore a text dump is

psql -Xdbname <dumpfile

wheredumpfile is the file output by thepg_dump command. The databasedbname will not be created by this command, so you must create it yourself fromtemplate0 before executingpsql (e.g., withcreatedb -T template0dbname). To ensurepsql runs with its default settings, use the-X (--no-psqlrc) option.psql supports options similar topg_dump for specifying the database server to connect to and the user name to use. See thepsql reference page for more information.

Non-text file dumps should be restored using thepg_restore utility.

Before restoring an SQL dump, all the users who own objects or were granted permissions on objects in the dumped database must already exist. If they do not, the restore will fail to recreate the objects with the original ownership and/or permissions. (Sometimes this is what you want, but usually it is not.)

By default, thepsql script will continue to execute after an SQL error is encountered. You might wish to runpsql with theON_ERROR_STOP variable set to alter that behavior and havepsql exit with an exit status of 3 if an SQL error occurs:

psql -X --set ON_ERROR_STOP=ondbname <dumpfile

Either way, you will only have a partially restored database. Alternatively, you can specify that the whole dump should be restored as a single transaction, so the restore is either fully completed or fully rolled back. This mode can be specified by passing the-1 or--single-transaction command-line options topsql. When using this mode, be aware that even a minor error can rollback a restore that has already run for many hours. However, that might still be preferable to manually cleaning up a complex database after a partially restored dump.

The ability ofpg_dump andpsql to write to or read from pipes makes it possible to dump a database directly from one server to another, for example:

pg_dump -hhost1dbname | psql -X -hhost2dbname

Important

The dumps produced bypg_dump are relative totemplate0. This means that any languages, procedures, etc. added viatemplate1 will also be dumped bypg_dump. As a result, when restoring, if you are using a customizedtemplate1, you must create the empty database fromtemplate0, as in the example above.

After restoring a backup, it is wise to runANALYZE on each database so the query optimizer has useful statistics; seeSection 24.1.3 andSection 24.1.6 for more information. For more advice on how to load large amounts of data intoPostgreSQL efficiently, refer toSection 14.4.

25.1.2. Usingpg_dumpall#

pg_dump dumps only a single database at a time, and it does not dump information about roles or tablespaces (because those are cluster-wide rather than per-database). To support convenient dumping of the entire contents of a database cluster, thepg_dumpall program is provided.pg_dumpall backs up each database in a given cluster, and also preserves cluster-wide data such as role and tablespace definitions. The basic usage of this command is:

pg_dumpall >dumpfile

The resulting dump can be restored withpsql:

psql -X -fdumpfile postgres

(Actually, you can specify any existing database name to start from, but if you are loading into an empty cluster thenpostgres should usually be used.) It is always necessary to have database superuser access when restoring apg_dumpall dump, as that is required to restore the role and tablespace information. If you use tablespaces, make sure that the tablespace paths in the dump are appropriate for the new installation.

pg_dumpall works by emitting commands to re-create roles, tablespaces, and empty databases, then invokingpg_dump for each database. This means that while each database will be internally consistent, the snapshots of different databases are not synchronized.

Cluster-wide data can be dumped alone using thepg_dumpall--globals-only option. This is necessary to fully backup the cluster if running thepg_dump command on individual databases.

25.1.3. Handling Large Databases#

Some operating systems have maximum file size limits that cause problems when creating largepg_dump output files. Fortunately,pg_dump can write to the standard output, so you can use standard Unix tools to work around this potential problem. There are several possible methods:

Use compressed dumps. You can use your favorite compression program, for examplegzip:

pg_dumpdbname | gzip >filename.gz

Reload with:

gunzip -cfilename.gz | psqldbname

or:

catfilename.gz | gunzip | psqldbname

Usesplit. Thesplit command allows you to split the output into smaller files that are acceptable in size to the underlying file system. For example, to make 2 gigabyte chunks:

pg_dumpdbname | split -b 2G -filename

Reload with:

catfilename* | psqldbname

If using GNUsplit, it is possible to use it andgzip together:

pg_dumpdbname | split -b 2G --filter='gzip > $FILE.gz'

It can be restored usingzcat.

Usepg_dump's custom dump format. IfPostgreSQL was built on a system with thezlib compression library installed, the custom dump format will compress data as it writes it to the output file. This will produce dump file sizes similar to usinggzip, but it has the added advantage that tables can be restored selectively. The following command dumps a database using the custom dump format:

pg_dump -Fcdbname >filename

A custom-format dump is not a script forpsql, but instead must be restored withpg_restore, for example:

pg_restore -ddbnamefilename

See thepg_dump andpg_restore reference pages for details.

For very large databases, you might need to combinesplit with one of the other two approaches.

Usepg_dump's parallel dump feature. To speed up the dump of a large database, you can usepg_dump's parallel mode. This will dump multiple tables at the same time. You can control the degree of parallelism with the-j parameter. Parallel dumps are only supported for the "directory" archive format.

pg_dump -jnum -F d -fout.dirdbname

You can usepg_restore -j to restore a dump in parallel. This will work for any archive of either the "custom" or the "directory" archive mode, whether or not it has been created withpg_dump -j.

Prev	Up	Next
Chapter 25. Backup and Restore	Home	25.2. File System Level Backup

epub pdf