1 |  | -<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.84 2006/09/15 21:55:07 momjian Exp $ --> |
| 1 | +<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.85 2006/09/15 22:02:21 momjian Exp $ --> |
2 | 2 |
|
3 | 3 | <chapter id="backup">
|
4 | 4 | <title>Backup and Restore</title>
|
@@ -1203,6 +1203,312 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
|
1203 | 1203 | </sect2>
|
1204 | 1204 | </sect1>
|
1205 | 1205 |
|
| 1206 | + <sect1 id="warm-standby"> |
| 1207 | + <title>Warm Standby Servers for High Availability</title> |
| 1208 | + |
| 1209 | + <indexterm zone="backup"> |
| 1210 | + <primary>Warm Standby</primary> |
| 1211 | + </indexterm> |
| 1212 | + |
| 1213 | + <indexterm zone="backup"> |
| 1214 | + <primary>PITR Standby</primary> |
| 1215 | + </indexterm> |
| 1216 | + |
| 1217 | + <indexterm zone="backup"> |
| 1218 | + <primary>Standby Server</primary> |
| 1219 | + </indexterm> |
| 1220 | + |
| 1221 | + <indexterm zone="backup"> |
| 1222 | + <primary>Log Shipping</primary> |
| 1223 | + </indexterm> |
| 1224 | + |
| 1225 | + <indexterm zone="backup"> |
| 1226 | + <primary>Witness Server</primary> |
| 1227 | + </indexterm> |
| 1228 | + |
| 1229 | + <indexterm zone="backup"> |
| 1230 | + <primary>STONITH</primary> |
| 1231 | + </indexterm> |
| 1232 | + |
| 1233 | + <indexterm zone="backup"> |
| 1234 | + <primary>High Availability</primary> |
| 1235 | + </indexterm> |
| 1236 | + |
| 1237 | + <para> |
| 1238 | + Continuous Archiving can be used to create a High Availability (HA) |
| 1239 | + cluster configuration with one or more Standby Servers ready to take |
| 1240 | + over operations in the case that the Primary Server fails. This |
| 1241 | + capability is more widely known as Warm Standby Log Shipping. |
| 1242 | + </para> |
| 1243 | + |
| 1244 | + <para> |
| 1245 | + The Primary and Standby Servers work together to provide this capability, |
| 1246 | + though the servers are only loosely coupled. The Primary Server operates |
| 1247 | + in Continuous Archiving mode, while the Standby Server operates in a |
| 1248 | + continuous Recovery mode, reading the WAL files from the Primary. No |
| 1249 | + changes to the database tables are required to enable this capability, |
| 1250 | + so it offers a low administration overhead in comparison with other |
| 1251 | + replication approaches. This configuration also has a very low |
| 1252 | + performance impact on the Primary server. |
| 1253 | + </para> |
| 1254 | + |
| 1255 | + <para> |
| 1256 | + Directly moving WAL or "log" records from one database server to another |
| 1257 | + is typically described as Log Shipping. PostgreSQL implements file-based |
| 1258 | + Log Shipping, meaning WAL records are batched one file at a time. WAL |
| 1259 | + files can be shipped easily and cheaply over any distance, whether it be |
| 1260 | + to an adjacent system, another system on the same site or another system |
| 1261 | + on the far side of the globe. The bandwidth required for this technique |
| 1262 | + varies according to the transaction rate of the Primary Server. |
| 1263 | + Record-based Log Shipping is also possible with custom-developed |
| 1264 | + procedures, discussed in a later section. Future developments are likely |
| 1265 | + to include options for synchronous and/or integrated record-based log |
| 1266 | + shipping. |
| 1267 | + </para> |
| 1268 | + |
| 1269 | + <para> |
| 1270 | + It should be noted that the log shipping is asynchronous, i.e. the WAL |
| 1271 | + records are shipped after transaction commit. As a result there can be a |
| 1272 | + small window of data loss, should the Primary Server suffer a |
| 1273 | + catastrophic failure. The window of data loss is minimised by the use of |
| 1274 | + the archive_timeout parameter, which can be set as low as a few seconds |
| 1275 | + if required. A very low setting can increase the bandwidth requirements |
| 1276 | + for file shipping. |
| 1277 | + </para> |
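|  | + |
|  | + <para> |
|  | + As an illustration only, the data-loss window might be bounded with |
|  | + settings along these lines in postgresql.conf on the Primary (the |
|  | + command and path shown are examples, not requirements): |
|  | +<programlisting> |
|  | +archive_command = 'cp %p /mnt/server/archivedir/%f'  # example archiving command |
|  | +archive_timeout = 60                                  # force a WAL file switch at least once a minute |
|  | +</programlisting> |
|  | + </para> |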
| 1278 | + |
| 1279 | + <para> |
| 1280 | + The Standby server is not available for access, since it is continually |
| 1281 | + performing recovery processing. Recovery performance is sufficiently |
| 1282 | + good that the Standby will typically be only minutes away from full |
| 1283 | + availability once it has been activated. As a result, we refer to this |
| 1284 | + capability as a Warm Standby configuration that offers High |
| 1285 | + Availability. Restoring a server from an archived base backup and |
| 1286 | + rollforward can take considerably longer and so that technique only |
| 1287 | + really offers a solution for Disaster Recovery, not HA. |
| 1288 | + </para> |
| 1289 | + |
| 1290 | + <para> |
| 1291 | + Other mechanisms for High Availability replication are available, both |
| 1292 | + commercially and as open-source software. |
| 1293 | + </para> |
| 1294 | + |
| 1295 | + <para> |
| 1296 | + In general, log shipping between servers running different release |
| 1297 | + levels will not be possible. It is the policy of the PostgreSQL Global |
| 1298 | + Development Group not to make changes to disk formats during minor release |
| 1299 | + upgrades, so it is likely that running different minor release levels |
| 1300 | + on Primary and Standby servers will work successfully. However, no |
| 1301 | + formal support for that is offered and you are advised not to allow this |
| 1302 | + to occur over long periods. |
| 1303 | + </para> |
| 1304 | + |
| 1305 | + <sect2 id="warm-standby-planning"> |
| 1306 | + <title>Planning</title> |
| 1307 | + |
| 1308 | + <para> |
| 1309 | + On the Standby server all tablespaces and paths will refer to similarly |
| 1310 | + named mount points, so it is important to create the Primary and Standby |
| 1311 | + servers so that they are as similar as possible, at least from the |
| 1312 | + perspective of the database server. Furthermore, any CREATE TABLESPACE |
| 1313 | + commands will be passed across as-is, so any new mount points must be |
| 1314 | + created on both servers before they are used on the Primary. Hardware |
| 1315 | + need not be the same, but experience shows that maintaining two |
| 1316 | + identical systems is easier than maintaining two dissimilar ones over |
| 1317 | + the whole lifetime of the application and system. |
| 1318 | + </para> |
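|  | + |
|  | + <para> |
|  | + For example (the tablespace name and directory here are illustrative |
|  | + only), a command such as |
|  | +<programlisting> |
|  | +CREATE TABLESPACE bulkdata LOCATION '/mnt/disk2/pg_bulkdata'; |
|  | +</programlisting> |
|  | + executed on the Primary will be replayed as-is on the Standby, so the |
|  | + directory /mnt/disk2/pg_bulkdata must already exist there too. |
|  | + </para> |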
| 1319 | + |
| 1320 | + <para> |
| 1321 | + There is no special mode required to enable a Standby server. The |
| 1322 | + operations that occur on both Primary and Standby servers are entirely |
| 1323 | + normal continuous archiving and recovery tasks. The primary point of |
| 1324 | + contact between the two database servers is the archive of WAL files |
| 1325 | + that both share: Primary writing to the archive, Standby reading from |
| 1326 | + the archive. Care must be taken to ensure that WAL archives for separate |
| 1327 | + servers do not become mixed together or confused. |
| 1328 | + </para> |
| 1329 | + |
| 1330 | + <para> |
| 1331 | + The magic that makes the two loosely coupled servers work together is |
| 1332 | + simply a restore_command that waits for the next WAL file to be archived |
| 1333 | + from the Primary. The restore_command is specified in the recovery.conf |
| 1334 | + file on the Standby Server. Normal recovery processing would request a |
| 1335 | + file from the WAL archive, causing an error if the file was unavailable. |
| 1336 | + For Standby processing it is normal for the next file to be unavailable, |
| 1337 | + so we must be patient and wait for it to appear. A waiting |
| 1338 | + restore_command can be written as a custom script that loops after |
| 1339 | + polling for the existence of the next WAL file. There must also be some |
| 1340 | + way to trigger failover, which should interrupt the restore_command, |
| 1341 | + break the loop and return a file not found error to the Standby Server. |
| 1342 | + This then ends recovery and the Standby will then come up as a normal |
| 1343 | + server. |
| 1344 | + </para> |
| 1345 | + |
| 1346 | + <para> |
| 1347 | + Sample code for the C version of the restore_command would be: |
| 1348 | +<programlisting> |
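|  | +/* NextWALFileReady(), CheckForExternalTrigger() and CopyWALFileForRecovery() |
|  | + * stand for routines to be supplied by the administrator */ |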
| 1349 | +triggered = false; |
| 1350 | +while (!NextWALFileReady() && !triggered) |
| 1351 | +{ |
| 1352 | + usleep(100000L); /* wait for ~0.1 sec */ |
| 1353 | + if (CheckForExternalTrigger()) |
| 1354 | + triggered = true; |
| 1355 | +} |
| 1356 | +if (!triggered) |
| 1357 | + CopyWALFileForRecovery(); |
| 1358 | +</programlisting> |
| 1359 | + </para> |
| 1360 | + |
| 1361 | + <para> |
| 1362 | + PostgreSQL does not provide the system software required to identify a |
| 1363 | + failure on the Primary and notify the Standby system, and then the |
| 1364 | + Standby database server. Many such tools exist and are well integrated |
| 1365 | + with other aspects of system failover, such as IP address migration. |
| 1366 | + </para> |
| 1367 | + |
| 1368 | + <para> |
| 1369 | + Triggering failover is an important part of planning and design. The |
| 1370 | + restore_command is executed in full once for each WAL file. The process |
| 1371 | + running the restore_command is therefore created and dies for each file, |
| 1372 | + so there is no daemon or server process, and we cannot use signals or |
| 1373 | + a signal handler. A more permanent notification is required to trigger |
| 1374 | + the failover. It is possible to use a simple timeout facility, |
| 1375 | + especially if used in conjunction with a known archive_timeout setting |
| 1376 | + on the Primary. This is somewhat error prone since a network problem or |
| 1377 | + busy Primary server might be sufficient to initiate failover. A notification |
| 1378 | + mechanism such as the explicit creation of a trigger file is less error |
| 1379 | + prone, if this can be arranged. |
| 1380 | + </para> |
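|  | + |
|  | + <para> |
|  | + A trigger-file check of the kind assumed by the sample code above |
|  | + could, as a sketch only, look like the following; the path of the |
|  | + trigger file is arbitrary and must match whatever the failover |
|  | + procedure creates: |
|  | +<programlisting> |
|  | +#include <stdbool.h> |
|  | +#include <sys/stat.h> |
|  | + |
|  | +#define TRIGGER_FILE  "/var/lib/pgsql/failover.trigger"  /* example path only */ |
|  | + |
|  | +/* Returns true once the failover trigger file has been created */ |
|  | +bool |
|  | +CheckForExternalTrigger(void) |
|  | +{ |
|  | +    struct stat statbuf; |
|  | + |
|  | +    return stat(TRIGGER_FILE, &statbuf) == 0; |
|  | +} |
|  | +</programlisting> |
|  | + </para> |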
| 1381 | + </sect2> |
| 1382 | + |
| 1383 | + <sect2 id="warm-standby-config"> |
| 1384 | + <title>Implementation</title> |
| 1385 | + |
| 1386 | + <para> |
| 1387 | + The short procedure for configuring a Standby Server is as follows. For |
| 1388 | + full details of each step, refer to previous sections as noted. |
| 1389 | + <orderedlist> |
| 1390 | + <listitem> |
| 1391 | + <para> |
| 1392 | + Set up Primary and Standby systems as near identically as possible, |
| 1393 | + including two identical copies of PostgreSQL at the same release level. |
| 1394 | + </para> |
| 1395 | + </listitem> |
| 1396 | + <listitem> |
| 1397 | + <para> |
| 1398 | + Set up Continuous Archiving from the Primary to a WAL archive located |
| 1399 | + in a directory on the Standby Server. Ensure that both <xref |
| 1400 | + linkend="guc-archive-command"> and <xref linkend="guc-archive-timeout"> |
| 1401 | + are set. (See <xref linkend="backup-archiving-wal">) |
| 1402 | + </para> |
| 1403 | + </listitem> |
| 1404 | + <listitem> |
| 1405 | + <para> |
| 1406 | + Make a Base Backup of the Primary Server. (See <xref |
| 1407 | + linkend="backup-base-backup">) |
| 1408 | + </para> |
| 1409 | + </listitem> |
| 1410 | + <listitem> |
| 1411 | + <para> |
| 1412 | + Begin recovery on the Standby Server from the local WAL archive, |
| 1413 | + using a recovery.conf that specifies a restore_command that waits |
| 1414 | + as described previously; an example appears below. (See <xref linkend="backup-pitr-recovery">) |
| 1415 | + </para> |
| 1416 | + </listitem> |
| 1417 | + </orderedlist> |
| 1418 | + </para> |
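|  | + |
|  | + <para> |
|  | + The recovery.conf referred to in the final step might contain just a |
|  | + single line; the waiting script named here is purely illustrative and |
|  | + stands for whatever looping command you have implemented: |
|  | +<programlisting> |
|  | +restore_command = '/usr/local/bin/wait_for_wal /mnt/server/archivedir %f "%p"' |
|  | +</programlisting> |
|  | + </para> |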
| 1419 | + |
| 1420 | + <para> |
| 1421 | + Recovery treats the WAL Archive as read-only, so once a WAL file has |
| 1422 | + been copied to the Standby system it can be copied to tape at the same |
| 1423 | + time as it is being used by the Standby database server to recover. |
| 1424 | + Thus, running a Standby Server for High Availability can be performed at |
| 1425 | + the same time as files are stored for longer term Disaster Recovery |
| 1426 | + purposes. |
| 1427 | + </para> |
| 1428 | + |
| 1429 | + <para> |
| 1430 | + For testing purposes, it is possible to run both Primary and Standby |
| 1431 | + servers on the same system. This does not provide any worthwhile |
| 1432 | + improvement on server robustness, nor would it be described as HA. |
| 1433 | + </para> |
| 1434 | + </sect2> |
| 1435 | + |
| 1436 | + <sect2 id="warm-standby-failover"> |
| 1437 | + <title>Failover</title> |
| 1438 | + |
| 1439 | + <para> |
| 1440 | + If the Primary Server fails then the Standby Server should begin |
| 1441 | + failover procedures. |
| 1442 | + </para> |
| 1443 | + |
| 1444 | + <para> |
| 1445 | + If the Standby Server fails then no failover need take place. If the |
| 1446 | + Standby Server can be restarted, then the recovery process can also be |
| 1447 | + immediately restarted, taking advantage of Restartable Recovery. |
| 1448 | + </para> |
| 1449 | + |
| 1450 | + <para> |
| 1451 | + If the Primary Server fails and then immediately restarts, you must have |
| 1452 | + a mechanism for informing it that it is no longer the Primary. This is |
| 1453 | + sometimes known as STONITH (Shoot The Other Node In The Head), which is |
| 1454 | + necessary to avoid situations where both systems think they are the |
| 1455 | + Primary, which can lead to confusion and ultimately data loss. |
| 1456 | + </para> |
| 1457 | + |
| 1458 | + <para> |
| 1459 | + Many failover systems use just two systems, the Primary and the Standby, |
| 1460 | + connected by some kind of heartbeat mechanism to continually verify the |
| 1461 | + connectivity between the two and the viability of the Primary. It is |
| 1462 | + also possible to use a third system, known as a Witness Server to avoid |
| 1463 | + some problems of inappropriate failover, but the additional complexity |
| 1464 | + may not be worthwhile unless it is set up with sufficient care and |
| 1465 | + rigorous testing. |
| 1466 | + </para> |
| 1467 | + |
| 1468 | + <para> |
| 1469 | + At the instant that failover takes place to the Standby, we have only a |
| 1470 | + single server in operation. This is known as a degenerate state. |
| 1471 | + The former Standby is now the Primary, but the former Primary is down |
| 1472 | + and may stay down. We must now fully re-create a Standby server, |
| 1473 | + either on the former Primary system when it comes up, or on a third, |
| 1474 | + possibly new, system. Once complete the Primary and Standby can be |
| 1475 | + considered to have switched roles. Some people choose to use a third |
| 1476 | + server to provide additional protection across the failover interval, |
| 1477 | + though clearly this complicates the system configuration and |
| 1478 | + operational processes (and this can also act as a Witness Server). |
| 1479 | + </para> |
| 1480 | + |
| 1481 | + <para> |
| 1482 | + So, switching from Primary to Standby Server can be fast, but requires |
| 1483 | + some time to re-prepare the failover cluster. Regular switching from |
| 1484 | + Primary to Standby is encouraged, since it allows the regular downtime |
| 1485 | + on each system required to maintain HA. This also acts as a test of the |
| 1486 | + failover so that it definitely works when you really need it. Written |
| 1487 | + administration procedures are advised. |
| 1488 | + </para> |
| 1489 | + </sect2> |
| 1490 | + |
| 1491 | + <sect2 id="warm-standby-record"> |
| 1492 | + <title>Implementing Record-based Log Shipping</title> |
| 1493 | + |
| 1494 | + <para> |
| 1495 | + The main features for Log Shipping in this release are based around the |
| 1496 | + file-based Log Shipping described above. It is also possible to |
| 1497 | + implement record-based Log Shipping using the pg_xlogfile_name_offset() |
| 1498 | + function, though this requires custom development. |
| 1499 | + </para> |
| 1500 | + |
| 1501 | + <para> |
| 1502 | + An external program can call pg_xlogfile_name_offset() to find out the |
| 1503 | + filename and the exact byte offset within it of the latest WAL pointer. |
| 1504 | + If the external program regularly polls the server it can find out how |
| 1505 | + far forward the pointer has moved. It can then access the WAL file |
| 1506 | + directly and copy those bytes across to a less up-to-date copy on a |
| 1507 | + Standby Server. |
| 1508 | + </para> |
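|  | + |
|  | + <para> |
|  | + For example, such a program might poll the Primary with a query like |
|  | + the following to learn the current WAL file name and byte offset, |
|  | + comparing the result with the previous poll to see how many new bytes |
|  | + need to be copied to the Standby: |
|  | +<programlisting> |
|  | +SELECT * FROM pg_xlogfile_name_offset(pg_current_xlog_location()); |
|  | +</programlisting> |
|  | + </para> |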
| 1509 | + </sect2> |
| 1510 | + </sect1> |
| 1511 | + |
1206 | 1512 | <sect1 id="migration">
|
1207 | 1513 | <title>Migration Between Releases</title>
|
1208 | 1514 |
|
|