@@ -875,6 +875,209 @@ primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
875875 </sect3>
876876
877877 </sect2>
878+ <sect2 id="synchronous-replication">
879+ <title>Synchronous Replication</title>
880+
881+ <indexterm zone="high-availability">
882+ <primary>Synchronous Replication</primary>
883+ </indexterm>
884+
885+ <para>
886+ <productname>PostgreSQL</> streaming replication is asynchronous by
887+ default. If the primary server
888+ crashes then some transactions that were committed may not have been
889+ replicated to the standby server, causing data loss. The amount
890+ of data loss is proportional to the replication delay at the time of
891+ failover.
892+ </para>
893+
894+ <para>
895+ Synchronous replication offers the ability to confirm that all changes
896+ made by a transaction have been transferred to one synchronous standby
897+ server. This extends the standard level of durability
898+ offered by a transaction commit. This level of protection is referred
899+ to as 2-safe replication in computer science theory.
900+ </para>
901+
902+ <para>
903+ When requesting synchronous replication, each commit of a
904+ write transaction will wait until confirmation is
905+ received that the commit has been written to the transaction log on disk
906+ of both the primary and standby server. The only possibility that data
907+ can be lost is if both the primary and the standby suffer crashes at the
908+ same time. This can provide a much higher level of durability, though only
909+ if the sysadmin is cautious about the placement and management of the two
910+ servers. Waiting for confirmation increases the user's confidence that the
911+ changes will not be lost in the event of server crashes but it also
912+ necessarily increases the response time for the requesting transaction.
913+ The minimum wait time is the round-trip time between primary and standby.
914+ </para>
915+
916+ <para>
917+ Read-only transactions and transaction rollbacks need not wait for
918+ replies from standby servers. Subtransaction commits do not wait for
919+ responses from standby servers, only top-level commits. Long-running
920+ actions such as data loading or index building wait only at
921+ the very final commit message. All two-phase commit actions
922+ require commit waits, including both prepare and commit.
923+ </para>
924+
925+ <sect3 id="synchronous-replication-config">
926+ <title>Basic Configuration</title>
927+
928+ <para>
929+ All parameters have useful default values, so we can enable
930+ synchronous replication simply by setting this on the primary:
931+
932+ <programlisting>
933+ synchronous_replication = on
934+ </programlisting>
935+
936+ When <varname>synchronous_replication</> is set, a commit will wait
937+ for confirmation that the standby has received the commit record,
938+ even if that takes a very long time.
939+ <varname>synchronous_replication</> can be set by individual
940+ users, so it can be configured in the configuration file, for particular
941+ users or databases, or dynamically by application programs.
942+ </para>
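+
+ <para>
+ As a minimal sketch of those scopes, assuming a hypothetical role
+ <literal>billing_app</> and a hypothetical database
+ <literal>chatlogs</> (neither name comes from this document), the
+ parameter might be set as follows:
+
+ <programlisting>
+ -- per user: sessions of this role request synchronous replication
+ ALTER ROLE billing_app SET synchronous_replication = on;
+
+ -- per database: sessions connected here do not wait for the standby
+ ALTER DATABASE chatlogs SET synchronous_replication = off;
+
+ -- dynamically, for the current session only
+ SET synchronous_replication = on;
+ </programlisting>
+ </para>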
943+
944+ <para>
945+ After a commit record has been written to disk on the primary the
946+ WAL record is then sent to the standby. The standby sends reply
947+ messages each time a new batch of WAL data is received, unless
948+ <varname>wal_receiver_status_interval</> is set to zero on the standby.
949+ If the standby is the first matching standby, as specified in
950+ <varname>synchronous_standby_names</> on the primary, the reply
951+ messages from that standby will be used to wake users waiting for
952+ confirmation that the commit record has been received. These parameters
953+ allow the administrator to specify which standby servers should be
954+ synchronous standbys. Note that the configuration of synchronous
955+ replication is mainly on the primary.
956+ </para>
957+
958+ <para>
959+ Users will stop waiting if a fast shutdown is requested, though the
960+ server does not fully shut down until all outstanding WAL records are
961+ transferred to standby servers.
962+ </para>
963+
964+ <para>
965+ Note also that <varname>synchronous_commit</> is used when the user
966+ specifies <varname>synchronous_replication</>, overriding even an
967+ explicit setting of <varname>synchronous_commit</> to <literal>off</>.
968+ This is because we must write WAL to disk on the primary before we replicate
969+ to ensure the standby never gets ahead of the primary.
970+ </para>
971+
972+ </sect3>
973+
974+ <sect3 id="synchronous-replication-performance">
975+ <title>Planning for Performance</title>
976+
977+ <para>
978+ Synchronous replication usually requires carefully planned and placed
979+ standby servers to ensure applications perform acceptably. Waiting
980+ doesn't utilise system resources, but transaction locks continue to be
981+ held until the transfer is confirmed. As a result, incautious use of
982+ synchronous replication will reduce performance for database
983+ applications because of increased response times and higher contention.
984+ </para>
985+
986+ <para>
987+ <productname>PostgreSQL</> allows the application developer
988+ to specify the durability level required via replication. This can be
989+ specified for the system overall, though it can also be specified for
990+ specific users or connections, or even individual transactions.
991+ </para>
992+
993+ <para>
994+ For example, an application workload might consist of:
995+ 10% of changes are important customer details, while
996+ 90% of changes are less important data that the business can more
997+ easily survive if it is lost, such as chat messages between users.
998+ </para>
999+
1000+ <para>
1001+ With synchronous replication options specified at the application level
1002+ (on the primary) we can offer sync rep for the most important changes,
1003+ without slowing down the bulk of the total workload. Application level
1004+ options are an important and practical tool for allowing the benefits of
1005+ synchronous replication for high performance applications.
1006+ </para>
1007+
1008+ <para>
1009+ You should consider that the network bandwidth must be higher than
1010+ the rate of generation of WAL data.
1014+ </para>
1015+
1016+ </sect3>
1017+
1018+ <sect3 id="synchronous-replication-ha">
1019+ <title>Planning for High Availability</title>
1020+
1021+ <para>
1022+ Commits made when <varname>synchronous_replication</> is set will wait until
1023+ the sync standby responds. The response may never occur if the last,
1024+ or only, standby should crash.
1025+ </para>
1026+
1027+ <para>
1028+ The best solution for avoiding data loss is to ensure you don't lose
1029+ your last remaining sync standby. This can be achieved by naming multiple
1030+ potential synchronous standbys using <varname>synchronous_standby_names</>.
1031+ The first named standby will be used as the synchronous standby. Standbys
1032+ listed after this will take over the role of synchronous standby if the
1033+ first one should fail.
1034+ </para>
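+
+ <para>
+ A possible setting on the primary is sketched below. It assumes that
+ the parameter takes a comma-separated list and that two standbys
+ identify themselves by the hypothetical names <literal>standby1</>
+ and <literal>standby2</>:
+
+ <programlisting>
+ # standby1 is the synchronous standby; standby2 takes over if it fails
+ synchronous_standby_names = 'standby1, standby2'
+ </programlisting>
+ </para>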
1035+
1036+ <para>
1037+ When a standby first attaches to the primary, it will not yet be properly
1038+ synchronized. This is described as <literal>CATCHUP</> mode. Once
1039+ the lag between standby and primary reaches zero for the first time
1040+ we move to real-time <literal>STREAMING</> state.
1041+ The catch-up duration may be long immediately after the standby has
1042+ been created. If the standby is shut down, then the catch-up period
1043+ will increase according to the length of time the standby has been down.
1044+ The standby is only able to become a synchronous standby
1045+ once it has reached <literal>STREAMING</> state.
1046+ </para>
1047+
1048+ <para>
1049+ If the primary restarts while commits are waiting for acknowledgement, those
1050+ waiting transactions will be marked fully committed once the primary
1051+ database recovers.
1052+ There is no way to be certain that all standbys have received all
1053+ outstanding WAL data at time of the crash of the primary. Some
1054+ transactions may not show as committed on the standby, even though
1055+ they show as committed on the primary. The guarantee we offer is that
1056+ the application will not receive explicit acknowledgement of the
1057+ successful commit of a transaction until the WAL data is known to be
1058+ safely received by the standby.
1059+ </para>
1060+
1061+ <para>
1062+ If you really do lose your last standby server then you should disable
1063+ <varname>synchronous_standby_names</> and restart the primary server.
1064+ </para>
1065+
1066+ <para>
1067+ If the primary is isolated from the remaining standby servers you should
1068+ fail over to the best candidate of those remaining standby servers.
1069+ </para>
1070+
1071+ <para>
1072+ If you need to re-create a standby server while transactions are
1073+ waiting, make sure that the commands to run pg_start_backup() and
1074+ pg_stop_backup() are run in a session with
1075+ synchronous_replication = off, otherwise those requests will wait
1076+ forever for the standby to appear.
1077+ </para>
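+
+ <para>
+ A session of that kind might look like the following sketch, in which
+ the backup label is an arbitrary example:
+
+ <programlisting>
+ SET synchronous_replication = off;
+ SELECT pg_start_backup('rebuild_standby');
+ -- copy the data directory to the new standby here
+ SELECT pg_stop_backup();
+ </programlisting>
+ </para>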
1078+
1079+ </sect3>
1080+ </sect2>
8781081 </sect1>
8791082
8801083 <sect1 id="warm-standby-failover">