@@ -687,6 +687,100 @@ ALTER SUBSCRIPTION
687687
688688 </sect1>
689689
690+ <sect1 id="logical-replication-failover">
691+ <title>Logical Replication Failover</title>
692+
693+ <para>
694+ To allow subscriber nodes to continue replicating data from the publisher
695+ node even when the publisher node goes down, there must be a physical standby
696+ corresponding to the publisher node. The logical slots on the primary server
697+ corresponding to the subscriptions can be synchronized to the standby server by
698+ specifying <literal>failover = true</literal> when creating subscriptions. See
699+ <xref linkend="logicaldecoding-replication-slots-synchronization"/> for details.
700+ Enabling the
701+ <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link>
702+ parameter ensures a seamless transition of those subscriptions after the
703+ standby is promoted. They can continue subscribing to publications on the
704+ new primary server without losing data. Note that in the case of
705+ asynchronous replication, there remains a risk of data loss for transactions
706+ that were committed on the former primary server but had not yet been replicated
707+ to the new primary server.
708+ </para>
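+
+ <para>
+ For example, a failover-enabled subscription could be created as sketched
+ below. The names <literal>sub1</literal> and <literal>pub1</literal> and the
+ connection string are placeholders chosen for illustration only.
+<programlisting>
+CREATE SUBSCRIPTION sub1
+    CONNECTION 'host=primary.example.com dbname=postgres'
+    PUBLICATION pub1
+    WITH (failover = true);
+</programlisting></para>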
709+
710+ <para>
711+ Because the slot synchronization logic copies slots asynchronously, it is
712+ necessary to confirm that replication slots have been synced to the standby
713+ server before the failover happens. To ensure a successful failover, the
714+ standby server must be ahead of the subscriber. This can be achieved by
715+ configuring
716+ <link linkend="guc-standby-slot-names"><varname>standby_slot_names</varname></link>.
717+ </para>
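+
+ <para>
+ As a rough sketch, assuming the standby streams from a physical replication
+ slot with the hypothetical name <literal>sb1_slot</literal>, the primary
+ server's <filename>postgresql.conf</filename> might contain:
+<programlisting>
+standby_slot_names = 'sb1_slot'
+</programlisting></para>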
718+
719+ <para>
720+ To confirm that the standby server is indeed ready for failover, follow these
721+ steps to verify that all necessary logical replication slots have been
722+ synchronized to the standby server:
723+ </para>
724+
725+ <procedure>
726+ <step performance="required">
727+ <para>
728+ On the subscriber node, use the following SQL to identify which slots
729+ should be synced to the standby that we plan to promote. This query will
730+ return the relevant replication slots, including the main slots and table
731+ synchronization slots associated with the failover-enabled subscriptions.
732+ Note that a table sync slot should be synced to the standby server only
733+ if the table copy is finished (see <xref linkend="catalog-pg-subscription-rel"/>).
734+ It is not necessary to ensure that table sync slots are synced in other
735+ scenarios, as they will be either dropped or re-created on the new primary
736+ server in those cases.
737+ <programlisting>
738+ test_sub=# SELECT
739+ array_agg(slot_name) AS slots
740+ FROM
741+ ((
742+ SELECT r.srsubid AS subid, CONCAT('pg_', srsubid, '_sync_', srrelid, '_', ctl.system_identifier) AS slot_name
743+ FROM pg_control_system() ctl, pg_subscription_rel r, pg_subscription s
744+ WHERE r.srsubstate = 'f' AND s.oid = r.srsubid AND s.subfailover
745+ ) UNION (
746+ SELECT s.oid AS subid, s.subslotname AS slot_name
747+ FROM pg_subscription s
748+ WHERE s.subfailover
749+ ))
750+ WHERE slot_name IS NOT NULL;
751+ slots
752+ -------
753+ {sub1,sub2,sub3}
754+ (1 row)
755+ </programlisting></para>
756+ </step>
757+ <step performance="required">
758+ <para>
759+ Check that the logical replication slots identified above exist on
760+ the standby server and are ready for failover.
761+ <programlisting>
762+ test_standby=# SELECT slot_name, (synced AND NOT temporary AND NOT conflicting) AS failover_ready
763+ FROM pg_replication_slots
764+ WHERE slot_name IN ('sub1','sub2','sub3');
765+ slot_name | failover_ready
766+ -------------+----------------
767+ sub1 | t
768+ sub2 | t
769+ sub3 | t
770+ (3 rows)
771+ </programlisting></para>
772+ </step>
773+ </procedure>
774+
775+ <para>
776+ If all the slots are present on the standby server and the result
777+ (<literal>failover_ready</literal>) of the above SQL query is true for each
778+ of them, then existing subscriptions can continue subscribing to publications
779+ on the new primary server without losing data.
780+ </para>
781+
782+ </sect1>
783+
690784 <sect1 id="logical-replication-row-filter">
691785 <title>Row Filters</title>
692786