@@ -205,7 +205,7 @@ multimaster.conn_strings = 'dbname=mydb user=myuser host=node1, dbname=mydb user
205205 </listitem>
206206 <listitem>
207207 <para>
208- Allow replication in <literal >pg_hba.conf</literal >:
208+ Allow replication in <filename >pg_hba.conf</filename >:
209209 </para>
210210 <programlisting>
211211host myuser all node1 trust
@@ -517,7 +517,7 @@ pg_ctl -D ./datadir -l ./pg.log start
517517 </listitem>
518518 <listitem>
519519 <para>
520- Make sure the <literal >pg_hba.conf</literal > file allows
520+ Make sure the <filename >pg_hba.conf</filename > file allows
521521 replication to the new node.
522522 <programlisting>host replication all node3 trust</programlisting>
523523 </para>
@@ -866,78 +866,131 @@ pg_ctl -D ./datadir -l ./pg.log start
866866 <itemizedlist>
867867 <listitem>
868868 <para>
869- <literal>id</literal>, <type>integer</type>
869+ <parameter>id</parameter>, <type>integer</type>
870+ </para>
871+ <para>Node ID.
872+ </para>
873+ </listitem>
874+ <listitem>
875+ <para>
876+ <parameter>enabled</parameter>, <type>boolean</type>
877+ </para>
878+ <para>Shows whether the node is enabled, that is, not excluded from the cluster. The node can only be disabled if responses to heartbeats are not received within the <varname>heartbeat_recv_timeout</> time interval. When the node starts responding to heartbeats, <filename>multimaster</filename> can automatically restore the node and switch it back to the enabled state.
879+ Automatic recovery is only possible if the replication slot is still active. Otherwise, you can <link linkend="multimaster-restoring-a-node-manually">restore the node manually</link>.</para>
880+ </listitem>
881+ <listitem>
882+ <para>
883+ <parameter>connected</parameter>, <type>boolean</type>
884+ </para>
885+ <para>
886+ Shows whether the node is connected to the WAL sender.
870887 </para>
871888 </listitem>
872889 <listitem>
873890 <para>
874- <literal>disabled</literal>, <type>boolean</type>
891+ <parameter>slot_active</parameter>, <type>boolean</type>
892+ </para>
893+ <para>Shows whether the node has an active replication slot. For a disabled node, the slot remains active until the <varname>max_recovery_lag</varname> value is reached.
875894 </para>
876895 </listitem>
877896 <listitem>
878897 <para>
879- <literal>disconnected</literal>, <type>boolean</type>
898+ <parameter>stopped</parameter>, <type>boolean</type>
899+ </para>
900+ <para>Shows whether replication to this node was stopped by the <function>mtm.stop_node()</function> function. A stopped node acts as a disabled one, but cannot be automatically recovered. Call <function>mtm.recover_node()</function> to re-enable such a node.
880901 </para>
881902 </listitem>
882903 <listitem>
883904 <para>
884- <literal>catchUp</literal>, <type>bool</type>
905+ <parameter>catchUp</parameter>, <type>boolean</type>
906+ </para>
907+ <para>During the node recovery, shows whether the data is recovered up to the <varname>min_recovery_lag</varname> value.
885908 </para>
886909 </listitem>
887910 <listitem>
888911 <para>
889- <literal>slotLag</literal>, <type>bigint</type>
912+ <parameter>slotLag</parameter>, <type>bigint</type>
913+ </para>
913+ <para>The size of WAL data that the replication slot holds for a disabled or stopped node. The slot is dropped when <parameter>slotLag</parameter> reaches the <varname>max_recovery_lag</varname> value.
890915 </para>
891916 </listitem>
892917 <listitem>
893918 <para>
894- <literal>avgTransDelay</literal>, <type>bigint</type>
919+ <parameter>avgTransDelay</parameter>, <type>bigint</type>
920+ </para>
920+ <para>The average commit delay caused by this node, in microseconds.
895922 </para>
896923 </listitem>
897924 <listitem>
898925 <para>
899- <literal >lastStatusChange</literal >, <type>timestamp</type>
926+ <parameter >lastStatusChange</parameter >, <type>timestamp</type>
900927 </para>
928+ <para>The last time the node changed its status (enabled or disabled).</para>
901929 </listitem>
902930 <listitem>
903931 <para>
904- <literal >oldestSnapshot</literal >, <type>bigint</type>
932+ <parameter >oldestSnapshot</parameter >, <type>bigint</type>
905933 </para>
934+ <para>The oldest global snapshot existing on this node.</para>
906935 </listitem>
907936 <listitem>
908937 <para>
909- <literal >SenderPid</literal> <type>integer</type>
938+ <parameter >SenderPid</parameter>, <type>integer</type>
910939 </para>
940+ <para>Process ID of the WAL sender.</para>
911941 </listitem>
912942 <listitem>
913943 <para>
914- <literal >SenderStartTime</literal> <type>timestamp</type>
944+ <parameter >SenderStartTime</parameter>, <type>timestamp</type>
915945 </para>
946+ <para>WAL sender start time.</para>
916947 </listitem>
917948 <listitem>
918949 <para>
919- <literal >ReceiverPid</literal> <type>integer</type>
950+ <parameter >ReceiverPid</parameter>, <type>integer</type>
920951 </para>
952+ <para>Process ID of the WAL receiver.</para>
921953 </listitem>
922954 <listitem>
923955 <para>
924- <literal >ReceiverStartTime</literal> <type>timestamp</type>
956+ <parameter >ReceiverStartTime</parameter>, <type>timestamp</type>
925957 </para>
958+ <para>WAL receiver start time.</para>
926959 </listitem>
927960 <listitem>
928961 <para>
929- <literal >connStr</literal> <type>text</type>
962+ <parameter >connStr</parameter>, <type>text</type>
930963 </para>
964+ <para>Connection string to this node.</para>
931965 </listitem>
932966 <listitem>
933967 <para>
934- <literal >connectivityMask</literal> <type>bigint</type>
968+ <parameter >connectivityMask</parameter>, <type>bigint</type>
935969 </para>
970+ <para>Bitmask representing connectivity to neighbor nodes. Each bit represents the connection to one node.</para>
971+ </listitem>
972+ <listitem>
973+ <para><parameter>nHeartbeats</parameter>, <type>integer</type></para>
974+ <para>The number of heartbeat responses received from this node.</para>
936975 </listitem>
937976 </itemizedlist>
938977 </para>
939978 </listitem>
940979 </varlistentry>
980+
981+ <varlistentry>
982+ <term>
983+ <function>mtm.collect_cluster_state()</function>
984+ <indexterm>
985+ <primary><function>mtm.collect_cluster_state</></primary>
986+ </indexterm>
987+ </term>
988+ <listitem>
989+ <para>Collects the data returned by the <function>mtm.get_cluster_state()</function> function from all available nodes. For this function to work, in addition to replication connections, <filename>pg_hba.conf</filename> must allow ordinary connections to the node with the specified connection string.
990+ </para>
991+ </listitem>
992+ </varlistentry>
993+
941994 <varlistentry>
942995 <term>
943996 <function>mtm.get_cluster_state()</function>
@@ -946,87 +999,124 @@ pg_ctl -D ./datadir -l ./pg.log start
946999 </indexterm>
9471000 </term>
9481001 <listitem>
949- <para>Shows the status of thewhole cluster . Returns a tuple of the following values:
1002+ <para>Shows the status of the <filename>multimaster</filename> extension. Returns a tuple of the following values:
9501003 </para>
9511004 <itemizedlist>
9521005 <listitem>
9531006 <para>
954- <literal>status</literal>, <type>text</type>
1007+ <parameter>status</parameter>, <type>text</type>
1008+ </para>
1009+ <para>Node status. Possible values are: <literal>Initialization</literal>, <literal>Offline</literal>, <literal>Connected</literal>, <literal>Online</literal>, <literal>Recovery</literal>, <literal>Recovered</literal>, <literal>InMinor</literal>, <literal>OutOfService</literal>.</para>
1010+ </listitem>
1011+ <listitem>
1012+ <para>
1013+ <parameter>disabledNodeMask</parameter>, <type>bigint</type>
9551014 </para>
1015+ <para>Bitmask of disabled nodes.</para>
9561016 </listitem>
9571017 <listitem>
9581018 <para>
959- <literal>disabledNodeMask</literal >, <type>bigint</type>
1019+ <parameter>disconnectedNodeMask</parameter >, <type>bigint</type>
9601020 </para>
1021+ <para>Bitmask of disconnected nodes.</para>
9611022 </listitem>
9621023 <listitem>
9631024 <para>
964- <literal>disconnectedNodeMask</literal >, <type>bigint</type>
1025+ <parameter>catchUpNodeMask</parameter >, <type>bigint</type>
9651026 </para>
1027+ <para>Bitmask of nodes that completed the recovery.</para>
9661028 </listitem>
9671029 <listitem>
9681030 <para>
969- <literal>catchUpNodeMask</literal >, <type>bigint </type>
1031+ <parameter>liveNodes</parameter >, <type>integer </type>
9701032 </para>
1033+ <para>Number of enabled nodes.</para>
9711034 </listitem>
9721035 <listitem>
9731036 <para>
974- <literal>liveNodes</literal >, <type>integer</type>
1037+ <parameter>allNodes</parameter >, <type>integer</type>
9751038 </para>
1039+ <para>Number of nodes in the cluster. The majority of live nodes is calculated based on this value.</para>
9761040 </listitem>
9771041 <listitem>
9781042 <para>
979- <literal>allNodes</literal >, <type>integer</type>
1043+ <parameter>nActiveQueries</parameter >, <type>integer</type>
9801044 </para>
1045+ <para>Number of queries currently being processed on this node.</para>
9811046 </listitem>
9821047 <listitem>
9831048 <para>
984- <literal>nActiveQueries</literal >, <type>integer</type>
1049+ <parameter>nPendingQueries</parameter >, <type>integer</type>
9851050 </para>
1051+ <para>Number of queries waiting for execution on this node.</para>
9861052 </listitem>
9871053 <listitem>
9881054 <para>
989- <literal>nPendingQueries</literal >, <type>integer </type>
1055+ <parameter>queueSize</parameter >, <type>bigint </type>
9901056 </para>
1057+ <para>Size of the pending query queue, in bytes.</para>
9911058 </listitem>
9921059 <listitem>
9931060 <para>
994- <literal>queueSize</literal >, <type>bigint</type>
1061+ <parameter>transCount</parameter >, <type>bigint</type>
9951062 </para>
1063+ <para>The total number of replicated transactions processed by this node.</para>
9961064 </listitem>
9971065 <listitem>
9981066 <para>
999- <literal>transCount</literal >, <type>bigint</type>
1067+ <parameter>timeShift</parameter >, <type>bigint</type>
10001068 </para>
1069+ <para>Global snapshot shift caused by unsynchronized clocks on nodes, in microseconds.</para>
10011070 </listitem>
10021071 <listitem>
10031072 <para>
1004- <literal>timeShift</literal >, <type>bigint </type>
1073+ <parameter>recoverySlot</parameter >, <type>integer </type>
10051074 </para>
1075+ <para>The node from which a failed node gets data updates during automatic recovery.</para>
10061076 </listitem>
10071077 <listitem>
10081078 <para>
1009- <literal>recoverySlot</literal >, <type>integer </type>
1079+ <parameter>xidHashSize</parameter >, <type>bigint </type>
10101080 </para>
1081+ <para>Size of the xid2state hash.</para>
10111082 </listitem>
10121083 <listitem>
10131084 <para>
1014- <literal>xidHashSize</literal >, <type>bigint</type>
1085+ <parameter>gidHashSize</parameter >, <type>bigint</type>
10151086 </para>
1087+ <para>Size of the gid2state hash.</para>
10161088 </listitem>
10171089 <listitem>
10181090 <para>
1019- <literal>gidHashSize</literal >, <type>bigint</type>
1091+ <parameter>oldestXid</parameter >, <type>bigint</type>
10201092 </para>
1093+ <para>The oldest transaction ID on this node.</para>
10211094 </listitem>
10221095 <listitem>
10231096 <para>
1024- <literal>oldestXid</literal >, <type>bigint </type>
1097+ <parameter>configChanges</parameter >, <type>integer </type>
10251098 </para>
1099+ <para>Number of state changes (enabled/disabled) since the last reboot.</para>
10261100 </listitem>
10271101 <listitem>
10281102 <para>
1029- <literal>configChanges</literal>, <type>integer</type>
1103+ <parameter>stalledNodeMask</parameter>, <type>bigint</type>
1104+ </para>
1105+ <para>Bitmask of nodes for which replication slots were dropped.
1106+ </para>
1107+ </listitem>
1108+ <listitem>
1109+ <para>
1110+ <parameter>stoppedNodeMask</parameter>, <type>bigint</type>
1111+ </para>
1112+ <para>Bitmask of nodes that were stopped by <function>mtm.stop_node()</function>.
1113+ </para>
1114+ </listitem>
1115+ <listitem>
1116+ <para>
1117+ <parameter>lastStatusChange</parameter>, <type>timestamp</type>
1118+ </para>
1119+ <para>Timestamp of the last state change.
10301120 </para>
10311121 </listitem>
10321122 </itemizedlist>
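The bitmask fields listed above (such as disabledNodeMask, disconnectedNodeMask, and stoppedNodeMask) can be decoded into node IDs on the client side. The following is a minimal sketch in Python, not part of the extension. It assumes that bit (i - 1) of a mask corresponds to node ID i, which this section does not state, and the helper name nodes_from_mask is hypothetical.

```python
from __future__ import annotations


def nodes_from_mask(mask: int, n_nodes: int) -> list[int]:
    """Return the 1-based node IDs whose bit is set in ``mask``.

    Assumption (not confirmed by the documentation above): bit (i - 1)
    of the mask corresponds to node ID i. ``n_nodes`` plays the role of
    the allNodes value; bits beyond it are ignored.
    """
    if mask < 0 or n_nodes < 0:
        raise ValueError("mask and n_nodes must be non-negative")
    return [
        node_id
        for node_id in range(1, n_nodes + 1)
        if mask & (1 << (node_id - 1))
    ]


if __name__ == "__main__":
    # With three nodes, a mask of binary 101 marks nodes 1 and 3.
    print(nodes_from_mask(0b101, 3))
```

Under the same assumption, a mask of 0 means that no node is flagged, so the helper returns an empty list.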
@@ -1153,7 +1243,7 @@ pg_ctl -D ./datadir -l ./pg.log start
11531243 <para>
11541244 The <filename>multimaster</filename> extension currently passes 162
11551245 of 166 <productname>PostgreSQL</productname> regression tests. We are working right now on
1156- proving full compatibility with the standard <productname>PostgreSQL</productname>.
1246+ providing full compatibility with the standard <productname>PostgreSQL</productname>.
11571247 </para>
11581248 </sect2>
11591249 <sect2 id="multimaster-authors">