
Commit e444329

Author: Liudmila Mantrova (committed)
DOC: major node and referee documentation for multimaster
1 parent d23f5b8, commit e444329

File tree: 1 file changed, +157 -21 lines changed

doc/src/sgml/multimaster.sgml
Lines changed: 157 additions & 21 deletions
@@ -143,7 +143,7 @@
 </listitem>
 </itemizedlist>
 <para>If you have any data that must be present on one of the nodes only, you can exclude a particular table from replication, as follows:
-<programlisting><function>mtm.make_table_local</function>('table_name') </programlisting>
+<programlisting>SELECT mtm.make_table_local('table_name') </programlisting>
 </para>
 </sect2>
 
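The corrected listing above calls the function through a regular SELECT. As a minimal usage sketch (the table name below is hypothetical, chosen only for illustration), excluding a node-local table from replication would look like:

SELECT mtm.make_table_local('local_audit_log');  -- hypothetical table kept on one node only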
@@ -266,6 +266,12 @@
 of 2<replaceable>N</replaceable>+1 nodes can tolerate <replaceable>N</replaceable> node failures and stay alive if any
 <replaceable>N</replaceable>+1 nodes are alive and connected to each other.
 </para>
+<tip>
+<para>
+For clusters with an even number of nodes, you can override this
+behavior. For details, see <xref linkend="multimaster-quorum-settings">.
+</para>
+</tip>
 <para>
 In case of a partial network split when different nodes have
 different connectivity, <filename>multimaster</filename> finds a
@@ -274,18 +280,11 @@
 C, but node B cannot access node C, <filename>multimaster</filename>
 isolates node C to ensure data consistency on nodes A and B.
 </para>
-<note>
-<para>
-If you try to access a disconnected node, <filename>multimaster</filename> returns an error
-message indicating the current status of the node. To prevent stale reads, read-only queries are also forbidden.
-Additionally, you can break connections between the disconnected node and the clients using the
-<link linkend="mtm-break-connection"><varname>multimaster.break_connection</varname></link> variable.
-</para>
-</note>
 <para>
-If required, you can override this behavior for one of the nodes using the
-<link linkend="mtm-major-node"><varname>multimaster.major_node</varname></link> variable.
-In this case, the node will continue working even if it is isolated.
+If you try to access a disconnected node, <filename>multimaster</filename> returns an error
+message indicating the current status of the node. To prevent stale reads, read-only queries are also forbidden.
+Additionally, you can break connections between the disconnected node and the clients using the
+<link linkend="mtm-break-connection"><varname>multimaster.break_connection</varname></link> variable.
 </para>
 <para>
 Each node maintains a data structure that keeps the information about the state of all
@@ -339,7 +338,7 @@
 <para>
 To use <filename>multimaster</filename>, you need to install
 <productname>&productname;</productname> on all nodes of your cluster. <productname>&productname;</productname> includes all the required dependencies and
-extensions.
+extensions.
 </para>
 <sect3 id="multimaster-setting-up-a-multi-master-cluster">
 <title>Setting up a Multi-Master Cluster</title>
@@ -606,6 +605,133 @@ SELECT mtm.get_cluster_state();
 <para><link linkend="multimaster-guc-variables">GUC Variables</link></para>
 </sect4>
 </sect3>
+<sect3 id="multimaster-quorum-settings">
+<title>Defining Quorum Settings for Clusters with an Even Number of Nodes</title>
+<para>
+By default, <filename>multimaster</filename> uses a majority-based
+algorithm to determine whether the cluster nodes have a quorum: a cluster
+can only continue working if the majority of its nodes are alive and can
+access each other. For clusters with an even number of nodes, this
+approach is not optimal. For example, if a network failure splits the
+cluster into equal parts, or one of the nodes fails in a two-node
+cluster, all the nodes stop accepting queries, even though at least
+half of the cluster nodes are running normally.
+</para>
+<para>
+To enable a smooth failover for such cases, you can modify the
+<filename>multimaster</filename> majority-based behavior using one
+of the following options:
+<itemizedlist spacing="compact">
+<listitem>
+<para>
+<link linkend="setting-up-a-referee">Set up a standalone <firstterm>referee</> node</link>
+to assign the quorum status to a subset of nodes that constitutes half of the cluster.
+</para>
+</listitem>
+<listitem>
+<para>
+<link linkend="configuring-the-major-node">Choose the <firstterm>major node</></link>
+that continues working regardless of the status of other nodes.
+Use this option in two-node cluster configurations only.
+</para>
+</listitem>
+</itemizedlist>
+<important>
+<para>
+To avoid split-brain problems, do not use the major node together
+with a referee in the same cluster.
+</para>
+</important>
+</para>
+<sect4 id="setting-up-a-referee">
+<title>Setting up a Standalone Referee Node</title>
+<para>
+A <firstterm>referee</> is a voting node used to determine which subset
+of nodes has a quorum if the cluster is split into equal parts. The
+referee node does not store any cluster data, so it is not
+resource-intensive and can be configured on virtually any system with
+<productname>&productname;</productname> installed.
+</para>
+<para>
+To set up a referee for your cluster:
+<orderedlist>
+<listitem>
+<para>
+Install <productname>&productname;</productname> on the node you are
+going to make a referee and create the <filename>referee</filename>
+extension:
+<programlisting>
+CREATE EXTENSION referee;
+</programlisting>
+</para>
+</listitem>
+<listitem>
+<para>
+Make sure the <filename>pg_hba.conf</filename> file allows
+access to the referee node.
+</para>
+</listitem>
+<listitem>
+<para>
+On all your cluster nodes, specify the referee connection string
+in the <filename>postgresql.conf</> file:
+<programlisting>
+multimaster.referee_connstring = <replaceable>connstring</>
+</programlisting>
+where <replaceable>connstring</> holds <link linkend="libpq-paramkeywords">libpq options</link>
+required to access the referee.
+</para>
+</listitem>
+</orderedlist>
+</para>
+<para>
+The first subset of nodes that gets connected to the referee wins the voting
+and continues working. The referee keeps the voting result until all the
+other cluster nodes get online again. Then the result is discarded, and
+a new winner can be chosen in case of another network failure.
+</para>
+<para>
+To avoid split-brain problems, you can only have a single referee
+in your cluster. Do not set up a referee if you have already
+<link linkend="configuring-the-major-node">configured the major node</link>.
+</para>
+</sect4>
+<sect4 id="configuring-the-major-node">
+<title>Configuring the Major Node</title>
+<para>
+If you configure one of the nodes to be the major one, this node
+will continue accepting queries even if it is isolated by a
+network failure, or other nodes get broken. This setting is useful
+in a two-node cluster configuration, or to quickly restore a
+single node in a broken cluster.
+</para>
+<important>
+<para>
+If your cluster has more than two nodes, promoting one of the
+nodes to the major status can lead to split-brain problems
+in case of network failures, and reduce the number of possible
+failover options. Consider
+<link linkend="setting-up-a-referee">setting up a standalone referee</link>
+instead.
+</para>
+</important>
+<para>
+To make one of the nodes major, enable the
+<literal>multimaster.major_node</literal> parameter on this node:
+<programlisting>
+ALTER SYSTEM SET multimaster.major_node TO on;
+SELECT pg_reload_conf();
+</programlisting>
+</para>
+<para>
+Do not set the <varname>major_node</varname> parameter on more
+than one cluster node. When enabled on several nodes, it can
+cause the split-brain problem. If you have already set up a
+referee for your cluster, the <varname>major_node</varname>
+option is forbidden.
+</para>
+</sect4>
+</sect3>
 </sect2>
 <sect2 id="multimaster-administration"><title>Multi-Master Cluster Administration</title>
 <itemizedlist>
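The section added by this hunk boils down to a few statements. The sketch below restates them in SQL under stated assumptions: the referee connection string values are placeholders, and the documentation itself puts multimaster.referee_connstring into postgresql.conf, so ALTER SYSTEM is shown here only as an equivalent way to persist the setting; a configuration reload or a server restart may be needed for it to take effect.

-- Option 1: standalone referee.
-- On the referee node itself:
CREATE EXTENSION referee;
-- On every cluster node (host/port/dbname are placeholder values):
ALTER SYSTEM SET multimaster.referee_connstring = 'host=referee-host port=5432 dbname=postgres';
SELECT pg_reload_conf();

-- Option 2: major node (two-node clusters only; never combine with a referee).
-- On exactly one node:
ALTER SYSTEM SET multimaster.major_node TO on;
SELECT pg_reload_conf();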
@@ -795,7 +921,7 @@ SELECT mtm.stop_node(3);
 set to <literal>true</literal>:
 </para>
 <programlisting>
-SELECT mtm.stop_node(3,drop_slottrue);
+SELECT mtm.stop_node(3, true);
 </programlisting>
 <para>
 This disables replication slots for node 3 on all cluster nodes and stops replication to
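In the corrected call above, the second argument is the drop_slot flag discussed in the surrounding text. A brief sketch of the workflow, using mtm.get_cluster_state() (shown in a hunk header earlier in this diff) to inspect the cluster before and after:

SELECT mtm.get_cluster_state();  -- check the current cluster status
SELECT mtm.stop_node(3, true);   -- stop node 3 and drop its replication slots
SELECT mtm.get_cluster_state();  -- confirm the cluster state after excluding the node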
@@ -959,19 +1085,29 @@ pg_ctl -D <replaceable>datadir</replaceable> -l <replaceable>pg.log</replaceable>
 </indexterm>
 </term>
 <listitem>
-<para>Node with this flag continues working even if it cannot access the majority of other nodes.
-This is needed to break the symmetry if there is an even number of alive nodes in the cluster.
-For example, in a cluster of three nodes, if one of the nodes has crashed and
-the connection between the remaining nodes is lost, the node with <varname>multimaster.major_node</varname> = <literal>true</literal> will continue working.
+<para>The node with this flag continues working even if it cannot access the majority of other nodes.
+This may be required to break the symmetry in two-node clusters.
 </para>
 <important>
-<para>This parameter should be used with caution. Only one node in the cluster
-can have this parameter set to <literal>true</literal>. When set to <literal>true</literal> on several
-nodes, this parameter can cause the split-brain problem.
+<para>This parameter should be used with caution. This parameter can cause the
+split-brain problem if you use it on clusters with more than two nodes, or set
+it to <literal>true</literal> on more than one node.
+Only one node in the cluster can be the major node.
 </para>
 </important>
 </listitem>
 </varlistentry>
+<varlistentry id="mtm-referee-connstring" xreflabel="multimaster.referee_connstring">
+<term><varname>multimaster.referee_connstring</varname>
+<indexterm><primary><varname>multimaster.referee_connstring</varname></primary>
+</indexterm>
+</term>
+<listitem>
+<para>Connection string to access the referee node. You must set this parameter
+on all cluster nodes if the referee is set up.
+</para>
+</listitem>
+</varlistentry>
 <varlistentry>
 <term><varname>multimaster.max_workers</varname>
 <indexterm><primary><varname>multimaster.max_workers</varname></primary>
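The hunk above documents two quorum-related GUCs, multimaster.major_node and multimaster.referee_connstring. As a small sketch, either can be inspected on a node with a standard SHOW command (the commented output is illustrative):

SHOW multimaster.major_node;          -- on / off
SHOW multimaster.referee_connstring;  -- the libpq connection string, if a referee is configured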
