1- <!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.17 2007/11/04 19:23:24 momjian Exp $ -->
1+ <!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.18 2007/11/08 19:16:30 momjian Exp $ -->
22
33<chapter id="high-availability">
44 <title>High Availability, Load Balancing, and Replication</title>
9292 </para>
9393
9494 <para>
95- Shared hardware functionality is common in network storage
96- devices. Using a network file system is also possible, though
97- care must be taken that the file system has full POSIX behavior.
98- One significant limitation of this method is that if the shared
99- disk array fails or becomes corrupt, the primary and standby
100- servers are both nonfunctional. Another issue is that the
101- standby server should never access the shared storage while
95+ Shared hardware functionality is common in network storage devices.
96+ Using a network file system is also possible, though care must be
97+ taken that the file system has full POSIX behavior (see <xref
98+ linkend="creating-cluster-nfs">). One significant limitation of this
99+ method is that if the shared disk array fails or becomes corrupt, the
100+ primary and standby servers are both nonfunctional. Another issue is
101+ that the standby server should never access the shared storage while
102102 the primary server is running.
103103 </para>
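+
+   <para>
+    As an illustration only (the export and mount point names here are
+    hypothetical, and the appropriate options vary by NFS
+    implementation), a Linux client might mount such an export with
+    hard, synchronous, uncached semantics:
+   </para>
+
+<programlisting>
+mount -t nfs -o hard,sync,noac nfs-server:/pgdata /usr/local/pgsql/data
+</programlisting>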
104104
105+ </listitem>
106+ </varlistentry>
107+
108+ <varlistentry>
109+ <term>File System Replication</term>
110+ <listitem>
111+
105112 <para>
106113 A modified version of shared hardware functionality is file system
107114 replication, where all changes to a file system are mirrored to a file
@@ -125,7 +132,7 @@ protocol to make nodes agree on a serializable transactional order.
125132 </varlistentry>
126133
127134 <varlistentry>
128- <term>Warm Standby Using Point-In-Time Recovery</term>
135+    <term>Warm Standby Using Point-In-Time Recovery (<acronym>PITR</>)</term>
129136 <listitem>
130137
131138 <para>
@@ -190,6 +197,21 @@ protocol to make nodes agree on a serializable transactional order.
190197 </listitem>
191198 </varlistentry>
192199
200+ <varlistentry>
201+ <term>Asynchronous Multi-Master Replication</term>
202+ <listitem>
203+
204+ <para>
205+ For servers that are not regularly connected, like laptops or
206+ remote servers, keeping data consistent among servers is a
207+     challenge. With asynchronous multi-master replication, each
208+     server works independently and periodically communicates with
209+     the other servers to identify conflicting transactions. The
210+     conflicts can be resolved by users or by conflict resolution rules.
211+ </para>
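+
+   <para>
+    As an illustrative sketch only (the host and table names are
+    hypothetical, and the detection and resolution mechanisms depend
+    entirely on the replication software in use), a conflict arises
+    when disconnected servers change the same row differently:
+   </para>
+
+<programlisting>
+# while the servers are out of contact, the same row diverges
+psql -h server-a -c "UPDATE parts SET qty = 10 WHERE part_id = 7;"
+psql -h server-b -c "UPDATE parts SET qty = 12 WHERE part_id = 7;"
+# on the next synchronization pass the replication software finds two
+# irreconcilable versions of the row and must apply a resolution rule
+# (e.g., newest change wins) or queue the conflict for a user to settle
+</programlisting>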
212+ </listitem>
213+ </varlistentry>
214+
193215 <varlistentry>
194216 <term>Synchronous Multi-Master Replication</term>
195217 <listitem>
@@ -222,21 +244,6 @@ protocol to make nodes agree on a serializable transactional order.
222244 </listitem>
223245 </varlistentry>
224246
225- <varlistentry>
226- <term>Asynchronous Multi-Master Replication</term>
227- <listitem>
228-
229- <para>
230- For servers that are not regularly connected, like laptops or
231- remote servers, keeping data consistent among servers is a
232- challenge. Using asynchronous multi-master replication, each
233- server works independently, and periodically communicates with
234- the other servers to identify conflicting transactions. The
235- conflicts can be resolved by users or conflict resolution rules.
236- </para>
237- </listitem>
238- </varlistentry>
239-
240247 <varlistentry>
241248 <term>Data Partitioning</term>
242249 <listitem>
@@ -253,23 +260,6 @@ protocol to make nodes agree on a serializable transactional order.
253260 </listitem>
254261 </varlistentry>
255262
256- <varlistentry>
257- <term>Multi-Server Parallel Query Execution</term>
258- <listitem>
259-
260- <para>
261- Many of the above solutions allow multiple servers to handle
262- multiple queries, but none allow a single query to use multiple
263- servers to complete faster. This solution allows multiple
264- servers to work concurrently on a single query. This is usually
265- accomplished by splitting the data among servers and having
266- each server execute its part of the query and return results
267- to a central server where they are combined and returned to
268- the user. Pgpool-II has this capability.
269- </para>
270- </listitem>
271- </varlistentry>
272-
273263 <varlistentry>
274264 <term>Commercial Solutions</term>
275265 <listitem>
@@ -285,4 +275,139 @@ protocol to make nodes agree on a serializable transactional order.
285275
286276 </variablelist>
287277
278+ <para>
279+ The table below (<xref linkend="high-availability-matrix">) summarizes
280+ the capabilities of the various solutions listed above.
281+ </para>
282+
283+ <table id="high-availability-matrix">
284+ <title>High Availability, Load Balancing, and Replication Feature Matrix</title>
285+ <tgroup cols="9">
286+ <thead>
287+ <row>
288+ <entry>Feature</entry>
289+ <entry>Shared Disk Failover</entry>
290+ <entry>File System Replication</entry>
291+ <entry>Warm Standby Using PITR</entry>
292+ <entry>Master-Slave Replication</entry>
293+ <entry>Statement-Based Replication Middleware</entry>
294+ <entry>Asynchronous Multi-Master Replication</entry>
295+ <entry>Synchronous Multi-Master Replication</entry>
296+ <entry>Data Partitioning</entry>
297+ </row>
298+ </thead>
299+
300+ <tbody>
301+
302+ <row>
303+ <entry>No special hardware required</entry>
304+ <entry align="center"></entry>
305+ <entry align="center">•</entry>
306+ <entry align="center">•</entry>
307+ <entry align="center">•</entry>
308+ <entry align="center">•</entry>
309+ <entry align="center">•</entry>
310+ <entry align="center">•</entry>
311+ <entry align="center">•</entry>
312+ </row>
313+
314+ <row>
315+ <entry>Allows multiple master servers</entry>
316+ <entry align="center"></entry>
317+ <entry align="center"></entry>
318+ <entry align="center"></entry>
319+ <entry align="center"></entry>
320+ <entry align="center">•</entry>
321+ <entry align="center">•</entry>
322+ <entry align="center">•</entry>
323+ <entry align="center"></entry>
324+ </row>
325+
326+ <row>
327+ <entry>No master server overhead</entry>
328+ <entry align="center">•</entry>
329+ <entry align="center"></entry>
330+ <entry align="center">•</entry>
331+ <entry align="center"></entry>
332+ <entry align="center"></entry>
333+ <entry align="center"></entry>
334+ <entry align="center"></entry>
335+ <entry align="center"></entry>
336+ </row>
337+
338+ <row>
339+ <entry>Master server never locks others</entry>
340+ <entry align="center">•</entry>
341+ <entry align="center">•</entry>
342+ <entry align="center">•</entry>
343+ <entry align="center">•</entry>
344+ <entry align="center">•</entry>
345+ <entry align="center">•</entry>
346+ <entry align="center"></entry>
347+ <entry align="center">•</entry>
348+ </row>
349+
350+ <row>
351+ <entry>Master failure will never lose data</entry>
352+ <entry align="center">•</entry>
353+ <entry align="center">•</entry>
354+ <entry align="center"></entry>
355+ <entry align="center"></entry>
356+ <entry align="center">•</entry>
357+ <entry align="center"></entry>
358+ <entry align="center">•</entry>
359+ <entry align="center"></entry>
360+ </row>
361+
362+ <row>
363+ <entry>Slaves accept read-only queries</entry>
364+ <entry align="center"></entry>
365+ <entry align="center"></entry>
366+ <entry align="center"></entry>
367+ <entry align="center">•</entry>
368+ <entry align="center">•</entry>
369+ <entry align="center">•</entry>
370+ <entry align="center">•</entry>
371+ <entry align="center">•</entry>
372+ </row>
373+
374+ <row>
375+ <entry>Per-table granularity</entry>
376+ <entry align="center"></entry>
377+ <entry align="center"></entry>
378+ <entry align="center"></entry>
379+ <entry align="center">•</entry>
380+ <entry align="center"></entry>
381+ <entry align="center">•</entry>
382+ <entry align="center">•</entry>
383+ <entry align="center">•</entry>
384+ </row>
385+
386+ <row>
387+ <entry>No conflict resolution necessary</entry>
388+ <entry align="center">•</entry>
389+ <entry align="center">•</entry>
390+ <entry align="center">•</entry>
391+ <entry align="center">•</entry>
392+ <entry align="center"></entry>
393+ <entry align="center"></entry>
394+ <entry align="center">•</entry>
395+ <entry align="center">•</entry>
396+ </row>
397+
398+ </tbody>
399+ </tgroup>
400+ </table>
401+
402+ <para>
403+ Many of the above solutions allow multiple servers to handle multiple
404+ queries, but none allow a single query to use multiple servers to
405+ complete faster. Multi-server parallel query execution allows multiple
406+ servers to work concurrently on a single query. This is usually
407+ accomplished by splitting the data among servers and having each server
408+ execute its part of the query and return results to a central server
409+ where they are combined and returned to the user. Pgpool-II has this
410+ capability.
411+ </para>
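+
+  <para>
+   As a sketch of this split-and-combine approach (the node names and
+   partitioned tables are hypothetical; middleware such as Pgpool-II
+   performs the splitting and merging automatically):
+  </para>
+
+<programlisting>
+# each server computes a partial aggregate over its slice of the data
+psql -h node1 -c "SELECT sum(amount) FROM sales_part_1;"
+psql -h node2 -c "SELECT sum(amount) FROM sales_part_2;"
+# the central server adds the two partial sums and returns one total
+</programlisting>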
412+
288413</chapter>