Commit e57cd7f

Change the logic to decide when to delete old WAL segments, so that it
doesn't take into account how far the WAL senders are. This way a hung
WAL sender doesn't prevent old WAL segments from being recycled/removed
in the primary, ultimately causing the disk to fill up. Instead add
standby_keep_segments setting to control how many old WAL segments are
kept in the primary. This also makes it more reliable to use streaming
replication without WAL archiving, assuming that you set
standby_keep_segments high enough.
1 parent 93f35f0 commit e57cd7f

7 files changed, +174 -33 lines changed

‎doc/src/sgml/config.sgml

Lines changed: 29 additions & 1 deletion
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.262 2010/04/03 07:22:53 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.263 2010/04/12 09:52:29 heikki Exp $ -->
 
 <chapter Id="runtime-config">
   <title>Server Configuration</title>
@@ -1823,6 +1823,34 @@ archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
        </para>
       </listitem>
      </varlistentry>
+
+     <varlistentry id="guc-standby-keep-segments" xreflabel="standby_keep_segments">
+      <term><varname>standby_keep_segments</varname> (<type>integer</type>)</term>
+      <indexterm>
+       <primary><varname>standby_keep_segments</> configuration parameter</primary>
+      </indexterm>
+      <listitem>
+       <para>
+        Specifies the number of log file segments kept in <filename>pg_xlog</>
+        directory, in case a standby server needs to fetch them via streaming
+        replciation. Each segment is normally 16 megabytes. If a standby
+        server connected to the primary falls behind more than
+        <varname>standby_keep_segments</> segments, the primary might remove
+        a WAL segment still needed by the standby and the replication
+        connection will be terminated.
+
+        This sets only the minimum number of segments retained for standby
+        purposes, the system might need to retain more segments for WAL
+        archival or to recover from a checkpoint. If <varname>standby_keep_segments</>
+        is zero (the default), the system doesn't keep any extra segments
+        for standby purposes, and the number of old WAL segments available
+        for standbys is determined based only on the location of the previous
+        checkpoint and status of WAL archival.
+        This parameter can only be set in the <filename>postgresql.conf</>
+        file or on the server command line.
+       </para>
+      </listitem>
+     </varlistentry>
      </variablelist>
     </sect2>
     <sect2 id="runtime-config-standby">
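
The documentation text above says each segment is normally 16 megabytes and that `standby_keep_segments` sets a minimum number of retained segments. As a rough aid for choosing a value, the following sketch converts a setting into the minimum `pg_xlog` space it reserves. The helper name `sketch_standby_keep_bytes` and the macro are hypothetical and not part of this commit; the 16 MB figure is taken from the documentation text and would differ on a build with a non-default segment size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default WAL segment size of 16 MB, as stated in the doc text. */
#define SKETCH_WAL_SEGMENT_BYTES (UINT64_C(16) * 1024 * 1024)

/*
 * Hypothetical helper (not in the commit): minimum number of bytes of
 * pg_xlog that a given standby_keep_segments value reserves.  The server
 * may keep more than this for archiving or checkpoint recovery.
 */
static uint64_t
sketch_standby_keep_bytes(int standby_keep_segments)
{
	/* The GUC is declared with a minimum of 0 in guc.c. */
	assert(standby_keep_segments >= 0);

	return (uint64_t) standby_keep_segments * SKETCH_WAL_SEGMENT_BYTES;
}
```

For instance, calling `sketch_standby_keep_bytes(64)` corresponds to reserving at least 1 GiB for standbys, on top of whatever checkpoints and archiving already require.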

‎doc/src/sgml/high-availability.sgml

Lines changed: 7 additions & 2 deletions
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.58 2010/04/03 07:22:54 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.59 2010/04/12 09:52:29 heikki Exp $ -->
 
 <chapter id="high-availability">
  <title>High Availability, Load Balancing, and Replication</title>
@@ -732,7 +732,12 @@ trigger_file = '/path/to/trigger_file'
     Streaming replication relies on file-based continuous archiving for
     making the base backup and for allowing the standby to catch up if it is
     disconnected from the primary for long enough for the primary to
-    delete old WAL files still required by the standby.
+    delete old WAL files still required by the standby. It is possible
+    to use streaming replication without WAL archiving, but if a standby
+    falls behind too much, the primary will delete old WAL files still
+    needed by the standby, and the standby will have to be manually restored
+    from a base backup. You can control how long the primary retains old WAL
+    segments using the <varname>standby_keep_segments</> setting.
    </para>
 
    <para>

‎src/backend/access/transam/xlog.c

Lines changed: 69 additions & 21 deletions
@@ -7,7 +7,7 @@
  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.391 2010/04/07 10:58:49 heikki Exp $
+ * $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.392 2010/04/12 09:52:29 heikki Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -66,6 +66,7 @@
 
 /* User-settable parameters */
 int			CheckPointSegments = 3;
+int			StandbySegments = 0;
 int			XLOGbuffers = 8;
 int			XLogArchiveTimeout = 0;
 bool		XLogArchiveMode = false;
@@ -356,6 +357,8 @@ typedef struct XLogCtlData
 	uint32		ckptXidEpoch;	/* nextXID & epoch of latest checkpoint */
 	TransactionId ckptXid;
 	XLogRecPtr	asyncCommitLSN; /* LSN of newest async commit */
+	uint32		lastRemovedLog;	/* latest removed/recycled XLOG segment */
+	uint32		lastRemovedSeg;
 
 	/* Protected by WALWriteLock: */
 	XLogCtlWrite Write;
@@ -3149,6 +3152,22 @@ PreallocXlogFiles(XLogRecPtr endptr)
 	}
 }
 
+/*
+ * Get the log/seg of the latest removed or recycled WAL segment.
+ * Returns 0 if no WAL segments have been removed since startup.
+ */
+void
+XLogGetLastRemoved(uint32 *log, uint32 *seg)
+{
+	/* use volatile pointer to prevent code rearrangement */
+	volatile XLogCtlData *xlogctl = XLogCtl;
+
+	SpinLockAcquire(&xlogctl->info_lck);
+	*log = xlogctl->lastRemovedLog;
+	*seg = xlogctl->lastRemovedSeg;
+	SpinLockRelease(&xlogctl->info_lck);
+}
+
 /*
  * Recycle or remove all log files older or equal to passed log/seg#
  *
@@ -3170,6 +3189,20 @@ RemoveOldXlogFiles(uint32 log, uint32 seg, XLogRecPtr endptr)
 	char		newpath[MAXPGPATH];
 #endif
 	struct stat statbuf;
+	/* use volatile pointer to prevent code rearrangement */
+	volatile XLogCtlData *xlogctl = XLogCtl;
+
+	/* Update the last removed location in shared memory first */
+	SpinLockAcquire(&xlogctl->info_lck);
+	if (log > xlogctl->lastRemovedLog ||
+		(log == xlogctl->lastRemovedLog && seg > xlogctl->lastRemovedSeg))
+	{
+		xlogctl->lastRemovedLog = log;
+		xlogctl->lastRemovedSeg = seg;
+	}
+	SpinLockRelease(&xlogctl->info_lck);
+
+	elog(DEBUG1, "removing WAL segments older than %X/%X", log, seg);
 
 	/*
 	 * Initialize info about where to try to recycle to.  We allow recycling
@@ -7172,36 +7205,51 @@ CreateCheckPoint(int flags)
 	smgrpostckpt();
 
 	/*
-	 * If there's connected standby servers doing XLOG streaming, don't delete
-	 * XLOG files that have not been streamed to all of them yet. This does
-	 * nothing to prevent them from being deleted when the standby is
-	 * disconnected (e.g because of network problems), but at least it avoids
-	 * an open replication connection from failing because of that.
+	 * Delete old log files (those no longer needed even for previous
+	 * checkpoint or the standbys in XLOG streaming).
 	 */
-	if ((_logId || _logSeg) && max_wal_senders > 0)
+	if (_logId || _logSeg)
 	{
-		XLogRecPtr	oldest;
-		uint32		log;
-		uint32		seg;
-
-		oldest = GetOldestWALSendPointer();
-		if (oldest.xlogid != 0 || oldest.xrecoff != 0)
+		/*
+		 * Calculate the last segment that we need to retain because of
+		 * standby_keep_segments, by subtracting StandbySegments from the
+		 * new checkpoint location.
+		 */
+		if (StandbySegments > 0)
 		{
-			XLByteToSeg(oldest, log, seg);
+			uint32		log;
+			uint32		seg;
+			int			d_log;
+			int			d_seg;
+
+			XLByteToSeg(recptr, log, seg);
+
+			d_seg = StandbySegments % XLogSegsPerFile;
+			d_log = StandbySegments / XLogSegsPerFile;
+			if (seg < d_seg)
+			{
+				d_log += 1;
+				seg = seg - d_seg + XLogSegsPerFile;
+			}
+			else
+				seg = seg - d_seg;
+			/* avoid underflow, don't go below (0,1) */
+			if (log < d_log || (log == d_log && seg == 0))
+			{
+				log = 0;
+				seg = 1;
+			}
+			else
+				log = log - d_log;
+
+			/* don't delete WAL segments newer than the calculated segment */
 			if (log < _logId || (log == _logId && seg < _logSeg))
 			{
 				_logId = log;
 				_logSeg = seg;
 			}
 		}
-	}
 
-	/*
-	 * Delete old log files (those no longer needed even for previous
-	 * checkpoint or the standbys in XLOG streaming).
-	 */
-	if (_logId || _logSeg)
-	{
 		PrevLogSeg(_logId, _logSeg);
 		RemoveOldXlogFiles(_logId, _logSeg, recptr);
 	}
}

‎src/backend/replication/walsender.c

Lines changed: 55 additions & 7 deletions
@@ -30,7 +30,7 @@
  *
  *
  * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/replication/walsender.c,v 1.14 2010/04/01 00:43:29 rhaas Exp $
+ *	  $PostgreSQL: pgsql/src/backend/replication/walsender.c,v 1.15 2010/04/12 09:52:29 heikki Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -508,6 +508,10 @@ XLogRead(char *buf, XLogRecPtr recptr, Size nbytes)
 {
 	char		path[MAXPGPATH];
 	uint32		startoff;
+	uint32		lastRemovedLog;
+	uint32		lastRemovedSeg;
+	uint32		log;
+	uint32		seg;
 
 	while (nbytes > 0)
 	{
@@ -527,18 +531,35 @@ XLogRead(char *buf, XLogRecPtr recptr, Size nbytes)
 
 			sendFile = BasicOpenFile(path, O_RDONLY | PG_BINARY, 0);
 			if (sendFile < 0)
-				ereport(FATAL,	/* XXX: Why FATAL? */
-						(errcode_for_file_access(),
-						 errmsg("could not open file \"%s\" (log file %u, segment %u): %m",
-								path, sendId, sendSeg)));
+			{
+				/*
+				 * If the file is not found, assume it's because the
+				 * standby asked for a too old WAL segment that has already
+				 * been removed or recycled.
+				 */
+				if (errno == ENOENT)
+				{
+					char		filename[MAXFNAMELEN];
+					XLogFileName(filename, ThisTimeLineID, sendId, sendSeg);
+					ereport(ERROR,
+							(errcode_for_file_access(),
+							 errmsg("requested WAL segment %s has already been removed",
+									filename)));
+				}
+				else
+					ereport(ERROR,
+							(errcode_for_file_access(),
+							 errmsg("could not open file \"%s\" (log file %u, segment %u): %m",
+									path, sendId, sendSeg)));
+			}
 			sendOff = 0;
 		}
 
 		/* Need to seek in the file? */
 		if (sendOff != startoff)
 		{
 			if (lseek(sendFile, (off_t) startoff, SEEK_SET) < 0)
-				ereport(FATAL,
+				ereport(ERROR,
 						(errcode_for_file_access(),
 						 errmsg("could not seek in log file %u, segment %u to offset %u: %m",
 								sendId, sendSeg, startoff)));
@@ -553,7 +574,7 @@ XLogRead(char *buf, XLogRecPtr recptr, Size nbytes)
 
 		readbytes = read(sendFile, buf, segbytes);
 		if (readbytes <= 0)
-			ereport(FATAL,
+			ereport(ERROR,
 					(errcode_for_file_access(),
 			errmsg("could not read from log file %u, segment %u, offset %u, "
 				   "length %lu: %m",
@@ -566,6 +587,26 @@ XLogRead(char *buf, XLogRecPtr recptr, Size nbytes)
 		nbytes -= readbytes;
 		buf += readbytes;
 	}
+
+	/*
+	 * After reading into the buffer, check that what we read was valid.
+	 * We do this after reading, because even though the segment was present
+	 * when we opened it, it might get recycled or removed while we read it.
+	 * The read() succeeds in that case, but the data we tried to read might
+	 * already have been overwritten with new WAL records.
+	 */
+	XLogGetLastRemoved(&lastRemovedLog, &lastRemovedSeg);
+	XLByteToPrevSeg(recptr, log, seg);
+	if (log < lastRemovedLog ||
+		(log == lastRemovedLog && seg <= lastRemovedSeg))
+	{
+		char		filename[MAXFNAMELEN];
+		XLogFileName(filename, ThisTimeLineID, log, seg);
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("requested WAL segment %s has already been removed",
+						filename)));
+	}
 }
 
 /*
@@ -801,6 +842,12 @@ WalSndShmemInit(void)
 	}
 }
 
+/*
+ * This isn't currently used for anything. Monitoring tools might be
+ * interested in the future, and we'll need something like this in the
+ * future for synchronous replication.
+ */
+#ifdef NOT_USED
 /*
  * Returns the oldest Send position among walsenders. Or InvalidXLogRecPtr
  * if none.
@@ -834,3 +881,4 @@ GetOldestWALSendPointer(void)
 	}
 	return oldest;
 }
+#endif
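
The post-read check that `XLogRead` gains in this file comes down to one ordered comparison on (log, seg) pairs: the data just read is treated as invalid if its segment is at or before the last removed one. The sketch below isolates that comparison. The function name `sketch_segment_was_removed` is hypothetical. The precondition that callers never ask about segment (0, 0) is an assumption, based on the "don't go below (0,1)" clamp in xlog.c and on `XLogGetLastRemoved` reporting 0 when nothing has been removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the commit): returns true if the segment
 * (log, seg) is at or before the last removed/recycled segment, using the
 * same ordering as the check added to XLogRead().
 *
 * Assumption: (0, 0) is never a real segment, so a last-removed value of
 * (0, 0) means "nothing removed yet" and matches no valid request.
 */
static bool
sketch_segment_was_removed(uint32_t log, uint32_t seg,
						   uint32_t last_removed_log,
						   uint32_t last_removed_seg)
{
	assert(!(log == 0 && seg == 0));

	return log < last_removed_log ||
		(log == last_removed_log && seg <= last_removed_seg);
}
```

The check has to run after `read()` rather than before it because a recycled segment file can still be opened and read successfully, in which case the buffer would silently hold newer WAL than the standby asked for.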

‎src/backend/utils/misc/guc.c

Lines changed: 10 additions & 1 deletion
@@ -10,7 +10,7 @@
  * Written by Peter Eisentraut <peter_e@gmx.net>.
  *
  * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.546 2010/04/01 00:43:29 rhaas Exp $
+ *	  $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.547 2010/04/12 09:52:29 heikki Exp $
  *
  *--------------------------------------------------------------------
  */
@@ -1647,6 +1647,15 @@ static struct config_int ConfigureNamesInt[] =
 		0, 0, 60, NULL, NULL
 	},
 
+	{
+		{"standby_keep_segments", PGC_SIGHUP, WAL_CHECKPOINTS,
+			gettext_noop("Sets the number of WAL files held for standby servers"),
+			NULL
+		},
+		&StandbySegments,
+		0, 0, INT_MAX, NULL, NULL
+	},
+
 	{
 		{"checkpoint_segments", PGC_SIGHUP, WAL_CHECKPOINTS,
 			gettext_noop("Sets the maximum distance in log segments between automatic WAL checkpoints."),

‎src/backend/utils/misc/postgresql.conf.sample

Lines changed: 1 addition & 0 deletions
@@ -193,6 +193,7 @@
 
 #max_wal_senders = 0		# max number of walsender processes
 #wal_sender_delay = 200ms	# 1-10000 milliseconds
+#standby_keep_segments = 0	# in logfile segments, 16MB each; 0 disables
 
 
 #------------------------------------------------------------------------------

‎src/include/access/xlog.h

Lines changed: 3 additions & 1 deletion
@@ -6,7 +6,7 @@
  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $PostgreSQL: pgsql/src/include/access/xlog.h,v 1.105 2010/04/01 00:43:29 rhaas Exp $
+ * $PostgreSQL: pgsql/src/include/access/xlog.h,v 1.106 2010/04/12 09:52:29 heikki Exp $
  */
 #ifndef XLOG_H
 #define XLOG_H
@@ -187,6 +187,7 @@ extern XLogRecPtr XactLastRecEnd;
 
 /* these variables are GUC parameters related to XLOG */
 extern int	CheckPointSegments;
+extern int	StandbySegments;
 extern int	XLOGbuffers;
 extern bool XLogArchiveMode;
 extern char *XLogArchiveCommand;
@@ -267,6 +268,7 @@ extern int XLogFileInit(uint32 log, uint32 seg,
 extern int	XLogFileOpen(uint32 log, uint32 seg);
 
 
+extern void XLogGetLastRemoved(uint32 *log, uint32 *seg);
 extern void XLogSetAsyncCommitLSN(XLogRecPtr record);
 
 extern void RestoreBkpBlocks(XLogRecPtr lsn, XLogRecord *record, bool cleanup);
