Commit c655077

Allow users to limit storage reserved by replication slots
Replication slots are useful to retain data that may be needed by a
replication system. But experience has shown that allowing them to
retain excessive data can lead to the primary failing because of running
out of space. This new feature allows the user to configure a maximum
amount of space to be reserved using the new option
max_slot_wal_keep_size. Slots that overrun that space are invalidated
at checkpoint time, enabling the storage to be released.

Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20170228.122736.123383594.horiguchi.kyotaro@lab.ntt.co.jp

1 parent b63c293, commit c655077

17 files changed: +595 -43 lines changed

‎doc/src/sgml/catalogs.sgml

Lines changed: 38 additions & 0 deletions
@@ -9907,6 +9907,44 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
       </entry>
      </row>
 
+     <row>
+      <entry><structfield>wal_status</structfield></entry>
+      <entry><type>text</type></entry>
+      <entry></entry>
+
+      <entry>Availability of WAL files claimed by this slot.
+       Possible values are:
+       <simplelist>
+        <member>
+         <literal>normal</literal> means that the claimed files
+         are within <varname>max_wal_size</varname>
+        </member>
+        <member>
+         <literal>reserved</literal> means that <varname>max_wal_size</varname>
+         is exceeded but the files are still held, either by some replication
+         slot or by <varname>wal_keep_segments</varname>
+        </member>
+        <member>
+         <literal>lost</literal> means that some WAL files are definitely lost
+         and this slot cannot be used to resume replication anymore.
+        </member>
+       </simplelist>
+       The last two states are seen only when
+       <xref linkend="guc-max-slot-wal-keep-size"/> is
+       non-negative. If <structfield>restart_lsn</structfield> is NULL, this
+       field is null.
+      </entry>
+     </row>
+
+     <row>
+      <entry><structfield>min_safe_lsn</structfield></entry>
+      <entry><type>pg_lsn</type></entry>
+      <entry></entry>
+      <entry>
+       The minimum LSN currently available for walsenders.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>

‎doc/src/sgml/config.sgml

Lines changed: 23 additions & 0 deletions
@@ -3777,6 +3777,29 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
       </listitem>
      </varlistentry>
 
+      <varlistentry id="guc-max-slot-wal-keep-size" xreflabel="max_slot_wal_keep_size">
+       <term><varname>max_slot_wal_keep_size</varname> (<type>integer</type>)
+       <indexterm>
+        <primary><varname>max_slot_wal_keep_size</varname> configuration parameter</primary>
+       </indexterm>
+       </term>
+       <listitem>
+       <para>
+        Specify the maximum size of WAL files
+        that <link linkend="streaming-replication-slots">replication
+        slots</link> are allowed to retain in the <filename>pg_wal</filename>
+        directory at checkpoint time.
+        If <varname>max_slot_wal_keep_size</varname> is -1 (the default),
+        replication slots retain unlimited amount of WAL files. If
+        restart_lsn of a replication slot gets behind more than that megabytes
+        from the current LSN, the standby using the slot may no longer be able
+        to continue replication due to removal of required WAL files. You
+        can see the WAL availability of replication slots
+        in <link linkend="view-pg-replication-slots">pg_replication_slots</link>.
+       </para>
+       </listitem>
+      </varlistentry>
+
      <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
       <term><varname>wal_sender_timeout</varname> (<type>integer</type>)
       <indexterm>

‎doc/src/sgml/high-availability.sgml

Lines changed: 5 additions & 3 deletions
@@ -925,9 +925,11 @@ primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
     <xref linkend="guc-archive-command"/>.
     However, these methods often result in retaining more WAL segments than
     required, whereas replication slots retain only the number of segments
-    known to be needed.  An advantage of these methods is that they bound
-    the space requirement for <literal>pg_wal</literal>; there is currently no way
-    to do this using replication slots.
+    known to be needed.  On the other hand, replication slots can retain so
+    many WAL segments that they fill up the space allocated
+    for <literal>pg_wal</literal>;
+    <xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
+    retained by replication slots.
    </para>
    <para>
     Similarly, <xref linkend="guc-hot-standby-feedback"/>

‎src/backend/access/transam/xlog.c

Lines changed: 122 additions & 23 deletions
@@ -108,6 +108,7 @@ int wal_level = WAL_LEVEL_MINIMAL;
 int         CommitDelay = 0;    /* precommit delay in microseconds */
 int         CommitSiblings = 5; /* # concurrent xacts needed to sleep */
 int         wal_retrieve_retry_interval = 5000;
+int         max_slot_wal_keep_size_mb = -1;
 
 #ifdef WAL_DEBUG
 bool        XLOG_DEBUG = false;
@@ -759,7 +760,7 @@ static ControlFileData *ControlFile = NULL;
  */
 #define UsableBytesInPage (XLOG_BLCKSZ - SizeOfXLogShortPHD)
 
-/* Convert min_wal_size_mb and max_wal_size_mb to equivalent segment count */
+/* Convert values of GUCs measured in megabytes to equiv. segment count */
 #define ConvertToXSegs(x, segsize) \
     (x / ((segsize) / (1024 * 1024)))

@@ -3963,9 +3964,10 @@ XLogGetLastRemovedSegno(void)
     return lastRemovedSegNo;
 }
 
+
 /*
- * Update the last removed segno pointer in shared memory, to reflect
- * that the given XLOG file has been removed.
+ * Update the last removed segno pointer in shared memory, to reflect that the
+ * given XLOG file has been removed.
  */
 static void
 UpdateLastRemovedPtr(char *filename)
@@ -9043,6 +9045,7 @@ CreateCheckPoint(int flags)
      */
     XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
     KeepLogSeg(recptr, &_logSegNo);
+    InvalidateObsoleteReplicationSlots(_logSegNo);
     _logSegNo--;
     RemoveOldXlogFiles(_logSegNo, RedoRecPtr, recptr);

@@ -9377,6 +9380,7 @@ CreateRestartPoint(int flags)
     replayPtr = GetXLogReplayRecPtr(&replayTLI);
     endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
     KeepLogSeg(endptr, &_logSegNo);
+    InvalidateObsoleteReplicationSlots(_logSegNo);
     _logSegNo--;
 
     /*
@@ -9445,48 +9449,143 @@ CreateRestartPoint(int flags)
     return true;
 }
 
+/*
+ * Report availability of WAL for the given target LSN
+ *      (typically a slot's restart_lsn)
+ *
+ * Returns one of the following enum values:
+ * * WALAVAIL_NORMAL means targetLSN is available because it is in the range
+ *   of max_wal_size.
+ *
+ * * WALAVAIL_PRESERVED means it is still available by preserving extra
+ *   segments beyond max_wal_size. If max_slot_wal_keep_size is smaller
+ *   than max_wal_size, this state is not returned.
+ *
+ * * WALAVAIL_REMOVED means it is definitely lost. A replication stream on
+ *   a slot with this LSN cannot continue.
+ *
+ * * WALAVAIL_INVALID_LSN means the slot hasn't been set to reserve WAL.
+ */
+WALAvailability
+GetWALAvailability(XLogRecPtr targetLSN)
+{
+    XLogRecPtr  currpos;        /* current write LSN */
+    XLogSegNo   currSeg;        /* segid of currpos */
+    XLogSegNo   targetSeg;      /* segid of targetLSN */
+    XLogSegNo   oldestSeg;      /* actual oldest segid */
+    XLogSegNo   oldestSegMaxWalSize;    /* oldest segid kept by max_wal_size */
+    XLogSegNo   oldestSlotSeg = InvalidXLogRecPtr;  /* oldest segid kept by
+                                                     * slot */
+    uint64      keepSegs;
+
+    /* slot does not reserve WAL. Either deactivated, or has never been active */
+    if (XLogRecPtrIsInvalid(targetLSN))
+        return WALAVAIL_INVALID_LSN;
+
+    currpos = GetXLogWriteRecPtr();
+
+    /* calculate oldest segment currently needed by slots */
+    XLByteToSeg(targetLSN, targetSeg, wal_segment_size);
+    KeepLogSeg(currpos, &oldestSlotSeg);
+
+    /*
+     * Find the oldest extant segment file. We get 1 until checkpoint removes
+     * the first WAL segment file since startup, which causes the status being
+     * wrong under certain abnormal conditions but that doesn't actually harm.
+     */
+    oldestSeg = XLogGetLastRemovedSegno() + 1;
+
+    /* calculate oldest segment by max_wal_size and wal_keep_segments */
+    XLByteToSeg(currpos, currSeg, wal_segment_size);
+    keepSegs = ConvertToXSegs(Max(max_wal_size_mb, wal_keep_segments),
+                              wal_segment_size) + 1;
+
+    if (currSeg > keepSegs)
+        oldestSegMaxWalSize = currSeg - keepSegs;
+    else
+        oldestSegMaxWalSize = 1;
+
+    /*
+     * If max_slot_wal_keep_size has changed after the last call, the segment
+     * that would been kept by the current setting might have been lost by the
+     * previous setting.  No point in showing normal or keeping status values
+     * if the targetSeg is known to be lost.
+     */
+    if (targetSeg >= oldestSeg)
+    {
+        /*
+         * show "normal" when targetSeg is within max_wal_size, even if
+         * max_slot_wal_keep_size is smaller than max_wal_size.
+         */
+        if ((max_slot_wal_keep_size_mb <= 0 ||
+             max_slot_wal_keep_size_mb >= max_wal_size_mb) &&
+            oldestSegMaxWalSize <= targetSeg)
+            return WALAVAIL_NORMAL;
+
+        /* being retained by slots */
+        if (oldestSlotSeg <= targetSeg)
+            return WALAVAIL_RESERVED;
+    }
+
+    /* Definitely lost */
+    return WALAVAIL_REMOVED;
+}
+
+
 /*
  * Retreat *logSegNo to the last segment that we need to retain because of
  * either wal_keep_segments or replication slots.
  *
  * This is calculated by subtracting wal_keep_segments from the given xlog
  * location, recptr and by making sure that that result is below the
- * requirement of replication slots.
+ * requirement of replication slots.  For the latter criterion we do consider
+ * the effects of max_slot_wal_keep_size: reserve at most that much space back
+ * from recptr.
  */
 static void
 KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
 {
+    XLogSegNo   currSegNo;
     XLogSegNo   segno;
     XLogRecPtr  keep;
 
-    XLByteToSeg(recptr, segno, wal_segment_size);
-    keep = XLogGetReplicationSlotMinimumLSN();
+    XLByteToSeg(recptr, currSegNo, wal_segment_size);
+    segno = currSegNo;
 
-    /* compute limit for wal_keep_segments first */
-    if (wal_keep_segments > 0)
+    /*
+     * Calculate how many segments are kept by slots first, adjusting for
+     * max_slot_wal_keep_size.
+     */
+    keep = XLogGetReplicationSlotMinimumLSN();
+    if (keep != InvalidXLogRecPtr)
     {
-        /* avoid underflow, don't go below 1 */
-        if (segno <= wal_keep_segments)
-            segno = 1;
-        else
-            segno = segno - wal_keep_segments;
-    }
+        XLByteToSeg(keep, segno, wal_segment_size);
 
-    /* then check whether slots limit removal further */
-    if (max_replication_slots > 0 && keep != InvalidXLogRecPtr)
-    {
-        XLogSegNo   slotSegNo;
+        /* Cap by max_slot_wal_keep_size ... */
+        if (max_slot_wal_keep_size_mb >= 0)
+        {
+            XLogRecPtr  slot_keep_segs;
 
-        XLByteToSeg(keep, slotSegNo, wal_segment_size);
+            slot_keep_segs =
+                ConvertToXSegs(max_slot_wal_keep_size_mb, wal_segment_size);
 
-        if (slotSegNo <= 0)
+            if (currSegNo - segno > slot_keep_segs)
+                segno = currSegNo - slot_keep_segs;
+        }
+    }
+
+    /* but, keep at least wal_keep_segments if that's set */
+    if (wal_keep_segments > 0 && currSegNo - segno < wal_keep_segments)
+    {
+        /* avoid underflow, don't go below 1 */
+        if (currSegNo <= wal_keep_segments)
             segno = 1;
-        else if (slotSegNo < segno)
-            segno = slotSegNo;
+        else
+            segno = currSegNo - wal_keep_segments;
     }
 
     /* don't delete WAL segments newer than the calculated segment */
-    if (segno < *logSegNo)
+    if (XLogRecPtrIsInvalid(*logSegNo) || segno < *logSegNo)
         *logSegNo = segno;
 }

‎src/backend/catalog/system_views.sql

Lines changed: 3 additions & 1 deletion
@@ -876,7 +876,9 @@ CREATE VIEW pg_replication_slots AS
             L.xmin,
             L.catalog_xmin,
             L.restart_lsn,
-            L.confirmed_flush_lsn
+            L.confirmed_flush_lsn,
+            L.wal_status,
+            L.min_safe_lsn
     FROM pg_get_replication_slots() AS L
             LEFT JOIN pg_database D ON (L.datoid = D.oid);

‎src/backend/replication/logical/logicalfuncs.c

Lines changed: 1 addition & 1 deletion
@@ -225,7 +225,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
     else
         end_of_wal = GetXLogReplayRecPtr(&ThisTimeLineID);
 
-    ReplicationSlotAcquire(NameStr(*name), true);
+    (void) ReplicationSlotAcquire(NameStr(*name), SAB_Error);
 
     PG_TRY();
     {
