Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit06dbd61

Browse files
committed
pgstat: Prevent stats reset from corrupting slotname by removing slotname
Previously PgStat_StatReplSlotEntry contained the slotname, which was mainlyused when writing out the stats during shutdown, to identify the slot in theserialized data (at runtime the index in ReplicationSlotCtl->replication_slotsis used, but that can change during a restart). Unfortunately the slotname wasoverwritten when the slot's stats were reset.That turned out to only cause "real" problems if the slot was active duringthe reset, triggering an assertion failure at the nextpgstat_report_replslot(). In other paths the stats were re-initialized duringpgstat_acquire_replslot().Fix this by removing slotname from PgStat_StatReplSlotEntry. Instead we canget the slot's name from the slot itself. Besides fixing a bug, this also isarchitecturally cleaner (a name is not really statistics). This is safebecause stats, for a slot removed while shut down, will not be restored atstartup.In 15 the slotname is not removed, but renamed, to avoid changing the statsformat. In master, bump PGSTAT_FILE_FORMAT_ID.This commit does not contain a test for the fix. I think this can only betested by a tap test starting pg_recvlogical in the background and checkingpg_recvlogical's output. That type of test is notoriously hard to be reliable,so committing it shortly before the release is wrapped seems like a bad idea.Reported-by: Jaime Casanova <jcasanov@systemguards.com.ec>Author: Andres Freund <andres@anarazel.de>Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>Discussion:https://postgr.es/m/YxfagaTXUNa9ggLb@ahch-toBackpatch: 15-, where the bug was introduced in5891c7a
1 parente4c61be commit06dbd61

File tree

6 files changed

+60
-35
lines changed

6 files changed

+60
-35
lines changed

‎src/backend/replication/slot.c

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -412,6 +412,34 @@ ReplicationSlotIndex(ReplicationSlot *slot)
412412
returnslot-ReplicationSlotCtl->replication_slots;
413413
}
414414

415+
/*
416+
* If the slot at 'index' is unused, return false. Otherwise 'name' is set to
417+
* the slot's name and true is returned.
418+
*
419+
* This likely is only useful for pgstat_replslot.c during shutdown, in other
420+
* cases there are obvious TOCTOU issues.
421+
*/
422+
bool
423+
ReplicationSlotName(intindex,Namename)
424+
{
425+
ReplicationSlot*slot;
426+
boolfound;
427+
428+
slot=&ReplicationSlotCtl->replication_slots[index];
429+
430+
/*
431+
* Ensure that the slot cannot be dropped while we copy the name. Don't
432+
* need the spinlock as the name of an existing slot cannot change.
433+
*/
434+
LWLockAcquire(ReplicationSlotControlLock,LW_SHARED);
435+
found=slot->in_use;
436+
if (slot->in_use)
437+
namestrcpy(name,NameStr(slot->data.name));
438+
LWLockRelease(ReplicationSlotControlLock);
439+
440+
returnfound;
441+
}
442+
415443
/*
416444
* Find a previously created slot and mark it as used by this process.
417445
*

‎src/backend/utils/activity/pgstat.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1367,7 +1367,7 @@ pgstat_write_statsfile(void)
13671367
/* stats entry identified by name on disk (e.g. slots) */
13681368
NameDataname;
13691369

1370-
kind_info->to_serialized_name(shstats,&name);
1370+
kind_info->to_serialized_name(&ps->key,shstats,&name);
13711371

13721372
fputc('N',fpout);
13731373
write_chunk_s(fpout,&ps->key.kind);

‎src/backend/utils/activity/pgstat_replslot.c

Lines changed: 26 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -69,6 +69,10 @@ pgstat_reset_replslot(const char *name)
6969

7070
/*
7171
* Report replication slot statistics.
72+
*
73+
* We can rely on the stats for the slot to exist and to belong to this
74+
* slot. We can only get here if pgstat_create_replslot() or
75+
* pgstat_acquire_replslot() have already been called.
7276
*/
7377
void
7478
pgstat_report_replslot(ReplicationSlot*slot,constPgStat_StatReplSlotEntry*repSlotStat)
@@ -82,12 +86,6 @@ pgstat_report_replslot(ReplicationSlot *slot, const PgStat_StatReplSlotEntry *re
8286
shstatent= (PgStatShared_ReplSlot*)entry_ref->shared_stats;
8387
statent=&shstatent->stats;
8488

85-
/*
86-
* Any mismatch should have been fixed in pgstat_create_replslot() or
87-
* pgstat_acquire_replslot().
88-
*/
89-
Assert(namestrcmp(&statent->slotname,NameStr(slot->data.name))==0);
90-
9189
/* Update the replication slot statistics */
9290
#defineREPLSLOT_ACC(fld) statent->fld += repSlotStat->fld
9391
REPLSLOT_ACC(spill_txns);
@@ -124,38 +122,29 @@ pgstat_create_replslot(ReplicationSlot *slot)
124122
* if we previously crashed after dropping a slot.
125123
*/
126124
memset(&shstatent->stats,0,sizeof(shstatent->stats));
127-
namestrcpy(&shstatent->stats.slotname,NameStr(slot->data.name));
128125

129126
pgstat_unlock_entry(entry_ref);
130127
}
131128

132129
/*
133130
* Report replication slot has been acquired.
131+
*
132+
* This guarantees that a stats entry exists during later
133+
* pgstat_report_replslot() calls.
134+
*
135+
* If we previously crashed, no stats data exists. But if we did not crash,
136+
* the stats do belong to this slot:
137+
* - the stats cannot belong to a dropped slot, pgstat_drop_replslot() would
138+
* have been called
139+
* - if the slot was removed while shut down,
140+
* pgstat_replslot_from_serialized_name_cb() returning false would have
141+
* caused the stats to be dropped
134142
*/
135143
void
136144
pgstat_acquire_replslot(ReplicationSlot*slot)
137145
{
138-
PgStat_EntryRef*entry_ref;
139-
PgStatShared_ReplSlot*shstatent;
140-
PgStat_StatReplSlotEntry*statent;
141-
142-
entry_ref=pgstat_get_entry_ref_locked(PGSTAT_KIND_REPLSLOT,InvalidOid,
143-
ReplicationSlotIndex(slot), false);
144-
shstatent= (PgStatShared_ReplSlot*)entry_ref->shared_stats;
145-
statent=&shstatent->stats;
146-
147-
/*
148-
* NB: need to accept that there might be stats from an older slot, e.g.
149-
* if we previously crashed after dropping a slot.
150-
*/
151-
if (NameStr(statent->slotname)[0]==0||
152-
namestrcmp(&statent->slotname,NameStr(slot->data.name))!=0)
153-
{
154-
memset(statent,0,sizeof(*statent));
155-
namestrcpy(&statent->slotname,NameStr(slot->data.name));
156-
}
157-
158-
pgstat_unlock_entry(entry_ref);
146+
pgstat_get_entry_ref(PGSTAT_KIND_REPLSLOT,InvalidOid,
147+
ReplicationSlotIndex(slot), true,NULL);
159148
}
160149

161150
/*
@@ -185,9 +174,16 @@ pgstat_fetch_replslot(NameData slotname)
185174
}
186175

187176
void
188-
pgstat_replslot_to_serialized_name_cb(constPgStatShared_Common*header,NameData*name)
177+
pgstat_replslot_to_serialized_name_cb(constPgStat_HashKey*key,constPgStatShared_Common*header,NameData*name)
189178
{
190-
namestrcpy(name,NameStr(((PgStatShared_ReplSlot*)header)->stats.slotname));
179+
/*
180+
* This is only called late during shutdown. The set of existing slots
181+
* isn't allowed to change at this point, we can assume that a slot exists
182+
* at the offset.
183+
*/
184+
if (!ReplicationSlotName(key->objoid,name))
185+
elog(ERROR,"could not find name for replication slot index %u",
186+
key->objoid);
191187
}
192188

193189
bool

‎src/include/pgstat.h

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -242,7 +242,7 @@ typedef struct PgStat_TableXactStatus
242242
* ------------------------------------------------------------
243243
*/
244244

245-
#definePGSTAT_FILE_FORMAT_ID0x01A5BCA7
245+
#definePGSTAT_FILE_FORMAT_ID0x01A5BCA8
246246

247247
typedefstructPgStat_ArchiverStats
248248
{
@@ -321,7 +321,6 @@ typedef struct PgStat_StatFuncEntry
321321

322322
typedefstructPgStat_StatReplSlotEntry
323323
{
324-
NameDataslotname;
325324
PgStat_Counterspill_txns;
326325
PgStat_Counterspill_count;
327326
PgStat_Counterspill_bytes;

‎src/include/replication/slot.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -218,6 +218,7 @@ extern void ReplicationSlotsDropDBSlots(Oid dboid);
218218
externboolInvalidateObsoleteReplicationSlots(XLogSegNooldestSegno);
219219
externReplicationSlot*SearchNamedReplicationSlot(constchar*name,boolneed_lock);
220220
externintReplicationSlotIndex(ReplicationSlot*slot);
221+
externboolReplicationSlotName(intindex,Namename);
221222
externvoidReplicationSlotNameForTablesync(Oidsuboid,Oidrelid,char*syncslotname,Sizeszslot);
222223
externvoidReplicationSlotDropAtPubNode(WalReceiverConn*wrconn,char*slotname,boolmissing_ok);
223224

‎src/include/utils/pgstat_internal.h

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -242,7 +242,8 @@ typedef struct PgStat_KindInfo
242242
/*
243243
* For variable-numbered stats with named_on_disk. Optional.
244244
*/
245-
void(*to_serialized_name) (constPgStatShared_Common*header,NameData*name);
245+
void(*to_serialized_name) (constPgStat_HashKey*key,
246+
constPgStatShared_Common*header,NameData*name);
246247
bool(*from_serialized_name) (constNameData*name,PgStat_HashKey*key);
247248

248249
/*
@@ -567,7 +568,7 @@ extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
567568
*/
568569

569570
externvoidpgstat_replslot_reset_timestamp_cb(PgStatShared_Common*header,TimestampTzts);
570-
externvoidpgstat_replslot_to_serialized_name_cb(constPgStatShared_Common*header,NameData*name);
571+
externvoidpgstat_replslot_to_serialized_name_cb(constPgStat_HashKey*key,constPgStatShared_Common*header,NameData*name);
571572
externboolpgstat_replslot_from_serialized_name_cb(constNameData*name,PgStat_HashKey*key);
572573

573574

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp