Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit2e3bd06

Browse files
committed
Fix deadlock in GIN vacuum introduced by218f515
Before218f515 if posting tree page is about to be deleted, then the wholeposting tree is locked by LockBufferForCleanup() on root preventing all theconcurrent inserts.218f515 reduced locking to the subtree containingpage to be deleted. However, due to concurrent parent split, inserter doesn'talways holds pins on all the pages constituting path from root to the targetleaf page. That could cause a deadlock between GIN vacuum process and GINinserter. And we didn't find non-invasive way to fix this.This commit reverts VACUUM behavior to lock the whole posting tree beforedelete any page. However, we keep another useful change by218f515: thetree is locked only if there are pages to be deleted.Reported-by: Chen HuajunDiagnosed-by: Chen Huajun, Andrey Borodin, Peter GeogheganDiscussion:https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.comAuthor: Alexander Korotkov, based on ideas from Andrey Borodin and Peter GeogheganReviewed-by: Andrey BorodinBackpatch-through: 10
1 parent8a7f24b commit2e3bd06

File tree

2 files changed

+73
-90
lines changed

2 files changed

+73
-90
lines changed

‎src/backend/access/gin/README

Lines changed: 4 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -314,17 +314,10 @@ deleted.
314314
The previous paragraph's reasoning only applies to searches, and only to
315315
posting trees. To protect from inserters following a downlink to a deleted
316316
page, vacuum simply locks out all concurrent insertions to the posting tree,
317-
by holding a super-exclusive lock on the parent page of subtree with deletable
318-
pages. Inserters hold a pin on the root page, but searches do not, so while
319-
new searches cannot begin while root page is locked, any already-in-progress
320-
scans can continue concurrently with vacuum in corresponding subtree of
321-
posting tree. To exclude interference with readers vacuum takes exclusive
322-
locks in a depth-first scan in left-to-right order of page tuples. Leftmost
323-
page is never deleted. Thus before deleting any page we obtain exclusive
324-
lock on any left page, effectively excluding deadlock with any reader, despite
325-
taking parent lock before current and left lock after current. We take left
326-
lock not for a concurrency reasons, but rather in need to mark page dirty.
327-
In the entry tree, we never delete pages.
317+
by holding a super-exclusive lock on the posting tree root. Inserters hold a
318+
pin on the root page, but searches do not, so while new searches cannot begin
319+
while root page is locked, any already-in-progress scans can continue
320+
concurrently with vacuum. In the entry tree, we never delete pages.
328321

329322
(This is quite different from the mechanism the btree indexam uses to make
330323
page-deletions safe; it stamps the deleted pages with an XID and keeps the

‎src/backend/access/gin/ginvacuum.c

Lines changed: 69 additions & 79 deletions
Original file line numberDiff line numberDiff line change
@@ -308,124 +308,114 @@ ginScanToDelete(GinVacuumState *gvs, BlockNumber blkno, bool isRoot,
308308

309309

310310
/*
311-
* Scan through posting tree, delete empty tuples from leaf pages.
312-
* Also, this function collects empty subtrees (with all empty leafs).
313-
* For parents of these subtrees CleanUp lock is taken, then we call
314-
* ScanToDelete. This is done for every inner page, which points to
315-
* empty subtree.
311+
* Scan through posting tree leafs, delete empty tuples. Returns true if there
312+
* is at least one empty page.
316313
*/
317314
staticbool
318-
ginVacuumPostingTreeLeaves(GinVacuumState*gvs,BlockNumberblkno,boolisRoot)
315+
ginVacuumPostingTreeLeaves(GinVacuumState*gvs,BlockNumberblkno)
319316
{
320317
Bufferbuffer;
321318
Pagepage;
322319
boolhasVoidPage= FALSE;
323320
MemoryContextoldCxt;
324321

325-
buffer=ReadBufferExtended(gvs->index,MAIN_FORKNUM,blkno,
326-
RBM_NORMAL,gvs->strategy);
327-
page=BufferGetPage(buffer);
322+
/* Find leftmost leaf page of posting tree and lock it in exclusive mode */
323+
while (true)
324+
{
325+
PostingItem*pitem;
328326

329-
ginTraverseLock(buffer, false);
327+
buffer=ReadBufferExtended(gvs->index,MAIN_FORKNUM,blkno,
328+
RBM_NORMAL,gvs->strategy);
329+
LockBuffer(buffer,GIN_SHARE);
330+
page=BufferGetPage(buffer);
330331

331-
Assert(GinPageIsData(page));
332+
Assert(GinPageIsData(page));
332333

333-
if (GinPageIsLeaf(page))
334+
if (GinPageIsLeaf(page))
335+
{
336+
LockBuffer(buffer,GIN_UNLOCK);
337+
LockBuffer(buffer,GIN_EXCLUSIVE);
338+
break;
339+
}
340+
341+
Assert(PageGetMaxOffsetNumber(page) >=FirstOffsetNumber);
342+
343+
pitem=GinDataPageGetPostingItem(page,FirstOffsetNumber);
344+
blkno=PostingItemGetBlockNumber(pitem);
345+
Assert(blkno!=InvalidBlockNumber);
346+
347+
UnlockReleaseBuffer(buffer);
348+
}
349+
350+
/* Iterate all posting tree leaves using rightlinks and vacuum them */
351+
while (true)
334352
{
335353
oldCxt=MemoryContextSwitchTo(gvs->tmpCxt);
336354
ginVacuumPostingTreeLeaf(gvs->index,buffer,gvs);
337355
MemoryContextSwitchTo(oldCxt);
338356
MemoryContextReset(gvs->tmpCxt);
339357

340-
/* if root is a leaf page, we don't desire further processing */
341358
if (GinDataLeafPageIsEmpty(page))
342359
hasVoidPage= TRUE;
343360

344-
UnlockReleaseBuffer(buffer);
345-
346-
returnhasVoidPage;
347-
}
348-
else
349-
{
350-
OffsetNumberi;
351-
boolhasEmptyChild= FALSE;
352-
boolhasNonEmptyChild= FALSE;
353-
OffsetNumbermaxoff=GinPageGetOpaque(page)->maxoff;
354-
BlockNumber*children=palloc(sizeof(BlockNumber)* (maxoff+1));
355-
356-
/*
357-
* Read all children BlockNumbers. Not sure it is safe if there are
358-
* many concurrent vacuums.
359-
*/
360-
361-
for (i=FirstOffsetNumber;i <=maxoff;i++)
362-
{
363-
PostingItem*pitem=GinDataPageGetPostingItem(page,i);
364-
365-
children[i]=PostingItemGetBlockNumber(pitem);
366-
}
361+
blkno=GinPageGetOpaque(page)->rightlink;
367362

368363
UnlockReleaseBuffer(buffer);
369364

370-
for (i=FirstOffsetNumber;i <=maxoff;i++)
371-
{
372-
if (ginVacuumPostingTreeLeaves(gvs,children[i], FALSE))
373-
hasEmptyChild= TRUE;
374-
else
375-
hasNonEmptyChild= TRUE;
376-
}
365+
if (blkno==InvalidBlockNumber)
366+
break;
377367

378-
pfree(children);
368+
buffer=ReadBufferExtended(gvs->index,MAIN_FORKNUM,blkno,
369+
RBM_NORMAL,gvs->strategy);
370+
LockBuffer(buffer,GIN_EXCLUSIVE);
371+
page=BufferGetPage(buffer);
372+
}
379373

380-
vacuum_delay_point();
374+
returnhasVoidPage;
375+
}
381376

377+
staticvoid
378+
ginVacuumPostingTree(GinVacuumState*gvs,BlockNumberrootBlkno)
379+
{
380+
if (ginVacuumPostingTreeLeaves(gvs,rootBlkno))
381+
{
382382
/*
383-
*All subtreeis empty- just return TRUEtoindicate that parent
384-
*must do a cleanup. Unless we are ROOT an there is way to go upper.
383+
*Thereisat least oneemptypage. So we havetorescan the tree
384+
*deleting empty pages.
385385
*/
386+
Bufferbuffer;
387+
DataPageDeleteStackroot,
388+
*ptr,
389+
*tmp;
386390

387-
if (hasEmptyChild&& !hasNonEmptyChild&& !isRoot)
388-
return TRUE;
391+
buffer=ReadBufferExtended(gvs->index,MAIN_FORKNUM,rootBlkno,
392+
RBM_NORMAL,gvs->strategy);
389393

390-
if (hasEmptyChild)
391-
{
392-
DataPageDeleteStackroot,
393-
*ptr,
394-
*tmp;
395-
396-
buffer=ReadBufferExtended(gvs->index,MAIN_FORKNUM,blkno,
397-
RBM_NORMAL,gvs->strategy);
398-
LockBufferForCleanup(buffer);
399-
400-
memset(&root,0,sizeof(DataPageDeleteStack));
401-
root.leftBlkno=InvalidBlockNumber;
402-
root.isRoot= TRUE;
394+
/*
395+
* Lock posting tree root for cleanup to ensure there are no concurrent
396+
* inserts.
397+
*/
398+
LockBufferForCleanup(buffer);
403399

404-
ginScanToDelete(gvs,blkno, TRUE,&root,InvalidOffsetNumber);
400+
memset(&root,0,sizeof(DataPageDeleteStack));
401+
root.leftBlkno=InvalidBlockNumber;
402+
root.isRoot= true;
405403

406-
ptr=root.child;
404+
ginScanToDelete(gvs,rootBlkno, true,&root,InvalidOffsetNumber);
407405

408-
while (ptr)
409-
{
410-
tmp=ptr->child;
411-
pfree(ptr);
412-
ptr=tmp;
413-
}
406+
ptr=root.child;
414407

415-
UnlockReleaseBuffer(buffer);
408+
while (ptr)
409+
{
410+
tmp=ptr->child;
411+
pfree(ptr);
412+
ptr=tmp;
416413
}
417414

418-
/* Here we have deleted all empty subtrees */
419-
return FALSE;
415+
UnlockReleaseBuffer(buffer);
420416
}
421417
}
422418

423-
staticvoid
424-
ginVacuumPostingTree(GinVacuumState*gvs,BlockNumberrootBlkno)
425-
{
426-
ginVacuumPostingTreeLeaves(gvs,rootBlkno, TRUE);
427-
}
428-
429419
/*
430420
* returns modified page or NULL if page isn't modified.
431421
* Function works with original page until first change is occurred,

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp