Commit ed9cc2b

Fix bogus concurrent use of _hash_getnewbuf() in bucket split code.

_hash_splitbucket() obtained the base page of the new bucket by calling _hash_getnewbuf(), but it held no exclusive lock that would prevent some other process from calling _hash_getnewbuf() at the same time. This is contrary to _hash_getnewbuf()'s API spec and could in fact cause failures. In practice, we must only call that function while holding write lock on the hash index's metapage.

An additional problem was that we'd already modified the metapage's bucket mapping data, meaning that failure to extend the index would leave us with a corrupt index.

Fix both issues by moving the _hash_getnewbuf() call to just before we modify the metapage in _hash_expandtable().

Unfortunately there's still a large problem here, which is that we could also incur ENOSPC while trying to get an overflow page for the new bucket. That would leave the index corrupt in a more subtle way, namely that some index tuples that should be in the new bucket might still be in the old one. Fixing that seems substantially more difficult; even preallocating as many pages as we could possibly need wouldn't entirely guarantee that the bucket split would complete successfully. So for today let's just deal with the base case.

Per report from Antonin Houska. Back-patch to all active branches.
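In code terms, the change reorders the steps in _hash_expandtable(): the new bucket's primary page is now allocated with _hash_getnewbuf() while the metapage write lock is still held, and only afterwards is the metapage's bucket mapping updated. A condensed sketch of the corrected ordering, paraphrasing the diff below (not the literal source; surrounding details are elided):

    /* metapage is pinned and write-locked at this point */

    /*
     * Physically allocate the new bucket's primary page before touching the
     * metapage's mapping info, so a failure to extend the relation (e.g.
     * ENOSPC) leaves the index consistent.  _hash_getnewbuf() may only be
     * called while holding the metapage write lock, which serializes index
     * extension.
     */
    buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM);

    /* ... update the metapage's bucket mapping fields ... */

    /* release the metapage write lock, keeping only the pin */
    _hash_chgbufaccess(rel, metabuf, HASH_WRITE, HASH_NOLOCK);

    /* relocate tuples, handing over the already-allocated bucket page */
    _hash_splitbucket(rel, metabuf, buf_nblkno,
                      old_bucket, new_bucket,
                      start_oblkno, start_nblkno,
                      maxbucket, highmask, lowmask);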
1 parent 97690ea  commit ed9cc2b

File tree

1 file changed: 26 additions, 4 deletions


src/backend/access/hash/hashpage.c

Lines changed: 26 additions & 4 deletions
@@ -37,6 +37,7 @@
 static bool _hash_alloc_buckets(Relation rel, BlockNumber firstblock,
                     uint32 nblocks);
 static void _hash_splitbucket(Relation rel, Buffer metabuf,
+                  Buffer nbuf,
                   Bucket obucket, Bucket nbucket,
                   BlockNumber start_oblkno,
                   BlockNumber start_nblkno,
@@ -176,7 +177,9 @@ _hash_getinitbuf(Relation rel, BlockNumber blkno)
  *      EOF but before updating the metapage to reflect the added page.)
  *
  *      It is caller's responsibility to ensure that only one process can
- *      extend the index at a time.
+ *      extend the index at a time.  In practice, this function is called
+ *      only while holding write lock on the metapage, because adding a page
+ *      is always associated with an update of metapage data.
  */
 Buffer
 _hash_getnewbuf(Relation rel, BlockNumber blkno, ForkNumber forkNum)
@@ -503,6 +506,7 @@ _hash_expandtable(Relation rel, Buffer metabuf)
     uint32      spare_ndx;
     BlockNumber start_oblkno;
     BlockNumber start_nblkno;
+    Buffer      buf_nblkno;
     uint32      maxbucket;
     uint32      highmask;
     uint32      lowmask;
@@ -603,6 +607,13 @@ _hash_expandtable(Relation rel, Buffer metabuf)
         }
     }
 
+    /*
+     * Physically allocate the new bucket's primary page.  We want to do this
+     * before changing the metapage's mapping info, in case we can't get the
+     * disk space.
+     */
+    buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM);
+
     /*
      * Okay to proceed with split.  Update the metapage bucket mapping info.
      *
@@ -653,7 +664,8 @@ _hash_expandtable(Relation rel, Buffer metabuf)
     _hash_chgbufaccess(rel, metabuf, HASH_WRITE, HASH_NOLOCK);
 
     /* Relocate records to the new bucket */
-    _hash_splitbucket(rel, metabuf, old_bucket, new_bucket,
+    _hash_splitbucket(rel, metabuf, buf_nblkno,
+                      old_bucket, new_bucket,
                       start_oblkno, start_nblkno,
                       maxbucket, highmask, lowmask);
 
@@ -733,10 +745,16 @@ _hash_alloc_buckets(Relation rel, BlockNumber firstblock, uint32 nblocks)
  * The caller must hold a pin, but no lock, on the metapage buffer.
  * The buffer is returned in the same state.  (The metapage is only
  * touched if it becomes necessary to add or remove overflow pages.)
+ *
+ * In addition, the caller must have created the new bucket's base page,
+ * which is passed in buffer nbuf, pinned and write-locked.  The lock
+ * and pin are released here.  (The API is set up this way because we must
+ * do _hash_getnewbuf() before releasing the metapage write lock.)
  */
 static void
 _hash_splitbucket(Relation rel,
                   Buffer metabuf,
+                  Buffer nbuf,
                   Bucket obucket,
                   Bucket nbucket,
                   BlockNumber start_oblkno,
@@ -748,7 +766,6 @@ _hash_splitbucket(Relation rel,
     BlockNumber oblkno;
     BlockNumber nblkno;
     Buffer      obuf;
-    Buffer      nbuf;
     Page        opage;
     Page        npage;
     HashPageOpaque oopaque;
@@ -765,7 +782,7 @@ _hash_splitbucket(Relation rel,
     oopaque = (HashPageOpaque) PageGetSpecialPointer(opage);
 
     nblkno = start_nblkno;
-    nbuf = _hash_getnewbuf(rel, nblkno, MAIN_FORKNUM);
+    Assert(nblkno == BufferGetBlockNumber(nbuf));
     npage = BufferGetPage(nbuf);
 
     /* initialize the new bucket's primary page */
@@ -814,6 +831,11 @@ _hash_splitbucket(Relation rel,
          * insert the tuple into the new bucket.  if it doesn't fit on
          * the current page in the new bucket, we must allocate a new
          * overflow page and place the tuple on that page instead.
+         *
+         * XXX we have a problem here if we fail to get space for a
+         * new overflow page: we'll error out leaving the bucket split
+         * only partially complete, meaning the index is corrupt,
+         * since searches may fail to find entries they should find.
          */
         itemsz = IndexTupleDSize(*itup);
         itemsz = MAXALIGN(itemsz);

