Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit851a26e

Browse files
committed
While vacuuming a large table, update upper-level FSM data every so often.
VACUUM updates leaf-level FSM entries immediately after cleaning thecorresponding heap blocks. fsmpage.c updates the intra-page search treeson the leaf-level FSM pages when this happens, but it does not touch theupper-level FSM pages, so that the released space might not actually befindable by searchers. Previously, updating the upper-level pages happenedonly at the conclusion of the VACUUM run, in a single FreeSpaceMapVacuum()call. This is bad because the VACUUM might get canceled before everreaching that point, so that from the point of view of searchers no spacehas been freed at all, leading to table bloat.We can improve matters by updating the upper pages immediately after eachcycle of index-cleaning and heap-cleaning, processing just the FSM pagescorresponding to the range of heap blocks we have now fully cleaned.This adds a small amount of extra work, since the FSM pages leading downto each range boundary will be touched twice, but it's pretty negligiblecompared to everything else going on in a large VACUUM.If there are no indexes, VACUUM doesn't work in cycles but just cleanseach heap page on first visit. In that case we just arbitrarily updateupper FSM pages after each 8GB of heap. That maintains the goal of notletting all this work slide until the very end, and it doesn't seem worthexpending extra complexity on a case that so seldom occurs in practice.In either case, the FSM is fully up to date before any attempt is madeto truncate the relation, so that the most likely scenario for VACUUMcancellation no longer results in out-of-date upper FSM pages. Whenwe do successfully truncate, adjusting the FSM to reflect that is nowfully handled within FreeSpaceMapTruncateRel.Claudio Freire, reviewed by Masahiko Sawada and Jing Wang, some additionaltweaks by meDiscussion:https://postgr.es/m/CAGTBQpYR0uJCNTt3M5GOzBRHo+-GccNO1nCaQ8yEJmZKSW5q1A@mail.gmail.com
1 parentc0cbe00 commit851a26e

File tree

4 files changed

+141
-23
lines changed

4 files changed

+141
-23
lines changed

‎src/backend/commands/vacuumlazy.c

Lines changed: 39 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -84,6 +84,15 @@
8484
#defineVACUUM_TRUNCATE_LOCK_WAIT_INTERVAL50/* ms */
8585
#defineVACUUM_TRUNCATE_LOCK_TIMEOUT5000/* ms */
8686

87+
/*
88+
* When a table has no indexes, vacuum the FSM after every 8GB, approximately
89+
* (it won't be exact because we only vacuum FSM after processing a heap page
90+
* that has some removable tuples). When there are indexes, this is ignored,
91+
* and we vacuum FSM after each index/heap cleaning pass.
92+
*/
93+
#defineVACUUM_FSM_EVERY_PAGES \
94+
((BlockNumber) (((uint64) 8 * 1024 * 1024 * 1024) / BLCKSZ))
95+
8796
/*
8897
* Guesstimation of number of dead tuples per page. This is used to
8998
* provide an upper limit to memory allocated when vacuuming small
@@ -285,9 +294,6 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
285294
pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
286295
PROGRESS_VACUUM_PHASE_FINAL_CLEANUP);
287296

288-
/* Vacuum the Free Space Map */
289-
FreeSpaceMapVacuum(onerel);
290-
291297
/*
292298
* Update statistics in pg_class.
293299
*
@@ -465,7 +471,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
465471
TransactionIdrelfrozenxid=onerel->rd_rel->relfrozenxid;
466472
TransactionIdrelminmxid=onerel->rd_rel->relminmxid;
467473
BlockNumberempty_pages,
468-
vacuumed_pages;
474+
vacuumed_pages,
475+
next_fsm_block_to_vacuum;
469476
doublenum_tuples,/* total number of nonremovable tuples */
470477
live_tuples,/* live tuples (reltuples estimate) */
471478
tups_vacuumed,/* tuples cleaned up by vacuum */
@@ -501,6 +508,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
501508
relname)));
502509

503510
empty_pages=vacuumed_pages=0;
511+
next_fsm_block_to_vacuum= (BlockNumber)0;
504512
num_tuples=live_tuples=tups_vacuumed=nkeep=nunused=0;
505513

506514
indstats= (IndexBulkDeleteResult**)
@@ -752,6 +760,13 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
752760
vacrelstats->num_dead_tuples=0;
753761
vacrelstats->num_index_scans++;
754762

763+
/*
764+
* Vacuum the Free Space Map to make newly-freed space visible on
765+
* upper-level FSM pages. Note we have not yet processed blkno.
766+
*/
767+
FreeSpaceMapVacuumRange(onerel,next_fsm_block_to_vacuum,blkno);
768+
next_fsm_block_to_vacuum=blkno;
769+
755770
/* Report that we are once again scanning the heap */
756771
pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
757772
PROGRESS_VACUUM_PHASE_SCAN_HEAP);
@@ -1200,6 +1215,19 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
12001215
*/
12011216
vacrelstats->num_dead_tuples=0;
12021217
vacuumed_pages++;
1218+
1219+
/*
1220+
* Periodically do incremental FSM vacuuming to make newly-freed
1221+
* space visible on upper FSM pages. Note: although we've cleaned
1222+
* the current block, we haven't yet updated its FSM entry (that
1223+
* happens further down), so passing end == blkno is correct.
1224+
*/
1225+
if (blkno-next_fsm_block_to_vacuum >=VACUUM_FSM_EVERY_PAGES)
1226+
{
1227+
FreeSpaceMapVacuumRange(onerel,next_fsm_block_to_vacuum,
1228+
blkno);
1229+
next_fsm_block_to_vacuum=blkno;
1230+
}
12031231
}
12041232

12051233
freespace=PageGetHeapFreeSpace(page);
@@ -1368,6 +1396,13 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
13681396
vacrelstats->num_index_scans++;
13691397
}
13701398

1399+
/*
1400+
* Vacuum the remainder of the Free Space Map. We must do this whether or
1401+
* not there were indexes.
1402+
*/
1403+
if (blkno>next_fsm_block_to_vacuum)
1404+
FreeSpaceMapVacuumRange(onerel,next_fsm_block_to_vacuum,blkno);
1405+
13711406
/* report all blocks vacuumed; and that we're cleaning up */
13721407
pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED,blkno);
13731408
pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,

‎src/backend/storage/freespace/README

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -180,13 +180,13 @@ have a corrupted page, with a parent somewhere with too small a value.
180180
Secondly, if we detect corrupted pages while we search, traversing down
181181
the tree. That check will notice if a parent node is set to too high a value.
182182
In both cases, the upper nodes on the page are immediately rebuilt, fixing
183-
the corruption.
183+
the corruption so far as that page is concerned.
184184

185-
Vacuum updates all the bottomlevel pages with correct amount of free space
186-
onthe heap pages,fixing any outdated values there. Afterthe heap and
187-
index passes are done, FreeSpaceMapVacuum is called, and the FSM tree is
188-
scanned in depth-first order. This fixes any discrepancies between upper
189-
and lower level FSMpages.
185+
VACUUM updates all the bottom-levelFSMpages withthecorrect amount of free
186+
spaceoncorresponding heap pages,as it proceeds throughthe heap. This
187+
goes through fsm_set_avail(), so that the upper nodes on those pages are
188+
immediately updated. Periodically, VACUUM calls FreeSpaceMapVacuum[Range]
189+
to propagate the new free-space info into the upperpages of the FSM tree.
190190

191191
TODO
192192
----

‎src/backend/storage/freespace/freespace.c

Lines changed: 94 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -108,7 +108,9 @@ static Size fsm_space_cat_to_avail(uint8 cat);
108108
staticintfsm_set_and_search(Relationrel,FSMAddressaddr,uint16slot,
109109
uint8newValue,uint8minValue);
110110
staticBlockNumberfsm_search(Relationrel,uint8min_cat);
111-
staticuint8fsm_vacuum_page(Relationrel,FSMAddressaddr,bool*eof);
111+
staticuint8fsm_vacuum_page(Relationrel,FSMAddressaddr,
112+
BlockNumberstart,BlockNumberend,
113+
bool*eof);
112114
staticBlockNumberfsm_get_lastblckno(Relationrel,FSMAddressaddr);
113115
staticvoidfsm_update_recursive(Relationrel,FSMAddressaddr,uint8new_cat);
114116

@@ -370,21 +372,48 @@ FreeSpaceMapTruncateRel(Relation rel, BlockNumber nblocks)
370372
*/
371373
if (rel->rd_smgr)
372374
rel->rd_smgr->smgr_fsm_nblocks=new_nfsmblocks;
375+
376+
/*
377+
* Update upper-level FSM pages to account for the truncation. This is
378+
* important because the just-truncated pages were likely marked as
379+
* all-free, and would be preferentially selected.
380+
*/
381+
FreeSpaceMapVacuumRange(rel,nblocks,InvalidBlockNumber);
373382
}
374383

375384
/*
376-
* FreeSpaceMapVacuum - scan and fix any inconsistencies in the FSM
385+
* FreeSpaceMapVacuum - update upper-level pages in the rel's FSM
386+
*
387+
* We assume that the bottom-level pages have already been updated with
388+
* new free-space information.
377389
*/
378390
void
379391
FreeSpaceMapVacuum(Relationrel)
380392
{
381393
booldummy;
382394

383-
/*
384-
* Traverse the tree in depth-first order. The tree is stored physically
385-
* in depth-first order, so this should be pretty I/O efficient.
386-
*/
387-
fsm_vacuum_page(rel,FSM_ROOT_ADDRESS,&dummy);
395+
/* Recursively scan the tree, starting at the root */
396+
(void)fsm_vacuum_page(rel,FSM_ROOT_ADDRESS,
397+
(BlockNumber)0,InvalidBlockNumber,
398+
&dummy);
399+
}
400+
401+
/*
402+
* FreeSpaceMapVacuumRange - update upper-level pages in the rel's FSM
403+
*
404+
* As above, but assume that only heap pages between start and end-1 inclusive
405+
* have new free-space information, so update only the upper-level slots
406+
* covering that block range. end == InvalidBlockNumber is equivalent to
407+
* "all the rest of the relation".
408+
*/
409+
void
410+
FreeSpaceMapVacuumRange(Relationrel,BlockNumberstart,BlockNumberend)
411+
{
412+
booldummy;
413+
414+
/* Recursively scan the tree, starting at the root */
415+
if (end>start)
416+
(void)fsm_vacuum_page(rel,FSM_ROOT_ADDRESS,start,end,&dummy);
388417
}
389418

390419
/******** Internal routines ********/
@@ -783,9 +812,21 @@ fsm_search(Relation rel, uint8 min_cat)
783812

784813
/*
785814
* Recursive guts of FreeSpaceMapVacuum
815+
*
816+
* Examine the FSM page indicated by addr, as well as its children, updating
817+
* upper-level nodes that cover the heap block range from start to end-1.
818+
* (It's okay if end is beyond the actual end of the map.)
819+
* Return the maximum freespace value on this page.
820+
*
821+
* If addr is past the end of the FSM, set *eof_p to true and return 0.
822+
*
823+
* This traverses the tree in depth-first order. The tree is stored
824+
* physically in depth-first order, so this should be pretty I/O efficient.
786825
*/
787826
staticuint8
788-
fsm_vacuum_page(Relationrel,FSMAddressaddr,bool*eof_p)
827+
fsm_vacuum_page(Relationrel,FSMAddressaddr,
828+
BlockNumberstart,BlockNumberend,
829+
bool*eof_p)
789830
{
790831
Bufferbuf;
791832
Pagepage;
@@ -804,23 +845,62 @@ fsm_vacuum_page(Relation rel, FSMAddress addr, bool *eof_p)
804845
page=BufferGetPage(buf);
805846

806847
/*
807-
*Recurseinto children, and fix the information stored about them at
808-
* this level.
848+
*If we're above the bottom level, recurseinto children, and fix the
849+
*information stored about them atthis level.
809850
*/
810851
if (addr.level>FSM_BOTTOM_LEVEL)
811852
{
812-
intslot;
853+
FSMAddressfsm_start,
854+
fsm_end;
855+
uint16fsm_start_slot,
856+
fsm_end_slot;
857+
intslot,
858+
start_slot,
859+
end_slot;
813860
booleof= false;
814861

815-
for (slot=0;slot<SlotsPerFSMPage;slot++)
862+
/*
863+
* Compute the range of slots we need to update on this page, given
864+
* the requested range of heap blocks to consider. The first slot to
865+
* update is the one covering the "start" block, and the last slot is
866+
* the one covering "end - 1". (Some of this work will be duplicated
867+
* in each recursive call, but it's cheap enough to not worry about.)
868+
*/
869+
fsm_start=fsm_get_location(start,&fsm_start_slot);
870+
fsm_end=fsm_get_location(end-1,&fsm_end_slot);
871+
872+
while (fsm_start.level<addr.level)
873+
{
874+
fsm_start=fsm_get_parent(fsm_start,&fsm_start_slot);
875+
fsm_end=fsm_get_parent(fsm_end,&fsm_end_slot);
876+
}
877+
Assert(fsm_start.level==addr.level);
878+
879+
if (fsm_start.logpageno==addr.logpageno)
880+
start_slot=fsm_start_slot;
881+
elseif (fsm_start.logpageno>addr.logpageno)
882+
start_slot=SlotsPerFSMPage;/* shouldn't get here... */
883+
else
884+
start_slot=0;
885+
886+
if (fsm_end.logpageno==addr.logpageno)
887+
end_slot=fsm_end_slot;
888+
elseif (fsm_end.logpageno>addr.logpageno)
889+
end_slot=SlotsPerFSMPage-1;
890+
else
891+
end_slot=-1;/* shouldn't get here... */
892+
893+
for (slot=start_slot;slot <=end_slot;slot++)
816894
{
817895
intchild_avail;
818896

819897
CHECK_FOR_INTERRUPTS();
820898

821899
/* After we hit end-of-file, just clear the rest of the slots */
822900
if (!eof)
823-
child_avail=fsm_vacuum_page(rel,fsm_get_child(addr,slot),&eof);
901+
child_avail=fsm_vacuum_page(rel,fsm_get_child(addr,slot),
902+
start,end,
903+
&eof);
824904
else
825905
child_avail=0;
826906

@@ -835,6 +915,7 @@ fsm_vacuum_page(Relation rel, FSMAddress addr, bool *eof_p)
835915
}
836916
}
837917

918+
/* Now get the maximum value on the page, to return to caller */
838919
max_avail=fsm_get_max_avail(BufferGetPage(buf));
839920

840921
/*

‎src/include/storage/freespace.h

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,8 @@ extern void XLogRecordPageWithFreeSpace(RelFileNode rnode, BlockNumber heapBlk,
3232

3333
externvoidFreeSpaceMapTruncateRel(Relationrel,BlockNumbernblocks);
3434
externvoidFreeSpaceMapVacuum(Relationrel);
35+
externvoidFreeSpaceMapVacuumRange(Relationrel,BlockNumberstart,
36+
BlockNumberend);
3537
externvoidUpdateFreeSpaceMap(Relationrel,
3638
BlockNumberstartBlkNum,
3739
BlockNumberendBlkNum,

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp