Commit 9bacec1
Don't overlook indexes during parallel VACUUM.
Commit b4af70c, which simplified state managed by VACUUM, performed
refactoring of parallel VACUUM in passing.  Confusion about the exact
details of the tasks that the leader process is responsible for led to
code that made it possible for parallel VACUUM to miss a subset of the
table's indexes entirely.  Specifically, indexes that fell under the
min_parallel_index_scan_size size cutoff were missed.  These indexes are
supposed to be vacuumed by the leader (alongside any parallel unsafe
indexes), but weren't vacuumed at all.  Affected indexes could easily
end up with duplicate heap TIDs, once heap TIDs were recycled for new
heap tuples.  This had generic symptoms that might be seen with almost
any index corruption involving structural inconsistencies between an
index and its table.

To fix, make sure that the parallel VACUUM leader process performs any
required index vacuuming for indexes that happen to be below the size
cutoff.  Also document the design of parallel VACUUM with these
below-size-cutoff indexes.

It's unclear how many users might be affected by this bug.  There had to
be at least three indexes on the table to hit the bug: a smaller index,
plus at least two additional indexes that themselves exceed the size
cutoff.  Cases with just one additional index would not run into
trouble, since the parallel VACUUM cost model requires two
larger-than-cutoff indexes on the table to apply any parallel
processing.  Note also that autovacuum was not affected, since it never
uses parallel processing.

Test case based on tests from a larger patch to test parallel VACUUM by
Masahiko Sawada.

Many thanks to Kamigishi Rei for her invaluable help with tracking this
problem down.

Author: Peter Geoghegan <pg@bowt.ie>
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-By: Kamigishi Rei <iijima.yun@koumakan.jp>
Reported-By: Andrew Gierth <andrew@tao11.riddles.org.uk>
Diagnosed-By: Andres Freund <andres@anarazel.de>
Bug: #17245
Discussion: https://postgr.es/m/17245-ddf06aaf85735f36@postgresql.org
Discussion: https://postgr.es/m/20211030023740.qbnsl2xaoh2grq3d@alap3.anarazel.de
Backpatch: 14-, where the refactoring commit appears.
1 parent f3d4019, commit 9bacec1

File tree

5 files changed: +132 -26 lines changed

src/backend/access/heap/vacuumlazy.c

Lines changed: 35 additions & 25 deletions

@@ -452,7 +452,7 @@ static bool heap_page_is_all_visible(LVRelState *vacrel, Buffer buf,
 									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
 static int	compute_parallel_vacuum_workers(LVRelState *vacrel,
 											int nrequested,
-											bool *can_parallel_vacuum);
+											bool *will_parallel_vacuum);
 static void update_index_statistics(LVRelState *vacrel);
 static LVParallelState *begin_parallel_vacuum(LVRelState *vacrel,
 											  BlockNumber nblocks,
@@ -2636,8 +2636,8 @@ do_parallel_lazy_vacuum_all_indexes(LVRelState *vacrel)
 		vacrel->lps->lvshared->first_time = false;
 
 	/*
-	 * We can only provide an approximate value of num_heap_tuples in vacuum
-	 * cases.
+	 * We can only provide an approximate value of num_heap_tuples, at least
+	 * for now.  Matches serial VACUUM case.
 	 */
 	vacrel->lps->lvshared->reltuples = vacrel->old_live_tuples;
 	vacrel->lps->lvshared->estimated_count = true;
@@ -2825,7 +2825,7 @@ do_parallel_processing(LVRelState *vacrel, LVShared *lvshared)
 		if (idx >= vacrel->nindexes)
 			break;
 
-		/* Get the index statistics of this index from DSM */
+		/* Get the index statistics space from DSM, if any */
 		shared_istat = parallel_stats_for_idx(lvshared, idx);
 
 		/* Skip indexes not participating in parallelism */
@@ -2858,8 +2858,15 @@ do_parallel_processing(LVRelState *vacrel, LVShared *lvshared)
 }
 
 /*
- * Vacuum or cleanup indexes that can be processed by only the leader process
- * because these indexes don't support parallel operation at that phase.
+ * Perform parallel processing of indexes in leader process.
+ *
+ * Handles index vacuuming (or index cleanup) for indexes that are not
+ * parallel safe.  It's possible that this will vary for a given index, based
+ * on details like whether we're performing for_cleanup processing right now.
+ *
+ * Also performs processing of smaller indexes that fell under the size cutoff
+ * enforced by compute_parallel_vacuum_workers().  These indexes never get a
+ * slot for statistics in DSM.
  */
 static void
 do_serial_processing_for_unsafe_indexes(LVRelState *vacrel, LVShared *lvshared)
@@ -2879,17 +2886,15 @@ do_serial_processing_for_unsafe_indexes(LVRelState *vacrel, LVShared *lvshared)
 		IndexBulkDeleteResult *istat;
 
 		shared_istat = parallel_stats_for_idx(lvshared, idx);
-
-		/* Skip already-complete indexes */
-		if (shared_istat != NULL)
-			continue;
-
 		indrel = vacrel->indrels[idx];
 
 		/*
-		 * We're only here for the unsafe indexes
+		 * We're only here for the indexes that parallel workers won't
+		 * process.  Note that the shared_istat test ensures that we process
+		 * indexes that fell under initial size cutoff.
 		 */
-		if (parallel_processing_is_safe(indrel, lvshared))
+		if (shared_istat != NULL &&
+			parallel_processing_is_safe(indrel, lvshared))
 			continue;
 
 		/* Do vacuum or cleanup of the index */
@@ -3730,12 +3735,12 @@ heap_page_is_all_visible(LVRelState *vacrel, Buffer buf,
  * nrequested is the number of parallel workers that user requested.  If
  * nrequested is 0, we compute the parallel degree based on nindexes, that is
  * the number of indexes that support parallel vacuum.  This function also
- * sets can_parallel_vacuum to remember indexes that participate in parallel
+ * sets will_parallel_vacuum to remember indexes that participate in parallel
  * vacuum.
  */
 static int
 compute_parallel_vacuum_workers(LVRelState *vacrel, int nrequested,
-								bool *can_parallel_vacuum)
+								bool *will_parallel_vacuum)
 {
 	int			nindexes_parallel = 0;
 	int			nindexes_parallel_bulkdel = 0;
@@ -3761,7 +3766,7 @@ compute_parallel_vacuum_workers(LVRelState *vacrel, int nrequested,
 			RelationGetNumberOfBlocks(indrel) < min_parallel_index_scan_size)
 			continue;
 
-		can_parallel_vacuum[idx] = true;
+		will_parallel_vacuum[idx] = true;
 
 		if ((vacoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0)
 			nindexes_parallel_bulkdel++;
@@ -3839,7 +3844,7 @@ begin_parallel_vacuum(LVRelState *vacrel, BlockNumber nblocks,
 	LVDeadTuples *dead_tuples;
 	BufferUsage *buffer_usage;
 	WalUsage   *wal_usage;
-	bool	   *can_parallel_vacuum;
+	bool	   *will_parallel_vacuum;
 	long		maxtuples;
 	Size		est_shared;
 	Size		est_deadtuples;
@@ -3857,15 +3862,15 @@ begin_parallel_vacuum(LVRelState *vacrel, BlockNumber nblocks,
 	/*
 	 * Compute the number of parallel vacuum workers to launch
 	 */
-	can_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
+	will_parallel_vacuum = (bool *) palloc0(sizeof(bool) * nindexes);
 	parallel_workers = compute_parallel_vacuum_workers(vacrel,
 													   nrequested,
-													   can_parallel_vacuum);
+													   will_parallel_vacuum);
 
 	/* Can't perform vacuum in parallel */
 	if (parallel_workers <= 0)
 	{
-		pfree(can_parallel_vacuum);
+		pfree(will_parallel_vacuum);
 		return lps;
 	}
 
@@ -3893,7 +3898,7 @@ begin_parallel_vacuum(LVRelState *vacrel, BlockNumber nblocks,
 		Assert(vacoptions <= VACUUM_OPTION_MAX_VALID_VALUE);
 
 		/* Skip indexes that don't participate in parallel vacuum */
-		if (!can_parallel_vacuum[idx])
+		if (!will_parallel_vacuum[idx])
 			continue;
 
 		if (indrel->rd_indam->amusemaintenanceworkmem)
@@ -3970,7 +3975,7 @@ begin_parallel_vacuum(LVRelState *vacrel, BlockNumber nblocks,
 	memset(shared->bitmap, 0x00, BITMAPLEN(nindexes));
 	for (int idx = 0; idx < nindexes; idx++)
 	{
-		if (!can_parallel_vacuum[idx])
+		if (!will_parallel_vacuum[idx])
 			continue;
 
 		/* Set NOT NULL as this index does support parallelism */
@@ -4013,7 +4018,7 @@ begin_parallel_vacuum(LVRelState *vacrel, BlockNumber nblocks,
 								 PARALLEL_VACUUM_KEY_QUERY_TEXT, sharedquery);
 	}
 
-	pfree(can_parallel_vacuum);
+	pfree(will_parallel_vacuum);
 	return lps;
 }
 
@@ -4043,8 +4048,8 @@ end_parallel_vacuum(LVRelState *vacrel)
 		shared_istat = parallel_stats_for_idx(lps->lvshared, idx);
 
 		/*
-		 * Skip unused slot.  The statistics of this index are already stored
-		 * in local memory.
+		 * Skip index -- it must have been processed by the leader, from
+		 * inside do_serial_processing_for_unsafe_indexes()
 		 */
 		if (shared_istat == NULL)
 			continue;
@@ -4068,6 +4073,11 @@ end_parallel_vacuum(LVRelState *vacrel)
 
 /*
  * Return shared memory statistics for index at offset 'getidx', if any
+ *
+ * Returning NULL indicates that compute_parallel_vacuum_workers() determined
+ * that the index is a totally unsuitable target for all parallel processing
+ * up front.  For example, the index could be < min_parallel_index_scan_size
+ * cutoff.
  */
 static LVSharedIndStats *
 parallel_stats_for_idx(LVShared *lvshared, int getidx)

src/include/commands/vacuum.h

Lines changed: 1 addition & 1 deletion

@@ -40,7 +40,7 @@
 
 /*
  * bulkdelete can be performed in parallel.  This option can be used by
- * IndexAm's that need to scan the index to delete the tuples.
+ * index AMs that need to scan indexes to delete tuples.
  */
 #define VACUUM_OPTION_PARALLEL_BULKDEL	(1 << 0)
Lines changed: 49 additions & 0 deletions

@@ -0,0 +1,49 @@
+SET max_parallel_maintenance_workers TO 4;
+SET min_parallel_index_scan_size TO '128kB';
+-- Bug #17245: Make sure that we don't totally fail to VACUUM individual indexes that
+-- happen to be below min_parallel_index_scan_size during parallel VACUUM:
+CREATE TABLE parallel_vacuum_table (a int) WITH (autovacuum_enabled = off);
+INSERT INTO parallel_vacuum_table SELECT i from generate_series(1, 10000) i;
+-- Parallel VACUUM will never be used unless there are at least two indexes
+-- that exceed min_parallel_index_scan_size.  Create two such indexes, and
+-- a third index that is smaller than min_parallel_index_scan_size.
+CREATE INDEX regular_sized_index ON parallel_vacuum_table(a);
+CREATE INDEX typically_sized_index ON parallel_vacuum_table(a);
+-- Note: vacuum_in_leader_small_index can apply deduplication, making it ~3x
+-- smaller than the other indexes
+CREATE INDEX vacuum_in_leader_small_index ON parallel_vacuum_table((1));
+-- Verify (as best we can) that the cost model for parallel VACUUM
+-- will make our VACUUM run in parallel, while always leaving it up to the
+-- parallel leader to handle the vacuum_in_leader_small_index index:
+SELECT EXISTS (
+SELECT 1
+FROM pg_class
+WHERE oid = 'vacuum_in_leader_small_index'::regclass AND
+      pg_relation_size(oid) <
+      pg_size_bytes(current_setting('min_parallel_index_scan_size'))
+) as leader_will_handle_small_index;
+ leader_will_handle_small_index
+--------------------------------
+ t
+(1 row)
+
+SELECT count(*) as trigger_parallel_vacuum_nindexes
+FROM pg_class
+WHERE oid in ('regular_sized_index'::regclass, 'typically_sized_index'::regclass) AND
+      pg_relation_size(oid) >=
+      pg_size_bytes(current_setting('min_parallel_index_scan_size'));
+ trigger_parallel_vacuum_nindexes
+----------------------------------
+                                2
+(1 row)
+
+-- Parallel VACUUM with B-Tree page deletions, ambulkdelete calls:
+DELETE FROM parallel_vacuum_table;
+VACUUM (PARALLEL 4, INDEX_CLEANUP ON) parallel_vacuum_table;
+-- Since vacuum_in_leader_small_index uses deduplication, we expect an
+-- assertion failure with bug #17245 (in the absence of bugfix):
+INSERT INTO parallel_vacuum_table SELECT i FROM generate_series(1, 10000) i;
+RESET max_parallel_maintenance_workers;
+RESET min_parallel_index_scan_size;
+-- Deliberately don't drop table, to get further coverage from tools like
+-- pg_amcheck in some testing scenarios

src/test/regress/parallel_schedule

Lines changed: 1 addition & 0 deletions

@@ -96,6 +96,7 @@ test: rules psql psql_crosstab amutils stats_ext collate.linux.utf8
 # run by itself so it can run parallel workers
 test: select_parallel
 test: write_parallel
+test: vacuum_parallel
 
 # no relation related tests can be put in this group
 test: publication subscription
Lines changed: 46 additions & 0 deletions

@@ -0,0 +1,46 @@
+SET max_parallel_maintenance_workers TO 4;
+SET min_parallel_index_scan_size TO '128kB';
+
+-- Bug #17245: Make sure that we don't totally fail to VACUUM individual indexes that
+-- happen to be below min_parallel_index_scan_size during parallel VACUUM:
+CREATE TABLE parallel_vacuum_table (a int) WITH (autovacuum_enabled = off);
+INSERT INTO parallel_vacuum_table SELECT i from generate_series(1, 10000) i;
+
+-- Parallel VACUUM will never be used unless there are at least two indexes
+-- that exceed min_parallel_index_scan_size.  Create two such indexes, and
+-- a third index that is smaller than min_parallel_index_scan_size.
+CREATE INDEX regular_sized_index ON parallel_vacuum_table(a);
+CREATE INDEX typically_sized_index ON parallel_vacuum_table(a);
+-- Note: vacuum_in_leader_small_index can apply deduplication, making it ~3x
+-- smaller than the other indexes
+CREATE INDEX vacuum_in_leader_small_index ON parallel_vacuum_table((1));
+
+-- Verify (as best we can) that the cost model for parallel VACUUM
+-- will make our VACUUM run in parallel, while always leaving it up to the
+-- parallel leader to handle the vacuum_in_leader_small_index index:
+SELECT EXISTS (
+SELECT 1
+FROM pg_class
+WHERE oid = 'vacuum_in_leader_small_index'::regclass AND
+      pg_relation_size(oid) <
+      pg_size_bytes(current_setting('min_parallel_index_scan_size'))
+) as leader_will_handle_small_index;
+SELECT count(*) as trigger_parallel_vacuum_nindexes
+FROM pg_class
+WHERE oid in ('regular_sized_index'::regclass, 'typically_sized_index'::regclass) AND
+      pg_relation_size(oid) >=
+      pg_size_bytes(current_setting('min_parallel_index_scan_size'));
+
+-- Parallel VACUUM with B-Tree page deletions, ambulkdelete calls:
+DELETE FROM parallel_vacuum_table;
+VACUUM (PARALLEL 4, INDEX_CLEANUP ON) parallel_vacuum_table;
+
+-- Since vacuum_in_leader_small_index uses deduplication, we expect an
+-- assertion failure with bug #17245 (in the absence of bugfix):
+INSERT INTO parallel_vacuum_table SELECT i FROM generate_series(1, 10000) i;
+
+RESET max_parallel_maintenance_workers;
+RESET min_parallel_index_scan_size;
+
+-- Deliberately don't drop table, to get further coverage from tools like
+-- pg_amcheck in some testing scenarios
