Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit5c8c7c7

Browse files
committed
Fix assorted bugs in CREATE INDEX CONCURRENTLY.
This patch changes CREATE INDEX CONCURRENTLY so that the pg_indexflag changes it makes without exclusive lock on the index are made viaheap_inplace_update() rather than a normal transactional update. Thelatter is not very safe because moving the pg_index tuple could result inconcurrent SnapshotNow scans finding it twice or not at all, thus possiblyresulting in index corruption.In addition, fix various places in the code that ought to check to makesure that the indexes they are manipulating are valid and/or ready asappropriate. These represent bugs that have existed since 8.2, sincea failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalidindex behind, and we ought not try to do anything that might fail withsuch an index.Also fix RelationReloadIndexInfo to ensure it copies all the pg_indexcolumns that are allowed to change after initial creation. Previously wecould have been left with stale values of some fields in an index relcacheentry. It's not clear whether this actually had any user-visibleconsequences, but it's at least a bug waiting to happen.This is a subset of a patch already applied in 9.2 and HEAD. Back-patchinto all earlier supported branches.Tom Lane and Andres Freund
1 parentfc9b2d0 commit5c8c7c7

File tree

11 files changed

+173
-57
lines changed

11 files changed

+173
-57
lines changed

‎src/backend/access/heap/README.HOT

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -404,6 +404,12 @@ from the index, as well as ensuring that no one can see any inconsistent
404404
rows in a broken HOT chain (the first condition is stronger than the
405405
second). Finally, we can mark the index valid for searches.
406406

407+
Note that we do not need to set pg_index.indcheckxmin in this code path,
408+
because we have outwaited any transactions that would need to avoid using
409+
the index. (indcheckxmin is only needed because non-concurrent CREATE
410+
INDEX doesn't want to wait; its stronger lock would create too much risk of
411+
deadlock if it did.)
412+
407413

408414
Limitations and Restrictions
409415
----------------------------

‎src/backend/catalog/index.c

Lines changed: 84 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -995,7 +995,7 @@ BuildIndexInfo(Relation index)
995995

996996
/* other info */
997997
ii->ii_Unique=indexStruct->indisunique;
998-
ii->ii_ReadyForInserts=indexStruct->indisready;
998+
ii->ii_ReadyForInserts=IndexIsReady(indexStruct);
999999

10001000
/* initialize index-build state to default */
10011001
ii->ii_Concurrent= false;
@@ -1397,8 +1397,20 @@ index_build(Relation heapRelation,
13971397
* index's usability horizon. Moreover, we *must not* try to change
13981398
* the index's pg_index entry while reindexing pg_index itself, and this
13991399
* optimization nicely prevents that.
1400-
*/
1401-
if (indexInfo->ii_BrokenHotChain&& !isreindex)
1400+
*
1401+
* We also need not set indcheckxmin during a concurrent index build,
1402+
* because we won't set indisvalid true until all transactions that care
1403+
* about the broken HOT chains are gone.
1404+
*
1405+
* Therefore, this code path can only be taken during non-concurrent
1406+
* CREATE INDEX. Thus the fact that heap_update will set the pg_index
1407+
* tuple's xmin doesn't matter, because that tuple was created in the
1408+
* current transaction anyway.That also means we don't need to worry
1409+
* about any concurrent readers of the tuple; no other transaction can see
1410+
* it yet.
1411+
*/
1412+
if (indexInfo->ii_BrokenHotChain&& !isreindex&&
1413+
!indexInfo->ii_Concurrent)
14021414
{
14031415
OidindexId=RelationGetRelid(indexRelation);
14041416
Relationpg_index;
@@ -2208,6 +2220,66 @@ validate_index_heapscan(Relation heapRelation,
22082220
}
22092221

22102222

2223+
/*
2224+
* index_set_state_flags - adjust pg_index state flags
2225+
*
2226+
* This is used during CREATE INDEX CONCURRENTLY to adjust the pg_index
2227+
* flags that denote the index's state. We must use an in-place update of
2228+
* the pg_index tuple, because we do not have exclusive lock on the parent
2229+
* table and so other sessions might concurrently be doing SnapshotNow scans
2230+
* of pg_index to identify the table's indexes. A transactional update would
2231+
* risk somebody not seeing the index at all. Because the update is not
2232+
* transactional and will not roll back on error, this must only be used as
2233+
* the last step in a transaction that has not made any transactional catalog
2234+
* updates!
2235+
*
2236+
* Note that heap_inplace_update does send a cache inval message for the
2237+
* tuple, so other sessions will hear about the update as soon as we commit.
2238+
*/
2239+
void
2240+
index_set_state_flags(OidindexId,IndexStateFlagsActionaction)
2241+
{
2242+
Relationpg_index;
2243+
HeapTupleindexTuple;
2244+
Form_pg_indexindexForm;
2245+
2246+
/* Assert that current xact hasn't done any transactional updates */
2247+
Assert(GetTopTransactionIdIfAny()==InvalidTransactionId);
2248+
2249+
/* Open pg_index and fetch a writable copy of the index's tuple */
2250+
pg_index=heap_open(IndexRelationId,RowExclusiveLock);
2251+
2252+
indexTuple=SearchSysCacheCopy(INDEXRELID,
2253+
ObjectIdGetDatum(indexId),
2254+
0,0,0);
2255+
if (!HeapTupleIsValid(indexTuple))
2256+
elog(ERROR,"cache lookup failed for index %u",indexId);
2257+
indexForm= (Form_pg_index)GETSTRUCT(indexTuple);
2258+
2259+
/* Perform the requested state change on the copy */
2260+
switch (action)
2261+
{
2262+
caseINDEX_CREATE_SET_READY:
2263+
/* Set indisready during a CREATE INDEX CONCURRENTLY sequence */
2264+
Assert(!indexForm->indisready);
2265+
Assert(!indexForm->indisvalid);
2266+
indexForm->indisready= true;
2267+
break;
2268+
caseINDEX_CREATE_SET_VALID:
2269+
/* Set indisvalid during a CREATE INDEX CONCURRENTLY sequence */
2270+
Assert(indexForm->indisready);
2271+
Assert(!indexForm->indisvalid);
2272+
indexForm->indisvalid= true;
2273+
break;
2274+
}
2275+
2276+
/* ... and write it back in-place */
2277+
heap_inplace_update(pg_index,indexTuple);
2278+
2279+
heap_close(pg_index,RowExclusiveLock);
2280+
}
2281+
2282+
22112283
/*
22122284
* IndexGetRelation: given an index's relation OID, get the OID of the
22132285
* relation it is an index on.Uses the system cache.
@@ -2357,6 +2429,15 @@ reindex_index(Oid indexId)
23572429
indexForm->indisready= true;
23582430
simple_heap_update(pg_index,&indexTuple->t_self,indexTuple);
23592431
CatalogUpdateIndexes(pg_index,indexTuple);
2432+
2433+
/*
2434+
* Invalidate the relcache for the table, so that after we commit
2435+
* all sessions will refresh the table's index list. This ensures
2436+
* that if anyone misses seeing the pg_index row during this
2437+
* update, they'll refresh their list before attempting any update
2438+
* on the table.
2439+
*/
2440+
CacheInvalidateRelcache(heapRelation);
23602441
}
23612442
heap_close(pg_index,RowExclusiveLock);
23622443

‎src/backend/commands/cluster.c

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -431,7 +431,7 @@ check_index_is_clusterable(Relation OldHeap, Oid indexOid, bool recheck)
431431
* might put recently-dead tuples out-of-order in the new table, and there
432432
* is little harm in that.)
433433
*/
434-
if (!OldIndex->rd_index->indisvalid)
434+
if (!IndexIsValid(OldIndex->rd_index))
435435
ereport(ERROR,
436436
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
437437
errmsg("cannot cluster on invalid index \"%s\"",
@@ -473,6 +473,11 @@ check_index_is_clusterable(Relation OldHeap, Oid indexOid, bool recheck)
473473
* mark_index_clustered: mark the specified index as the one clustered on
474474
*
475475
* With indexOid == InvalidOid, will mark all indexes of rel not-clustered.
476+
*
477+
* Note: we do transactional updates of the pg_index rows, which are unsafe
478+
* against concurrent SnapshotNow scans of pg_index. Therefore this is unsafe
479+
* to execute with less than full exclusive lock on the parent table;
480+
* otherwise concurrent executions of RelationGetIndexList could miss indexes.
476481
*/
477482
void
478483
mark_index_clustered(Relationrel,OidindexOid)
@@ -533,6 +538,9 @@ mark_index_clustered(Relation rel, Oid indexOid)
533538
}
534539
elseif (thisIndexOid==indexOid)
535540
{
541+
/* this was checked earlier, but let's be real sure */
542+
if (!IndexIsValid(indexForm))
543+
elog(ERROR,"cannot cluster on invalid index %u",indexOid);
536544
indexForm->indisclustered= true;
537545
simple_heap_update(pg_index,&indexTuple->t_self,indexTuple);
538546
CatalogUpdateIndexes(pg_index,indexTuple);

‎src/backend/commands/indexcmds.c

Lines changed: 3 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -131,9 +131,6 @@ DefineIndex(RangeVar *heapRelation,
131131
LockRelIdheaprelid;
132132
LOCKTAGheaplocktag;
133133
Snapshotsnapshot;
134-
Relationpg_index;
135-
HeapTupleindexTuple;
136-
Form_pg_indexindexForm;
137134

138135
/*
139136
* count attributes in index
@@ -559,24 +556,7 @@ DefineIndex(RangeVar *heapRelation,
559556
* commit this transaction, any new transactions that open the table must
560557
* insert new entries into the index for insertions and non-HOT updates.
561558
*/
562-
pg_index=heap_open(IndexRelationId,RowExclusiveLock);
563-
564-
indexTuple=SearchSysCacheCopy(INDEXRELID,
565-
ObjectIdGetDatum(indexRelationId),
566-
0,0,0);
567-
if (!HeapTupleIsValid(indexTuple))
568-
elog(ERROR,"cache lookup failed for index %u",indexRelationId);
569-
indexForm= (Form_pg_index)GETSTRUCT(indexTuple);
570-
571-
Assert(!indexForm->indisready);
572-
Assert(!indexForm->indisvalid);
573-
574-
indexForm->indisready= true;
575-
576-
simple_heap_update(pg_index,&indexTuple->t_self,indexTuple);
577-
CatalogUpdateIndexes(pg_index,indexTuple);
578-
579-
heap_close(pg_index,RowExclusiveLock);
559+
index_set_state_flags(indexRelationId,INDEX_CREATE_SET_READY);
580560

581561
/*
582562
* Commit this transaction to make the indisready update visible.
@@ -654,32 +634,15 @@ DefineIndex(RangeVar *heapRelation,
654634
/*
655635
* Index can now be marked valid -- update its pg_index entry
656636
*/
657-
pg_index=heap_open(IndexRelationId,RowExclusiveLock);
658-
659-
indexTuple=SearchSysCacheCopy(INDEXRELID,
660-
ObjectIdGetDatum(indexRelationId),
661-
0,0,0);
662-
if (!HeapTupleIsValid(indexTuple))
663-
elog(ERROR,"cache lookup failed for index %u",indexRelationId);
664-
indexForm= (Form_pg_index)GETSTRUCT(indexTuple);
665-
666-
Assert(indexForm->indisready);
667-
Assert(!indexForm->indisvalid);
668-
669-
indexForm->indisvalid= true;
670-
671-
simple_heap_update(pg_index,&indexTuple->t_self,indexTuple);
672-
CatalogUpdateIndexes(pg_index,indexTuple);
673-
674-
heap_close(pg_index,RowExclusiveLock);
637+
index_set_state_flags(indexRelationId,INDEX_CREATE_SET_VALID);
675638

676639
/*
677640
* The pg_index update will cause backends (including this one) to update
678641
* relcache entries for the index itself, but we should also send a
679642
* relcache inval on the parent table to force replanning of cached plans.
680643
* Otherwise existing sessions might fail to use the new index where it
681644
* would be useful. (Note that our earlier commits did not create reasons
682-
* to replan; relcache flush on the index itself was sufficient.)
645+
* to replan;sorelcache flush on the index itself was sufficient.)
683646
*/
684647
CacheInvalidateRelcacheByRelid(heaprelid.relId);
685648

‎src/backend/commands/tablecmds.c

Lines changed: 7 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -3228,6 +3228,8 @@ ATExecDropNotNull(Relation rel, const char *colName)
32283228

32293229
/*
32303230
* Check that the attribute is not in a primary key
3231+
*
3232+
* Note: we'll throw error even if the pkey index is not valid.
32313233
*/
32323234

32333235
/* Loop over all indexes on the relation */
@@ -4226,7 +4228,7 @@ transformFkeyGetPrimaryKey(Relation pkrel, Oid *indexOid,
42264228
/*
42274229
* Get the list of index OIDs for the table from the relcache, and look up
42284230
* each one in the pg_index syscache until we find one marked primary key
4229-
* (hopefully there isn't more than one such).
4231+
* (hopefully there isn't more than one such). Insist it's valid, too.
42304232
*/
42314233
*indexOid=InvalidOid;
42324234

@@ -4242,7 +4244,7 @@ transformFkeyGetPrimaryKey(Relation pkrel, Oid *indexOid,
42424244
if (!HeapTupleIsValid(indexTuple))
42434245
elog(ERROR,"cache lookup failed for index %u",indexoid);
42444246
indexStruct= (Form_pg_index)GETSTRUCT(indexTuple);
4245-
if (indexStruct->indisprimary)
4247+
if (indexStruct->indisprimary&&IndexIsValid(indexStruct))
42464248
{
42474249
*indexOid=indexoid;
42484250
break;
@@ -4330,10 +4332,12 @@ transformFkeyCheckAttrs(Relation pkrel,
43304332

43314333
/*
43324334
* Must have the right number of columns; must be unique and not a
4333-
* partial index; forget it if there are any expressions, too
4335+
* partial index; forget it if there are any expressions, too. Invalid
4336+
* indexes are out as well.
43344337
*/
43354338
if (indexStruct->indnatts==numattrs&&
43364339
indexStruct->indisunique&&
4340+
IndexIsValid(indexStruct)&&
43374341
heap_attisnull(indexTuple,Anum_pg_index_indpred)&&
43384342
heap_attisnull(indexTuple,Anum_pg_index_indexprs))
43394343
{

‎src/backend/commands/vacuum.c

Lines changed: 23 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -3672,9 +3672,16 @@ vac_cmp_vtlinks(const void *left, const void *right)
36723672

36733673

36743674
/*
3675-
* Open all the indexes of the given relation, obtaining the specified kind
3676-
* of lock on each. Return an array of Relation pointers for the indexes
3677-
* into *Irel, and the number of indexes into *nindexes.
3675+
* Open all the vacuumable indexes of the given relation, obtaining the
3676+
* specified kind of lock on each.Return an array of Relation pointers for
3677+
* the indexes into *Irel, and the number of indexes into *nindexes.
3678+
*
3679+
* We consider an index vacuumable if it is marked insertable (IndexIsReady).
3680+
* If it isn't, probably a CREATE INDEX CONCURRENTLY command failed early in
3681+
* execution, and what we have is too corrupt to be processable. We will
3682+
* vacuum even if the index isn't indisvalid; this is important because in a
3683+
* unique index, uniqueness checks will be performed anyway and had better not
3684+
* hit dangling index pointers.
36783685
*/
36793686
void
36803687
vac_open_indexes(Relationrelation,LOCKMODElockmode,
@@ -3688,21 +3695,30 @@ vac_open_indexes(Relation relation, LOCKMODE lockmode,
36883695

36893696
indexoidlist=RelationGetIndexList(relation);
36903697

3691-
*nindexes=list_length(indexoidlist);
3698+
/* allocate enough memory for all indexes */
3699+
i=list_length(indexoidlist);
36923700

3693-
if (*nindexes>0)
3694-
*Irel= (Relation*)palloc(*nindexes*sizeof(Relation));
3701+
if (i>0)
3702+
*Irel= (Relation*)palloc(i*sizeof(Relation));
36953703
else
36963704
*Irel=NULL;
36973705

3706+
/* collect just the ready indexes */
36983707
i=0;
36993708
foreach(indexoidscan,indexoidlist)
37003709
{
37013710
Oidindexoid=lfirst_oid(indexoidscan);
3711+
Relationindrel;
37023712

3703-
(*Irel)[i++]=index_open(indexoid,lockmode);
3713+
indrel=index_open(indexoid,lockmode);
3714+
if (IndexIsReady(indrel->rd_index))
3715+
(*Irel)[i++]=indrel;
3716+
else
3717+
index_close(indrel,lockmode);
37043718
}
37053719

3720+
*nindexes=i;
3721+
37063722
list_free(indexoidlist);
37073723
}
37083724

‎src/backend/executor/execUtils.c

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -921,6 +921,9 @@ ExecOpenIndices(ResultRelInfo *resultRelInfo)
921921
/*
922922
* For each index, open the index relation and save pg_index info. We
923923
* acquire RowExclusiveLock, signifying we will update the index.
924+
*
925+
* Note: we do this even if the index is not IndexIsReady; it's not worth
926+
* the trouble to optimize for the case where it isn't.
924927
*/
925928
i=0;
926929
foreach(l,indexoidlist)

‎src/backend/optimizer/util/plancat.c

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -159,9 +159,10 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
159159
* Ignore invalid indexes, since they can't safely be used for
160160
* queries. Note that this is OK because the data structure we
161161
* are constructing is only used by the planner --- the executor
162-
* still needs to insert into "invalid" indexes!
162+
* still needs to insert into "invalid" indexes, if they're marked
163+
* IndexIsReady.
163164
*/
164-
if (!index->indisvalid)
165+
if (!IndexIsValid(index))
165166
{
166167
index_close(indexRelation,NoLock);
167168
continue;

‎src/backend/utils/cache/relcache.c

Lines changed: 18 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1700,9 +1700,20 @@ RelationReloadIndexInfo(Relation relation)
17001700
RelationGetRelid(relation));
17011701
index= (Form_pg_index)GETSTRUCT(tuple);
17021702

1703+
/*
1704+
* Basically, let's just copy all the bool fields. There are one or
1705+
* two of these that can't actually change in the current code, but
1706+
* it's not worth it to track exactly which ones they are. None of
1707+
* the array fields are allowed to change, though.
1708+
*/
1709+
relation->rd_index->indisunique=index->indisunique;
1710+
relation->rd_index->indisprimary=index->indisprimary;
1711+
relation->rd_index->indisclustered=index->indisclustered;
17031712
relation->rd_index->indisvalid=index->indisvalid;
17041713
relation->rd_index->indcheckxmin=index->indcheckxmin;
17051714
relation->rd_index->indisready=index->indisready;
1715+
1716+
/* Copy xmin too, as that is needed to make sense of indcheckxmin */
17061717
HeapTupleHeaderSetXmin(relation->rd_indextuple->t_data,
17071718
HeapTupleHeaderGetXmin(tuple->t_data));
17081719

@@ -3077,7 +3088,8 @@ RelationGetIndexList(Relation relation)
30773088
result=insert_ordered_oid(result,index->indexrelid);
30783089

30793090
/* Check to see if it is a unique, non-partial btree index on OID */
3080-
if (index->indnatts==1&&
3091+
if (IndexIsValid(index)&&
3092+
index->indnatts==1&&
30813093
index->indisunique&&
30823094
index->indkey.values[0]==ObjectIdAttributeNumber&&
30833095
index->indclass.values[0]==OID_BTREE_OPS_OID&&
@@ -3384,6 +3396,11 @@ RelationGetIndexAttrBitmap(Relation relation)
33843396

33853397
/*
33863398
* For each index, add referenced attributes to indexattrs.
3399+
*
3400+
* Note: we consider all indexes returned by RelationGetIndexList, even if
3401+
* they are not indisready or indisvalid. This is important because an
3402+
* index for which CREATE INDEX CONCURRENTLY has just started must be
3403+
* included in HOT-safety decisions (see README.HOT).
33873404
*/
33883405
indexattrs=NULL;
33893406
foreach(l,indexoidlist)

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp