Commit 98e03f9
Cope with catcache entries becoming stale during detoasting.
We've long had a policy that any toasted fields in a catalog tuple
should be pulled in-line before entering the tuple in a catalog cache.
However, that requires access to the catalog's toast table, and we'll
typically do AcceptInvalidationMessages while opening the toast table.
So it's possible that the catalog tuple is outdated by the time we
finish detoasting it.  Since no cache entry exists yet, we can't
mark the entry stale during AcceptInvalidationMessages, and instead
we'll press forward and build an apparently-valid cache entry.  The
upshot is that we have a race condition whereby an out-of-date entry
could be made in a backend's catalog cache, and persist there
indefinitely causing indeterminate misbehavior.

To fix, use the existing systable_recheck_tuple code to recheck
whether the catalog tuple is still up-to-date after we finish
detoasting it.  If not, loop around and restart the process of
searching the catalog and constructing cache entries from the top.
The case is rare enough that this shouldn't create any meaningful
performance penalty, even in the SearchCatCacheList case where
we need to tear down and reconstruct the whole list.

Indeed, the case is so rare that AFAICT it doesn't occur during
our regression tests, and there doesn't seem to be any easy way
to build a test that would exercise it reliably.  To allow
testing of the retry code paths, add logic (in USE_ASSERT_CHECKING
builds only) that randomly pretends that the recheck failed about
one time out of a thousand.  This is enough to ensure that we'll
pass through the retry paths during most regression test runs.

By adding an extra level of looping, this commit creates a need
to reindent most of SearchCatCacheMiss and SearchCatCacheList.
I'll do that separately, to allow putting those changes in
.git-blame-ignore-revs.

Patch by me; thanks to Alexander Lakhin for having built a test
case to prove the bug is real, and to Xiaoran Wang for review.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/1393953.1698353013@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAGjhLkOoBEC9mLsnB42d3CO1vcMx71MLSEuigeABbQ8oRdA6gw@mail.gmail.com

1 parent 3386afa  commit 98e03f9

File tree: 1 file changed, 123 additions, 40 deletions


src/backend/utils/cache/catcache.c (123 additions, 40 deletions)
@@ -89,10 +89,10 @@ static void CatCachePrintStats(int code, Datum arg);
 static void CatCacheRemoveCTup(CatCache *cache, CatCTup *ct);
 static void CatCacheRemoveCList(CatCache *cache, CatCList *cl);
 static void CatalogCacheInitializeCache(CatCache *cache);
-static CatCTup *CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp,
+static CatCTup *CatalogCacheCreateEntry(CatCache *cache,
+										HeapTuple ntp, SysScanDesc scandesc,
 										Datum *arguments,
-										uint32 hashValue, Index hashIndex,
-										bool negative);
+										uint32 hashValue, Index hashIndex);
 
 static void CatCacheFreeKeys(TupleDesc tupdesc, int nkeys, int *attnos,
 							 Datum *keys);
@@ -1322,6 +1322,7 @@ SearchCatCacheMiss(CatCache *cache,
 	SysScanDesc scandesc;
 	HeapTuple	ntp;
 	CatCTup    *ct;
+	bool		stale;
 	Datum		arguments[CATCACHE_MAXKEYS];
 
 	/* Initialize local parameter array */
@@ -1330,16 +1331,6 @@ SearchCatCacheMiss(CatCache *cache,
 	arguments[2] = v3;
 	arguments[3] = v4;
 
-	/*
-	 * Ok, need to make a lookup in the relation, copy the scankey and fill
-	 * out any per-call fields.
-	 */
-	memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * nkeys);
-	cur_skey[0].sk_argument = v1;
-	cur_skey[1].sk_argument = v2;
-	cur_skey[2].sk_argument = v3;
-	cur_skey[3].sk_argument = v4;
-
 	/*
 	 * Tuple was not found in cache, so we have to try to retrieve it directly
 	 * from the relation.  If found, we will add it to the cache; if not
@@ -1354,9 +1345,28 @@ SearchCatCacheMiss(CatCache *cache,
 	 * will eventually age out of the cache, so there's no functional problem.
 	 * This case is rare enough that it's not worth expending extra cycles to
 	 * detect.
+	 *
+	 * Another case, which we *must* handle, is that the tuple could become
+	 * outdated during CatalogCacheCreateEntry's attempt to detoast it (since
+	 * AcceptInvalidationMessages can run during TOAST table access).  We do
+	 * not want to return already-stale catcache entries, so we loop around
+	 * and do the table scan again if that happens.
 	 */
 	relation = table_open(cache->cc_reloid, AccessShareLock);
 
+	do
+	{
+		/*
+		 * Ok, need to make a lookup in the relation, copy the scankey and
+		 * fill out any per-call fields.  (We must re-do this when retrying,
+		 * because systable_beginscan scribbles on the scankey.)
+		 */
+		memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * nkeys);
+		cur_skey[0].sk_argument = v1;
+		cur_skey[1].sk_argument = v2;
+		cur_skey[2].sk_argument = v3;
+		cur_skey[3].sk_argument = v4;
+
 	scandesc = systable_beginscan(relation,
 								  cache->cc_indexoid,
 								  IndexScanOK(cache, cur_skey),
@@ -1365,12 +1375,18 @@ SearchCatCacheMiss(CatCache *cache,
 								  cur_skey);
 
 	ct = NULL;
+	stale = false;
 
 	while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
 	{
-		ct = CatalogCacheCreateEntry(cache, ntp, arguments,
-									 hashValue, hashIndex,
-									 false);
+		ct = CatalogCacheCreateEntry(cache, ntp, scandesc, NULL,
+									 hashValue, hashIndex);
+		/* upon failure, we must start the scan over */
+		if (ct == NULL)
+		{
+			stale = true;
+			break;
+		}
 		/* immediately set the refcount to 1 */
 		ResourceOwnerEnlargeCatCacheRefs(CurrentResourceOwner);
 		ct->refcount++;
@@ -1379,6 +1395,7 @@ SearchCatCacheMiss(CatCache *cache,
 	}
 
 	systable_endscan(scandesc);
+	} while (stale);
 
 	table_close(relation, AccessShareLock);
 
@@ -1397,9 +1414,11 @@ SearchCatCacheMiss(CatCache *cache,
 	if (IsBootstrapProcessingMode())
 		return NULL;
 
-	ct = CatalogCacheCreateEntry(cache, NULL, arguments,
-								 hashValue, hashIndex,
-								 true);
+	ct = CatalogCacheCreateEntry(cache, NULL, NULL, arguments,
+								 hashValue, hashIndex);
+
+	/* Creating a negative cache entry shouldn't fail */
+	Assert(ct != NULL);
 
 	CACHE_elog(DEBUG2, "SearchCatCache(%s): Contains %d/%d tuples",
 			   cache->cc_relname, cache->cc_ntup, CacheHdr->ch_ntup);
@@ -1606,7 +1625,8 @@ SearchCatCacheList(CatCache *cache,
 	 * We have to bump the member refcounts temporarily to ensure they won't
 	 * get dropped from the cache while loading other members. We use a PG_TRY
 	 * block to ensure we can undo those refcounts if we get an error before
-	 * we finish constructing the CatCList.
+	 * we finish constructing the CatCList.  ctlist must be valid throughout
+	 * the PG_TRY block.
 	 */
 	ResourceOwnerEnlargeCatCacheListRefs(CurrentResourceOwner);
 
@@ -1617,19 +1637,23 @@ SearchCatCacheList(CatCache *cache,
 		ScanKeyData cur_skey[CATCACHE_MAXKEYS];
 		Relation	relation;
 		SysScanDesc scandesc;
-
-		/*
-		 * Ok, need to make a lookup in the relation, copy the scankey and
-		 * fill out any per-call fields.
-		 */
-		memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * cache->cc_nkeys);
-		cur_skey[0].sk_argument = v1;
-		cur_skey[1].sk_argument = v2;
-		cur_skey[2].sk_argument = v3;
-		cur_skey[3].sk_argument = v4;
+		bool		stale;
 
 		relation = table_open(cache->cc_reloid, AccessShareLock);
 
+		do
+		{
+			/*
+			 * Ok, need to make a lookup in the relation, copy the scankey and
+			 * fill out any per-call fields.  (We must re-do this when
+			 * retrying, because systable_beginscan scribbles on the scankey.)
+			 */
+			memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * cache->cc_nkeys);
+			cur_skey[0].sk_argument = v1;
+			cur_skey[1].sk_argument = v2;
+			cur_skey[2].sk_argument = v3;
+			cur_skey[3].sk_argument = v4;
+
 		scandesc = systable_beginscan(relation,
 									  cache->cc_indexoid,
 									  IndexScanOK(cache, cur_skey),
@@ -1640,6 +1664,8 @@ SearchCatCacheList(CatCache *cache,
 		/* The list will be ordered iff we are doing an index scan */
 		ordered = (scandesc->irel != NULL);
 
+		stale = false;
+
 		while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
 		{
 			uint32		hashValue;
@@ -1682,9 +1708,32 @@ SearchCatCacheList(CatCache *cache,
 			if (!found)
 			{
 				/* We didn't find a usable entry, so make a new one */
-				ct = CatalogCacheCreateEntry(cache, ntp, arguments,
-											 hashValue, hashIndex,
-											 false);
+				ct = CatalogCacheCreateEntry(cache, ntp, scandesc, NULL,
+											 hashValue, hashIndex);
+				/* upon failure, we must start the scan over */
+				if (ct == NULL)
+				{
+					/*
+					 * Release refcounts on any items we already had.  We dare
+					 * not try to free them if they're now unreferenced, since
+					 * an error while doing that would result in the PG_CATCH
+					 * below doing extra refcount decrements.  Besides, we'll
+					 * likely re-adopt those items in the next iteration, so
+					 * it's not worth complicating matters to try to get rid
+					 * of them.
+					 */
+					foreach(ctlist_item, ctlist)
+					{
+						ct = (CatCTup *) lfirst(ctlist_item);
+						Assert(ct->c_list == NULL);
+						Assert(ct->refcount > 0);
+						ct->refcount--;
+					}
+					/* Reset ctlist in preparation for new try */
+					ctlist = NIL;
+					stale = true;
+					break;
+				}
 			}
 
 			/* Careful here: add entry to ctlist, then bump its refcount */
@@ -1694,6 +1743,7 @@ SearchCatCacheList(CatCache *cache,
 		}
 
 		systable_endscan(scandesc);
+		} while (stale);
 
 		table_close(relation, AccessShareLock);
 
@@ -1801,22 +1851,42 @@ ReleaseCatCacheList(CatCList *list)
  * CatalogCacheCreateEntry
  *		Create a new CatCTup entry, copying the given HeapTuple and other
  *		supplied data into it.  The new entry initially has refcount 0.
+ *
+ * To create a normal cache entry, ntp must be the HeapTuple just fetched
+ * from scandesc, and "arguments" is not used.  To create a negative cache
+ * entry, pass NULL for ntp and scandesc; then "arguments" is the cache
+ * keys to use.  In either case, hashValue/hashIndex are the hash values
+ * computed from the cache keys.
+ *
+ * Returns NULL if we attempt to detoast the tuple and observe that it
+ * became stale.  (This cannot happen for a negative entry.)  Caller must
+ * retry the tuple lookup in that case.
  */
 static CatCTup *
-CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
-						uint32 hashValue, Index hashIndex,
-						bool negative)
+CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, SysScanDesc scandesc,
+						Datum *arguments,
+						uint32 hashValue, Index hashIndex)
 {
 	CatCTup    *ct;
 	HeapTuple	dtp;
 	MemoryContext oldcxt;
 
-	/* negative entries have no tuple associated */
 	if (ntp)
 	{
 		int			i;
 
-		Assert(!negative);
+		/*
+		 * The visibility recheck below essentially never fails during our
+		 * regression tests, and there's no easy way to force it to fail for
+		 * testing purposes.  To ensure we have test coverage for the retry
+		 * paths in our callers, make debug builds randomly fail about 0.1% of
+		 * the times through this code path, even when there's no toasted
+		 * fields.
+		 */
+#ifdef USE_ASSERT_CHECKING
+		if (random() <= (MAX_RANDOM_VALUE / 1000))
+			return NULL;
+#endif
 
 		/*
 		 * If there are any out-of-line toasted fields in the tuple, expand
@@ -1826,7 +1896,20 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
 		 * something using a slightly stale catcache entry.
 		 */
 		if (HeapTupleHasExternal(ntp))
+		{
 			dtp = toast_flatten_tuple(ntp, cache->cc_tupdesc);
+
+			/*
+			 * The tuple could become stale while we are doing toast table
+			 * access (since AcceptInvalidationMessages can run then), so we
+			 * must recheck its visibility afterwards.
+			 */
+			if (!systable_recheck_tuple(scandesc, ntp))
+			{
+				heap_freetuple(dtp);
+				return NULL;
+			}
+		}
 		else
 			dtp = ntp;
 
@@ -1865,7 +1948,7 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
 	}
 	else
 	{
-		Assert(negative);
+		/* Set up keys for a negative cache entry */
 		oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
 		ct = (CatCTup *) palloc(sizeof(CatCTup));
 
@@ -1887,7 +1970,7 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
 	ct->c_list = NULL;
 	ct->refcount = 0;			/* for the moment */
 	ct->dead = false;
-	ct->negative = negative;
+	ct->negative = (ntp == NULL);
 	ct->hash_value = hashValue;
 
 	dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
