Commit 2a46a0d
Cope with catcache entries becoming stale during detoasting.

We've long had a policy that any toasted fields in a catalog tuple should be pulled in-line before entering the tuple in a catalog cache. However, that requires access to the catalog's toast table, and we'll typically do AcceptInvalidationMessages while opening the toast table. So it's possible that the catalog tuple is outdated by the time we finish detoasting it. Since no cache entry exists yet, we can't mark the entry stale during AcceptInvalidationMessages, and instead we'll press forward and build an apparently-valid cache entry. The upshot is that we have a race condition whereby an out-of-date entry could be made in a backend's catalog cache, and persist there indefinitely causing indeterminate misbehavior.

To fix, use the existing systable_recheck_tuple code to recheck whether the catalog tuple is still up-to-date after we finish detoasting it. If not, loop around and restart the process of searching the catalog and constructing cache entries from the top. The case is rare enough that this shouldn't create any meaningful performance penalty, even in the SearchCatCacheList case where we need to tear down and reconstruct the whole list.

Indeed, the case is so rare that AFAICT it doesn't occur during our regression tests, and there doesn't seem to be any easy way to build a test that would exercise it reliably. To allow testing of the retry code paths, add logic (in USE_ASSERT_CHECKING builds only) that randomly pretends that the recheck failed about one time out of a thousand. This is enough to ensure that we'll pass through the retry paths during most regression test runs.

By adding an extra level of looping, this commit creates a need to reindent most of SearchCatCacheMiss and SearchCatCacheList. I'll do that separately, to allow putting those changes in .git-blame-ignore-revs.

Patch by me; thanks to Alexander Lakhin for having built a test case to prove the bug is real, and to Xiaoran Wang for review. Back-patch to all supported branches.

Discussion: https://postgr.es/m/1393953.1698353013@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAGjhLkOoBEC9mLsnB42d3CO1vcMx71MLSEuigeABbQ8oRdA6gw@mail.gmail.com
1 parent 0765308 · commit 2a46a0d

File tree: 1 file changed (+124 −40 lines)

‎src/backend/utils/cache/catcache.c

Lines changed: 124 additions & 40 deletions

@@ -25,6 +25,7 @@
 #include "catalog/pg_operator.h"
 #include "catalog/pg_type.h"
 #include "common/hashfn.h"
+#include "common/pg_prng.h"
 #include "miscadmin.h"
 #include "port/pg_bitutils.h"
 #ifdef CATCACHE_STATS
@@ -90,10 +91,10 @@ static void CatCachePrintStats(int code, Datum arg);
 static void CatCacheRemoveCTup(CatCache *cache, CatCTup *ct);
 static void CatCacheRemoveCList(CatCache *cache, CatCList *cl);
 static void CatalogCacheInitializeCache(CatCache *cache);
-static CatCTup *CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp,
+static CatCTup *CatalogCacheCreateEntry(CatCache *cache,
+                                        HeapTuple ntp, SysScanDesc scandesc,
                                         Datum *arguments,
-                                        uint32 hashValue, Index hashIndex,
-                                        bool negative);
+                                        uint32 hashValue, Index hashIndex);
 
 static void CatCacheFreeKeys(TupleDesc tupdesc, int nkeys, int *attnos,
                              Datum *keys);
@@ -1318,6 +1319,7 @@ SearchCatCacheMiss(CatCache *cache,
     SysScanDesc scandesc;
     HeapTuple   ntp;
     CatCTup    *ct;
+    bool        stale;
     Datum       arguments[CATCACHE_MAXKEYS];
 
     /* Initialize local parameter array */
@@ -1326,16 +1328,6 @@ SearchCatCacheMiss(CatCache *cache,
     arguments[2] = v3;
     arguments[3] = v4;
 
-    /*
-     * Ok, need to make a lookup in the relation, copy the scankey and fill
-     * out any per-call fields.
-     */
-    memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * nkeys);
-    cur_skey[0].sk_argument = v1;
-    cur_skey[1].sk_argument = v2;
-    cur_skey[2].sk_argument = v3;
-    cur_skey[3].sk_argument = v4;
-
     /*
      * Tuple was not found in cache, so we have to try to retrieve it directly
      * from the relation.  If found, we will add it to the cache; if not
@@ -1350,9 +1342,28 @@ SearchCatCacheMiss(CatCache *cache,
      * will eventually age out of the cache, so there's no functional problem.
      * This case is rare enough that it's not worth expending extra cycles to
      * detect.
+     *
+     * Another case, which we *must* handle, is that the tuple could become
+     * outdated during CatalogCacheCreateEntry's attempt to detoast it (since
+     * AcceptInvalidationMessages can run during TOAST table access).  We do
+     * not want to return already-stale catcache entries, so we loop around
+     * and do the table scan again if that happens.
      */
     relation = table_open(cache->cc_reloid, AccessShareLock);
 
+    do
+    {
+        /*
+         * Ok, need to make a lookup in the relation, copy the scankey and
+         * fill out any per-call fields.  (We must re-do this when retrying,
+         * because systable_beginscan scribbles on the scankey.)
+         */
+        memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * nkeys);
+        cur_skey[0].sk_argument = v1;
+        cur_skey[1].sk_argument = v2;
+        cur_skey[2].sk_argument = v3;
+        cur_skey[3].sk_argument = v4;
+
     scandesc = systable_beginscan(relation,
                                   cache->cc_indexoid,
                                   IndexScanOK(cache, cur_skey),
@@ -1361,12 +1372,18 @@ SearchCatCacheMiss(CatCache *cache,
                                   cur_skey);
 
     ct = NULL;
+    stale = false;
 
     while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
     {
-        ct = CatalogCacheCreateEntry(cache, ntp, arguments,
-                                     hashValue, hashIndex,
-                                     false);
+        ct = CatalogCacheCreateEntry(cache, ntp, scandesc, NULL,
+                                     hashValue, hashIndex);
+        /* upon failure, we must start the scan over */
+        if (ct == NULL)
+        {
+            stale = true;
+            break;
+        }
         /* immediately set the refcount to 1 */
         ResourceOwnerEnlargeCatCacheRefs(CurrentResourceOwner);
         ct->refcount++;
@@ -1375,6 +1392,7 @@ SearchCatCacheMiss(CatCache *cache,
     }
 
     systable_endscan(scandesc);
+    } while (stale);
 
     table_close(relation, AccessShareLock);
 
@@ -1393,9 +1411,11 @@ SearchCatCacheMiss(CatCache *cache,
     if (IsBootstrapProcessingMode())
         return NULL;
 
-    ct = CatalogCacheCreateEntry(cache, NULL, arguments,
-                                 hashValue, hashIndex,
-                                 true);
+    ct = CatalogCacheCreateEntry(cache, NULL, NULL, arguments,
+                                 hashValue, hashIndex);
+
+    /* Creating a negative cache entry shouldn't fail */
+    Assert(ct != NULL);
 
     CACHE_elog(DEBUG2, "SearchCatCache(%s): Contains %d/%d tuples",
                cache->cc_relname, cache->cc_ntup, CacheHdr->ch_ntup);
@@ -1602,7 +1622,8 @@ SearchCatCacheList(CatCache *cache,
      * We have to bump the member refcounts temporarily to ensure they won't
      * get dropped from the cache while loading other members. We use a PG_TRY
      * block to ensure we can undo those refcounts if we get an error before
-     * we finish constructing the CatCList.
+     * we finish constructing the CatCList.  ctlist must be valid throughout
+     * the PG_TRY block.
      */
     ResourceOwnerEnlargeCatCacheListRefs(CurrentResourceOwner);
 
@@ -1613,19 +1634,23 @@ SearchCatCacheList(CatCache *cache,
         ScanKeyData cur_skey[CATCACHE_MAXKEYS];
         Relation    relation;
         SysScanDesc scandesc;
-
-        /*
-         * Ok, need to make a lookup in the relation, copy the scankey and
-         * fill out any per-call fields.
-         */
-        memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * cache->cc_nkeys);
-        cur_skey[0].sk_argument = v1;
-        cur_skey[1].sk_argument = v2;
-        cur_skey[2].sk_argument = v3;
-        cur_skey[3].sk_argument = v4;
+        bool        stale;
 
         relation = table_open(cache->cc_reloid, AccessShareLock);
 
+        do
+        {
+            /*
+             * Ok, need to make a lookup in the relation, copy the scankey and
+             * fill out any per-call fields.  (We must re-do this when
+             * retrying, because systable_beginscan scribbles on the scankey.)
+             */
+            memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * cache->cc_nkeys);
+            cur_skey[0].sk_argument = v1;
+            cur_skey[1].sk_argument = v2;
+            cur_skey[2].sk_argument = v3;
+            cur_skey[3].sk_argument = v4;
+
         scandesc = systable_beginscan(relation,
                                       cache->cc_indexoid,
                                       IndexScanOK(cache, cur_skey),
@@ -1636,6 +1661,8 @@ SearchCatCacheList(CatCache *cache,
         /* The list will be ordered iff we are doing an index scan */
         ordered = (scandesc->irel != NULL);
 
+        stale = false;
+
         while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
         {
             uint32      hashValue;
@@ -1678,9 +1705,32 @@ SearchCatCacheList(CatCache *cache,
             if (!found)
             {
                 /* We didn't find a usable entry, so make a new one */
-                ct = CatalogCacheCreateEntry(cache, ntp, arguments,
-                                             hashValue, hashIndex,
-                                             false);
+                ct = CatalogCacheCreateEntry(cache, ntp, scandesc, NULL,
+                                             hashValue, hashIndex);
+                /* upon failure, we must start the scan over */
+                if (ct == NULL)
+                {
+                    /*
+                     * Release refcounts on any items we already had.  We dare
+                     * not try to free them if they're now unreferenced, since
+                     * an error while doing that would result in the PG_CATCH
+                     * below doing extra refcount decrements.  Besides, we'll
+                     * likely re-adopt those items in the next iteration, so
+                     * it's not worth complicating matters to try to get rid
+                     * of them.
+                     */
+                    foreach(ctlist_item, ctlist)
+                    {
+                        ct = (CatCTup *) lfirst(ctlist_item);
+                        Assert(ct->c_list == NULL);
+                        Assert(ct->refcount > 0);
+                        ct->refcount--;
+                    }
+                    /* Reset ctlist in preparation for new try */
+                    ctlist = NIL;
+                    stale = true;
+                    break;
+                }
             }
 
             /* Careful here: add entry to ctlist, then bump its refcount */
@@ -1690,6 +1740,7 @@ SearchCatCacheList(CatCache *cache,
         }
 
         systable_endscan(scandesc);
+        } while (stale);
 
         table_close(relation, AccessShareLock);
 
@@ -1796,22 +1847,42 @@ ReleaseCatCacheList(CatCList *list)
  * CatalogCacheCreateEntry
  *      Create a new CatCTup entry, copying the given HeapTuple and other
  *      supplied data into it.  The new entry initially has refcount 0.
+ *
+ * To create a normal cache entry, ntp must be the HeapTuple just fetched
+ * from scandesc, and "arguments" is not used.  To create a negative cache
+ * entry, pass NULL for ntp and scandesc; then "arguments" is the cache
+ * keys to use.  In either case, hashValue/hashIndex are the hash values
+ * computed from the cache keys.
+ *
+ * Returns NULL if we attempt to detoast the tuple and observe that it
+ * became stale.  (This cannot happen for a negative entry.)  Caller must
+ * retry the tuple lookup in that case.
  */
 static CatCTup *
-CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
-                        uint32 hashValue, Index hashIndex,
-                        bool negative)
+CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, SysScanDesc scandesc,
+                        Datum *arguments,
+                        uint32 hashValue, Index hashIndex)
 {
     CatCTup    *ct;
     HeapTuple   dtp;
     MemoryContext oldcxt;
 
-    /* negative entries have no tuple associated */
     if (ntp)
     {
         int         i;
 
-        Assert(!negative);
+        /*
+         * The visibility recheck below essentially never fails during our
+         * regression tests, and there's no easy way to force it to fail for
+         * testing purposes.  To ensure we have test coverage for the retry
+         * paths in our callers, make debug builds randomly fail about 0.1% of
+         * the times through this code path, even when there's no toasted
+         * fields.
+         */
+#ifdef USE_ASSERT_CHECKING
+        if (pg_prng_uint32(&pg_global_prng_state) <= (PG_UINT32_MAX / 1000))
+            return NULL;
+#endif
 
         /*
         * If there are any out-of-line toasted fields in the tuple, expand
@@ -1821,7 +1892,20 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
         * something using a slightly stale catcache entry.
         */
         if (HeapTupleHasExternal(ntp))
+        {
             dtp = toast_flatten_tuple(ntp, cache->cc_tupdesc);
+
+            /*
+             * The tuple could become stale while we are doing toast table
+             * access (since AcceptInvalidationMessages can run then), so we
+             * must recheck its visibility afterwards.
+             */
+            if (!systable_recheck_tuple(scandesc, ntp))
+            {
+                heap_freetuple(dtp);
+                return NULL;
+            }
+        }
         else
             dtp = ntp;
 
@@ -1860,7 +1944,7 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
     }
     else
     {
-        Assert(negative);
+        /* Set up keys for a negative cache entry */
         oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
         ct = (CatCTup *) palloc(sizeof(CatCTup));
 
@@ -1882,7 +1966,7 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
     ct->c_list = NULL;
     ct->refcount = 0;           /* for the moment */
     ct->dead = false;
-    ct->negative = negative;
+    ct->negative = (ntp == NULL);
     ct->hash_value = hashValue;
 
     dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
