Commit ad98fb1
Cope with catcache entries becoming stale during detoasting.
We've long had a policy that any toasted fields in a catalog tuple should be pulled in-line before entering the tuple in a catalog cache. However, that requires access to the catalog's toast table, and we'll typically do AcceptInvalidationMessages while opening the toast table. So it's possible that the catalog tuple is outdated by the time we finish detoasting it. Since no cache entry exists yet, we can't mark the entry stale during AcceptInvalidationMessages, and instead we'll press forward and build an apparently-valid cache entry. The upshot is that we have a race condition whereby an out-of-date entry could be made in a backend's catalog cache, and persist there indefinitely causing indeterminate misbehavior.

To fix, use the existing systable_recheck_tuple code to recheck whether the catalog tuple is still up-to-date after we finish detoasting it. If not, loop around and restart the process of searching the catalog and constructing cache entries from the top. The case is rare enough that this shouldn't create any meaningful performance penalty, even in the SearchCatCacheList case where we need to tear down and reconstruct the whole list.

Indeed, the case is so rare that AFAICT it doesn't occur during our regression tests, and there doesn't seem to be any easy way to build a test that would exercise it reliably. To allow testing of the retry code paths, add logic (in USE_ASSERT_CHECKING builds only) that randomly pretends that the recheck failed about one time out of a thousand. This is enough to ensure that we'll pass through the retry paths during most regression test runs.

By adding an extra level of looping, this commit creates a need to reindent most of SearchCatCacheMiss and SearchCatCacheList. I'll do that separately, to allow putting those changes in .git-blame-ignore-revs.

Patch by me; thanks to Alexander Lakhin for having built a test case to prove the bug is real, and to Xiaoran Wang for review. Back-patch to all supported branches.

Discussion: https://postgr.es/m/1393953.1698353013@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAGjhLkOoBEC9mLsnB42d3CO1vcMx71MLSEuigeABbQ8oRdA6gw@mail.gmail.com
1 parent 4f62250 · commit ad98fb1

File tree: 1 file changed (+124, -40 lines)

src/backend/utils/cache/catcache.c

Lines changed: 124 additions & 40 deletions

--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator.h"
 #include "catalog/pg_type.h"
 #include "common/hashfn.h"
+#include "common/pg_prng.h"
 #include "miscadmin.h"
 #include "port/pg_bitutils.h"
 #ifdef CATCACHE_STATS
@@ -90,10 +91,10 @@ static void CatCachePrintStats(int code, Datum arg);
 static void CatCacheRemoveCTup(CatCache *cache, CatCTup *ct);
 static void CatCacheRemoveCList(CatCache *cache, CatCList *cl);
 static void CatalogCacheInitializeCache(CatCache *cache);
-static CatCTup *CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp,
+static CatCTup *CatalogCacheCreateEntry(CatCache *cache,
+                                        HeapTuple ntp, SysScanDesc scandesc,
                                         Datum *arguments,
-                                        uint32 hashValue, Index hashIndex,
-                                        bool negative);
+                                        uint32 hashValue, Index hashIndex);
 
 static void ReleaseCatCacheWithOwner(HeapTuple tuple, ResourceOwner resowner);
 static void ReleaseCatCacheListWithOwner(CatCList *list, ResourceOwner resowner);
@@ -1372,6 +1373,7 @@ SearchCatCacheMiss(CatCache *cache,
     SysScanDesc scandesc;
     HeapTuple   ntp;
     CatCTup    *ct;
+    bool        stale;
     Datum       arguments[CATCACHE_MAXKEYS];
 
     /* Initialize local parameter array */
@@ -1380,16 +1382,6 @@ SearchCatCacheMiss(CatCache *cache,
     arguments[2] = v3;
     arguments[3] = v4;
 
-    /*
-     * Ok, need to make a lookup in the relation, copy the scankey and fill
-     * out any per-call fields.
-     */
-    memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * nkeys);
-    cur_skey[0].sk_argument = v1;
-    cur_skey[1].sk_argument = v2;
-    cur_skey[2].sk_argument = v3;
-    cur_skey[3].sk_argument = v4;
-
     /*
      * Tuple was not found in cache, so we have to try to retrieve it directly
      * from the relation.  If found, we will add it to the cache; if not
@@ -1404,9 +1396,28 @@ SearchCatCacheMiss(CatCache *cache,
      * will eventually age out of the cache, so there's no functional problem.
      * This case is rare enough that it's not worth expending extra cycles to
      * detect.
+     *
+     * Another case, which we *must* handle, is that the tuple could become
+     * outdated during CatalogCacheCreateEntry's attempt to detoast it (since
+     * AcceptInvalidationMessages can run during TOAST table access).  We do
+     * not want to return already-stale catcache entries, so we loop around
+     * and do the table scan again if that happens.
      */
     relation = table_open(cache->cc_reloid, AccessShareLock);
 
+    do
+    {
+        /*
+         * Ok, need to make a lookup in the relation, copy the scankey and
+         * fill out any per-call fields.  (We must re-do this when retrying,
+         * because systable_beginscan scribbles on the scankey.)
+         */
+        memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * nkeys);
+        cur_skey[0].sk_argument = v1;
+        cur_skey[1].sk_argument = v2;
+        cur_skey[2].sk_argument = v3;
+        cur_skey[3].sk_argument = v4;
+
     scandesc = systable_beginscan(relation,
                                   cache->cc_indexoid,
                                   IndexScanOK(cache, cur_skey),
@@ -1415,12 +1426,18 @@ SearchCatCacheMiss(CatCache *cache,
                                   cur_skey);
 
     ct = NULL;
+    stale = false;
 
     while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
     {
-        ct = CatalogCacheCreateEntry(cache, ntp, arguments,
-                                     hashValue, hashIndex,
-                                     false);
+        ct = CatalogCacheCreateEntry(cache, ntp, scandesc, NULL,
+                                     hashValue, hashIndex);
+        /* upon failure, we must start the scan over */
+        if (ct == NULL)
+        {
+            stale = true;
+            break;
+        }
         /* immediately set the refcount to 1 */
         ResourceOwnerEnlarge(CurrentResourceOwner);
         ct->refcount++;
@@ -1429,6 +1446,7 @@ SearchCatCacheMiss(CatCache *cache,
     }
 
     systable_endscan(scandesc);
+    } while (stale);
 
     table_close(relation, AccessShareLock);
 
@@ -1447,9 +1465,11 @@ SearchCatCacheMiss(CatCache *cache,
     if (IsBootstrapProcessingMode())
         return NULL;
 
-    ct = CatalogCacheCreateEntry(cache, NULL, arguments,
-                                 hashValue, hashIndex,
-                                 true);
+    ct = CatalogCacheCreateEntry(cache, NULL, NULL, arguments,
+                                 hashValue, hashIndex);
+
+    /* Creating a negative cache entry shouldn't fail */
+    Assert(ct != NULL);
 
     CACHE_elog(DEBUG2, "SearchCatCache(%s): Contains %d/%d tuples",
                cache->cc_relname, cache->cc_ntup, CacheHdr->ch_ntup);
@@ -1663,7 +1683,8 @@ SearchCatCacheList(CatCache *cache,
      * We have to bump the member refcounts temporarily to ensure they won't
      * get dropped from the cache while loading other members. We use a PG_TRY
      * block to ensure we can undo those refcounts if we get an error before
-     * we finish constructing the CatCList.
+     * we finish constructing the CatCList.  ctlist must be valid throughout
+     * the PG_TRY block.
      */
     ctlist = NIL;
 
@@ -1672,19 +1693,23 @@ SearchCatCacheList(CatCache *cache,
         ScanKeyData cur_skey[CATCACHE_MAXKEYS];
         Relation    relation;
         SysScanDesc scandesc;
-
-        /*
-         * Ok, need to make a lookup in the relation, copy the scankey and
-         * fill out any per-call fields.
-         */
-        memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * cache->cc_nkeys);
-        cur_skey[0].sk_argument = v1;
-        cur_skey[1].sk_argument = v2;
-        cur_skey[2].sk_argument = v3;
-        cur_skey[3].sk_argument = v4;
+        bool        stale;
 
         relation = table_open(cache->cc_reloid, AccessShareLock);
 
+        do
+        {
+            /*
+             * Ok, need to make a lookup in the relation, copy the scankey and
+             * fill out any per-call fields.  (We must re-do this when
+             * retrying, because systable_beginscan scribbles on the scankey.)
+             */
+            memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * cache->cc_nkeys);
+            cur_skey[0].sk_argument = v1;
+            cur_skey[1].sk_argument = v2;
+            cur_skey[2].sk_argument = v3;
+            cur_skey[3].sk_argument = v4;
+
         scandesc = systable_beginscan(relation,
                                       cache->cc_indexoid,
                                       IndexScanOK(cache, cur_skey),
@@ -1695,6 +1720,8 @@ SearchCatCacheList(CatCache *cache,
         /* The list will be ordered iff we are doing an index scan */
         ordered = (scandesc->irel != NULL);
 
+        stale = false;
+
         while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
         {
             uint32      hashValue;
@@ -1737,9 +1764,32 @@ SearchCatCacheList(CatCache *cache,
             if (!found)
             {
                 /* We didn't find a usable entry, so make a new one */
-                ct = CatalogCacheCreateEntry(cache, ntp, arguments,
-                                             hashValue, hashIndex,
-                                             false);
+                ct = CatalogCacheCreateEntry(cache, ntp, scandesc, NULL,
+                                             hashValue, hashIndex);
+                /* upon failure, we must start the scan over */
+                if (ct == NULL)
+                {
+                    /*
+                     * Release refcounts on any items we already had.  We dare
+                     * not try to free them if they're now unreferenced, since
+                     * an error while doing that would result in the PG_CATCH
+                     * below doing extra refcount decrements.  Besides, we'll
+                     * likely re-adopt those items in the next iteration, so
+                     * it's not worth complicating matters to try to get rid
+                     * of them.
+                     */
+                    foreach(ctlist_item, ctlist)
+                    {
+                        ct = (CatCTup *) lfirst(ctlist_item);
+                        Assert(ct->c_list == NULL);
+                        Assert(ct->refcount > 0);
+                        ct->refcount--;
+                    }
+                    /* Reset ctlist in preparation for new try */
+                    ctlist = NIL;
+                    stale = true;
+                    break;
+                }
             }
 
             /* Careful here: add entry to ctlist, then bump its refcount */
@@ -1749,6 +1799,7 @@ SearchCatCacheList(CatCache *cache,
         }
 
         systable_endscan(scandesc);
+        } while (stale);
 
         table_close(relation, AccessShareLock);
 
@@ -1865,22 +1916,42 @@ ReleaseCatCacheListWithOwner(CatCList *list, ResourceOwner resowner)
  * CatalogCacheCreateEntry
  *      Create a new CatCTup entry, copying the given HeapTuple and other
  *      supplied data into it.  The new entry initially has refcount 0.
+ *
+ * To create a normal cache entry, ntp must be the HeapTuple just fetched
+ * from scandesc, and "arguments" is not used.  To create a negative cache
+ * entry, pass NULL for ntp and scandesc; then "arguments" is the cache
+ * keys to use.  In either case, hashValue/hashIndex are the hash values
+ * computed from the cache keys.
+ *
+ * Returns NULL if we attempt to detoast the tuple and observe that it
+ * became stale.  (This cannot happen for a negative entry.)  Caller must
+ * retry the tuple lookup in that case.
  */
 static CatCTup *
-CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
-                        uint32 hashValue, Index hashIndex,
-                        bool negative)
+CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, SysScanDesc scandesc,
+                        Datum *arguments,
+                        uint32 hashValue, Index hashIndex)
 {
     CatCTup    *ct;
     HeapTuple   dtp;
     MemoryContext oldcxt;
 
-    /* negative entries have no tuple associated */
     if (ntp)
     {
         int         i;
 
-        Assert(!negative);
+        /*
+         * The visibility recheck below essentially never fails during our
+         * regression tests, and there's no easy way to force it to fail for
+         * testing purposes.  To ensure we have test coverage for the retry
+         * paths in our callers, make debug builds randomly fail about 0.1% of
+         * the times through this code path, even when there's no toasted
+         * fields.
+         */
+#ifdef USE_ASSERT_CHECKING
+        if (pg_prng_uint32(&pg_global_prng_state) <= (PG_UINT32_MAX / 1000))
+            return NULL;
+#endif
 
         /*
          * If there are any out-of-line toasted fields in the tuple, expand
@@ -1890,7 +1961,20 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
          * something using a slightly stale catcache entry.
          */
         if (HeapTupleHasExternal(ntp))
+        {
             dtp = toast_flatten_tuple(ntp, cache->cc_tupdesc);
+
+            /*
+             * The tuple could become stale while we are doing toast table
+             * access (since AcceptInvalidationMessages can run then), so we
+             * must recheck its visibility afterwards.
+             */
+            if (!systable_recheck_tuple(scandesc, ntp))
+            {
+                heap_freetuple(dtp);
+                return NULL;
+            }
+        }
         else
             dtp = ntp;
 
@@ -1929,7 +2013,7 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
     }
     else
     {
-        Assert(negative);
+        /* Set up keys for a negative cache entry */
         oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
         ct = (CatCTup *) palloc(sizeof(CatCTup));
 
@@ -1951,7 +2035,7 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
     ct->c_list = NULL;
     ct->refcount = 0;           /* for the moment */
     ct->dead = false;
-    ct->negative = negative;
+    ct->negative = (ntp == NULL);
     ct->hash_value = hashValue;
 
     dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);

