Commit e23bae8

Fix up run-time partition pruning's use of relcache's partition data.

The previous coding saved pointers into the partitioned table's relcache entry, but then closed the relcache entry, causing those pointers to nominally become dangling. Actual trouble would be seen in the field only if a relcache flush occurred mid-query, but that's hardly out of the question.

While we could fix this by copying all the data in question at query start, it seems better to just hold the relcache entry open for the whole query.

While at it, improve the handling of support-function lookups: do that once per query not once per pruning test. There's still something to be desired here, in that we fail to exploit the possibility of caching data across queries in the fn_extra fields of the relcache's FmgrInfo structs, which could happen if we just used those structs in-place rather than copying them. However, combining that with the possibility of per-query lookups of cross-type comparison functions seems to require changes in the APIs of a lot of the pruning support functions, so it's too invasive to consider as part of this patch. A win would ensue only for complex partition key data types (e.g. arrays), so it may not be worth the trouble.

David Rowley and Tom Lane

Discussion: https://postgr.es/m/17850.1528755844@sss.pgh.pa.us

1 parent e146e4d, commit e23bae8

6 files changed, +132 -73 lines changed


src/backend/executor/execPartition.c

Lines changed: 45 additions & 18 deletions
@@ -1357,11 +1357,14 @@ adjust_partition_tlist(List *tlist, TupleConversionMap *map)
  *
  * Functions:
  *
- * ExecSetupPartitionPruneState:
+ * ExecCreatePartitionPruneState:
  *      Creates the PartitionPruneState required by each of the two pruning
  *      functions.  Details stored include how to map the partition index
  *      returned by the partition pruning code into subplan indexes.
  *
+ * ExecDestroyPartitionPruneState:
+ *      Deletes a PartitionPruneState. Must be called during executor shutdown.
+ *
  * ExecFindInitialMatchingSubPlans:
  *      Returns indexes of matching subplans.  Partition pruning is attempted
  *      without any evaluation of expressions containing PARAM_EXEC Params.
@@ -1382,8 +1385,8 @@ adjust_partition_tlist(List *tlist, TupleConversionMap *map)
  */

 /*
- * ExecSetupPartitionPruneState
- *      Set up the data structure required for calling
+ * ExecCreatePartitionPruneState
+ *      Build the data structure required for calling
  *      ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
  *
  * 'planstate' is the parent plan node's execution state.
@@ -1395,7 +1398,7 @@ adjust_partition_tlist(List *tlist, TupleConversionMap *map)
  * in each PartitionPruneInfo.
  */
 PartitionPruneState *
-ExecSetupPartitionPruneState(PlanState *planstate, List *partitionpruneinfo)
+ExecCreatePartitionPruneState(PlanState *planstate, List *partitionpruneinfo)
 {
     PartitionPruneState *prunestate;
     PartitionPruningData *prunedata;
@@ -1435,11 +1438,10 @@ ExecSetupPartitionPruneState(PlanState *planstate, List *partitionpruneinfo)
         PartitionPruningData *pprune = &prunedata[i];
         PartitionPruneContext *context = &pprune->context;
         PartitionDesc partdesc;
-        Relation rel;
         PartitionKey partkey;
-        ListCell *lc2;
         int partnatts;
         int n_steps;
+        ListCell *lc2;

         /*
          * We must copy the subplan_map rather than pointing directly to the
@@ -1456,26 +1458,33 @@ ExecSetupPartitionPruneState(PlanState *planstate, List *partitionpruneinfo)
         pprune->present_parts = bms_copy(pinfo->present_parts);

         /*
-         * Grab some info from the table's relcache; lock was already obtained
-         * by ExecLockNonLeafAppendTables.
+         * We need to hold a pin on the partitioned table's relcache entry so
+         * that we can rely on its copies of the table's partition key and
+         * partition descriptor.  We need not get a lock though; one should
+         * have been acquired already by InitPlan or
+         * ExecLockNonLeafAppendTables.
          */
-        rel = relation_open(pinfo->reloid, NoLock);
+        context->partrel = relation_open(pinfo->reloid, NoLock);

-        partkey = RelationGetPartitionKey(rel);
-        partdesc = RelationGetPartitionDesc(rel);
+        partkey = RelationGetPartitionKey(context->partrel);
+        partdesc = RelationGetPartitionDesc(context->partrel);
+        n_steps = list_length(pinfo->pruning_steps);

         context->strategy = partkey->strategy;
         context->partnatts = partnatts = partkey->partnatts;
-        context->partopfamily = partkey->partopfamily;
-        context->partopcintype = partkey->partopcintype;
+        context->nparts = pinfo->nparts;
+        context->boundinfo = partdesc->boundinfo;
         context->partcollation = partkey->partcollation;
         context->partsupfunc = partkey->partsupfunc;
-        context->nparts = pinfo->nparts;
-        context->boundinfo = partition_bounds_copy(partdesc->boundinfo, partkey);
+
+        /* We'll look up type-specific support functions as needed */
+        context->stepcmpfuncs = (FmgrInfo *)
+            palloc0(sizeof(FmgrInfo) * n_steps * partnatts);
+
+        context->ppccontext = CurrentMemoryContext;
         context->planstate = planstate;

         /* Initialize expression state for each expression we need */
-        n_steps = list_length(pinfo->pruning_steps);
         context->exprstates = (ExprState **)
             palloc0(sizeof(ExprState *) * n_steps * partnatts);
         foreach(lc2, pinfo->pruning_steps)
@@ -1527,14 +1536,32 @@ ExecSetupPartitionPruneState(PlanState *planstate, List *partitionpruneinfo)
         prunestate->execparamids = bms_add_members(prunestate->execparamids,
                                                    pinfo->execparamids);

-        relation_close(rel, NoLock);
-
         i++;
     }

     return prunestate;
 }

+/*
+ * ExecDestroyPartitionPruneState
+ *      Release resources at plan shutdown.
+ *
+ * We don't bother to free any memory here, since the whole executor context
+ * will be going away shortly.  We do need to release our relcache pins.
+ */
+void
+ExecDestroyPartitionPruneState(PartitionPruneState *prunestate)
+{
+    int i;
+
+    for (i = 0; i < prunestate->num_partprunedata; i++)
+    {
+        PartitionPruningData *pprune = &prunestate->partprunedata[i];
+
+        relation_close(pprune->context.partrel, NoLock);
+    }
+}
+
 /*
  * ExecFindInitialMatchingSubPlans
  *      Identify the set of subplans that cannot be eliminated by initial
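
The execPartition.c change keeps the relcache entry pinned from ExecCreatePartitionPruneState until ExecDestroyPartitionPruneState, instead of opening and closing it inside the setup function. The following is a minimal, self-contained sketch of that lifetime rule using a toy reference-counted cache entry. Every name here (ToyRelEntry, toy_open, toy_close, toy_flush, ToyPruneContext) is hypothetical and is not PostgreSQL API. The toy flush simply leaves pinned entries alone, which is a simplification of what the real relcache does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a relcache entry; not a PostgreSQL type. */
typedef struct ToyRelEntry
{
    int refcount;   /* number of outstanding pins */
    int valid;      /* set to 0 once the entry has been flushed */
    int partnatts;  /* data that callers keep pointers to */
} ToyRelEntry;

/* Pin the entry (toy analogue of opening a relation with no lock). */
static ToyRelEntry *
toy_open(ToyRelEntry *entry)
{
    entry->refcount++;
    return entry;
}

/* Drop one pin. */
static void
toy_close(ToyRelEntry *entry)
{
    assert(entry->refcount > 0);
    entry->refcount--;
}

/* Toy cache flush: only unpinned entries are invalidated. Returns 1 if flushed. */
static int
toy_flush(ToyRelEntry *entry)
{
    if (entry->refcount == 0)
    {
        entry->valid = 0;
        return 1;
    }
    return 0;
}

/* Hypothetical pruning context that holds the pin for the whole query. */
typedef struct ToyPruneContext
{
    ToyRelEntry *partrel;
} ToyPruneContext;

/* Take the pin at setup and keep it; pointers into the entry stay usable. */
static void
toy_create_prune_context(ToyPruneContext *ctx, ToyRelEntry *entry)
{
    ctx->partrel = toy_open(entry);
}

/* Release the pin at shutdown, mirroring the destroy function in the diff. */
static void
toy_destroy_prune_context(ToyPruneContext *ctx)
{
    toy_close(ctx->partrel);
    ctx->partrel = NULL;
}
```

In this sketch a flush that arrives between create and destroy cannot invalidate the entry, which is the property the commit message is after. A flush after destroy can.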

src/backend/executor/nodeAppend.c

Lines changed: 10 additions & 4 deletions
@@ -136,8 +136,10 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
         /* We may need an expression context to evaluate partition exprs */
         ExecAssignExprContext(estate, &appendstate->ps);

-        prunestate = ExecSetupPartitionPruneState(&appendstate->ps,
-                                                  node->part_prune_infos);
+        /* Create the working data structure for pruning. */
+        prunestate = ExecCreatePartitionPruneState(&appendstate->ps,
+                                                   node->part_prune_infos);
+        appendstate->as_prune_state = prunestate;

         /* Perform an initial partition prune, if required. */
         if (prunestate->do_initial_prune)
@@ -178,8 +180,6 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
          */
         if (!prunestate->do_exec_prune)
             appendstate->as_valid_subplans = bms_add_range(NULL, 0, nplans - 1);
-
-        appendstate->as_prune_state = prunestate;
     }
     else
     {
@@ -330,6 +330,12 @@ ExecEndAppend(AppendState *node)
      */
     for (i = 0; i < nplans; i++)
         ExecEndNode(appendplans[i]);
+
+    /*
+     * release any resources associated with run-time pruning
+     */
+    if (node->as_prune_state)
+        ExecDestroyPartitionPruneState(node->as_prune_state);
 }

 void

src/backend/partitioning/partprune.c

Lines changed: 34 additions & 17 deletions
@@ -436,16 +436,20 @@ prune_append_rel_partitions(RelOptInfo *rel)
     if (contradictory)
         return NULL;

+    /* Set up PartitionPruneContext */
     context.strategy = rel->part_scheme->strategy;
     context.partnatts = rel->part_scheme->partnatts;
-    context.partopfamily = rel->part_scheme->partopfamily;
-    context.partopcintype = rel->part_scheme->partopcintype;
-    context.partcollation = rel->part_scheme->partcollation;
-    context.partsupfunc = rel->part_scheme->partsupfunc;
     context.nparts = rel->nparts;
     context.boundinfo = rel->boundinfo;
+    context.partcollation = rel->part_scheme->partcollation;
+    context.partsupfunc = rel->part_scheme->partsupfunc;
+    context.stepcmpfuncs = (FmgrInfo *) palloc0(sizeof(FmgrInfo) *
+                                                context.partnatts *
+                                                list_length(pruning_steps));
+    context.ppccontext = CurrentMemoryContext;

     /* These are not valid when being called from the planner */
+    context.partrel = NULL;
     context.planstate = NULL;
     context.exprstates = NULL;
     context.exprhasexecparam = NULL;
@@ -2809,7 +2813,8 @@ perform_pruning_base_step(PartitionPruneContext *context,
     int keyno,
         nvalues;
     Datum values[PARTITION_MAX_KEYS];
-    FmgrInfo partsupfunc[PARTITION_MAX_KEYS];
+    FmgrInfo *partsupfunc;
+    int stateidx;

     /*
      * There better be the same number of expressions and compare functions.
@@ -2844,7 +2849,6 @@ perform_pruning_base_step(PartitionPruneContext *context,
         if (lc1 != NULL)
         {
             Expr *expr;
-            int stateidx;
             Datum datum;
             bool isnull;

@@ -2873,19 +2877,25 @@ perform_pruning_base_step(PartitionPruneContext *context,
                 return result;
             }

-            /*
-             * If we're going to need a different comparison function than
-             * the one cached in the PartitionKey, we'll need to look up
-             * the FmgrInfo.
-             */
+            /* Set up the stepcmpfuncs entry, unless we already did */
             cmpfn = lfirst_oid(lc2);
             Assert(OidIsValid(cmpfn));
-            if (cmpfn != context->partsupfunc[keyno].fn_oid)
-                fmgr_info(cmpfn, &partsupfunc[keyno]);
-            else
-                fmgr_info_copy(&partsupfunc[keyno],
-                               &context->partsupfunc[keyno],
-                               CurrentMemoryContext);
+            if (cmpfn != context->stepcmpfuncs[stateidx].fn_oid)
+            {
+                /*
+                 * If the needed support function is the same one cached
+                 * in the relation's partition key, copy the cached
+                 * FmgrInfo.  Otherwise (i.e., when we have a cross-type
+                 * comparison), an actual lookup is required.
+                 */
+                if (cmpfn == context->partsupfunc[keyno].fn_oid)
+                    fmgr_info_copy(&context->stepcmpfuncs[stateidx],
+                                   &context->partsupfunc[keyno],
+                                   context->ppccontext);
+                else
+                    fmgr_info_cxt(cmpfn, &context->stepcmpfuncs[stateidx],
+                                  context->ppccontext);
+            }

             values[keyno] = datum;
             nvalues++;
@@ -2896,6 +2906,13 @@ perform_pruning_base_step(PartitionPruneContext *context,
         }
     }

+    /*
+     * Point partsupfunc to the entry for the 0th key of this step; the
+     * additional support functions, if any, follow consecutively.
+     */
+    stateidx = PruneCxtStateIdx(context->partnatts, opstep->step.step_id, 0);
+    partsupfunc = &context->stepcmpfuncs[stateidx];
+
     switch (context->strategy)
     {
         case PARTITION_STRATEGY_HASH:
src/backend/utils/cache/relcache.c

Lines changed: 3 additions & 3 deletions
@@ -2471,6 +2471,7 @@ RelationClearRelation(Relation relation, bool rebuild)
         keep_tupdesc = equalTupleDescs(relation->rd_att, newrel->rd_att);
         keep_rules = equalRuleLocks(relation->rd_rules, newrel->rd_rules);
         keep_policies = equalRSDesc(relation->rd_rsdesc, newrel->rd_rsdesc);
+        /* partkey is immutable once set up, so we can always keep it */
         keep_partkey = (relation->rd_partkey != NULL);
         keep_partdesc = equalPartitionDescs(relation->rd_partkey,
                                             relation->rd_partdesc,
@@ -2515,7 +2516,7 @@ RelationClearRelation(Relation relation, bool rebuild)
         SWAPFIELD(Form_pg_class, rd_rel);
         /* ... but actually, we don't have to update newrel->rd_rel */
         memcpy(relation->rd_rel, newrel->rd_rel, CLASS_TUPLE_SIZE);
-        /* preserve old tupledesc and rules if no logical change */
+        /* preserve old tupledesc, rules, policies if no logical change */
         if (keep_tupdesc)
             SWAPFIELD(TupleDesc, rd_att);
         if (keep_rules)
@@ -2529,13 +2530,12 @@ RelationClearRelation(Relation relation, bool rebuild)
         SWAPFIELD(Oid, rd_toastoid);
         /* pgstat_info must be preserved */
         SWAPFIELD(struct PgStat_TableStatus *, pgstat_info);
-        /* partition key must be preserved, if we have one */
+        /* preserve old partitioning info if no logical change */
         if (keep_partkey)
         {
             SWAPFIELD(PartitionKey, rd_partkey);
             SWAPFIELD(MemoryContext, rd_partkeycxt);
         }
-        /* preserve old partdesc if no logical change */
         if (keep_partdesc)
         {
             SWAPFIELD(PartitionDesc, rd_partdesc);
src/include/executor/execPartition.h

Lines changed: 3 additions & 2 deletions
@@ -208,8 +208,9 @@ extern HeapTuple ConvertPartitionTupleSlot(TupleConversionMap *map,
                           TupleTableSlot **p_my_slot);
 extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
                         PartitionTupleRouting *proute);
-extern PartitionPruneState *ExecSetupPartitionPruneState(PlanState *planstate,
-                             List *partitionpruneinfo);
+extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
+                              List *partitionpruneinfo);
+extern void ExecDestroyPartitionPruneState(PartitionPruneState *prunestate);
 extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
 extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
                                 int nsubplans);
src/include/partitioning/partprune.h

Lines changed: 37 additions & 29 deletions
@@ -20,49 +20,57 @@

 /*
  * PartitionPruneContext
+ *      Stores information needed at runtime for pruning computations
+ *      related to a single partitioned table.
  *
- * Information about a partitioned table needed to perform partition pruning.
+ * partrel          Relcache pointer for the partitioned table,
+ *                  if we have it open (else NULL).
+ * strategy         Partition strategy, e.g. LIST, RANGE, HASH.
+ * partnatts        Number of columns in the partition key.
+ * nparts           Number of partitions in this partitioned table.
+ * boundinfo        Partition boundary info for the partitioned table.
+ * partcollation    Array of partnatts elements, storing the collations of the
+ *                  partition key columns.
+ * partsupfunc      Array of FmgrInfos for the comparison or hashing functions
+ *                  associated with the partition keys (partnatts elements).
+ *                  (This points into the partrel's partition key, typically.)
+ * stepcmpfuncs     Array of FmgrInfos for the comparison or hashing function
+ *                  for each pruning step and partition key.
+ * ppccontext       Memory context holding this PartitionPruneContext's
+ *                  subsidiary data, such as the FmgrInfos.
+ * planstate        Points to the parent plan node's PlanState when called
+ *                  during execution; NULL when called from the planner.
+ * exprstates       Array of ExprStates, indexed as per PruneCtxStateIdx; one
+ *                  for each partition key in each pruning step.  Allocated if
+ *                  planstate is non-NULL, otherwise NULL.
+ * exprhasexecparam Array of bools, each true if corresponding 'exprstate'
+ *                  expression contains any PARAM_EXEC Params.  (Can be NULL
+ *                  if planstate is NULL.)
+ * evalexecparams   True if it's safe to evaluate PARAM_EXEC Params.
  */
 typedef struct PartitionPruneContext
 {
-    /* Partition key information */
+    Relation partrel;
     char strategy;
     int partnatts;
-    Oid *partopfamily;
-    Oid *partopcintype;
-    Oid *partcollation;
-    FmgrInfo *partsupfunc;
-
-    /* Number of partitions */
     int nparts;
-
-    /* Partition boundary info */
     PartitionBoundInfo boundinfo;
-
-    /*
-     * This will be set when the context is used from the executor, to allow
-     * Params to be evaluated.
-     */
+    Oid *partcollation;
+    FmgrInfo *partsupfunc;
+    FmgrInfo *stepcmpfuncs;
+    MemoryContext ppccontext;
     PlanState *planstate;
-
-    /*
-     * Array of ExprStates, indexed as per PruneCtxStateIdx; one for each
-     * partkey in each pruning step.  Allocated if planstate is non-NULL,
-     * otherwise NULL.
-     */
     ExprState **exprstates;
-
-    /*
-     * Similar array of flags, each true if corresponding 'exprstate'
-     * expression contains any PARAM_EXEC Params.  (Can be NULL if planstate
-     * is NULL.)
-     */
     bool *exprhasexecparam;
-
-    /* true if it's safe to evaluate PARAM_EXEC Params */
     bool evalexecparams;
 } PartitionPruneContext;

+/*
+ * PruneCxtStateIdx() computes the correct index into the stepcmpfuncs[],
+ * exprstates[] and exprhasexecparam[] arrays for step step_id and
+ * partition key column keyno.  (Note: there is code that assumes the
+ * entries for a given step are sequential, so this is not chosen freely.)
+ */
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
     ((partnatts) * (step_id) + (keyno))
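
The macro lays the per-step arrays out step-major: all key columns of step 0 come first, then all key columns of step 1, and so on. That is why perform_pruning_base_step can take the address of the slot for key 0 and treat the following partnatts entries as one step's support functions. A small sketch of the arithmetic follows. The macro is copied from the diff, while toy_num_slots and toy_label_slots are hypothetical helpers added only to make the layout visible.

```c
#include <assert.h>

/* Copied from the partprune.h hunk above. */
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
    ((partnatts) * (step_id) + (keyno))

/* Number of slots needed for n_steps pruning steps over partnatts key columns. */
static int
toy_num_slots(int n_steps, int partnatts)
{
    return n_steps * partnatts;
}

/*
 * Label every slot with step_id * 100 + keyno so the layout can be inspected.
 * The caller must supply an array of at least toy_num_slots() elements.
 */
static void
toy_label_slots(int *slots, int n_steps, int partnatts)
{
    int step_id;
    int keyno;

    for (step_id = 0; step_id < n_steps; step_id++)
    {
        for (keyno = 0; keyno < partnatts; keyno++)
            slots[PruneCxtStateIdx(partnatts, step_id, keyno)] =
                step_id * 100 + keyno;
    }
}
```

With three key columns and two steps, step 1 occupies slots 3 through 5, immediately after the three slots of step 0.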