@@ -1335,25 +1335,40 @@ cost_material(Path *path,
  * Determines and returns the cost of performing an Agg plan node,
  * including the cost of its input.
  *
+ * aggcosts can be NULL when there are no actual aggregate functions (i.e.,
+ * we are using a hashed Agg node just to do grouping).
+ *
  * Note: when aggstrategy == AGG_SORTED, caller must ensure that input costs
  * are for appropriately-sorted input.
  */
 void
 cost_agg(Path *path, PlannerInfo *root,
-         AggStrategy aggstrategy, int numAggs,
+         AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
          int numGroupCols, double numGroups,
          Cost input_startup_cost, Cost input_total_cost,
          double input_tuples)
 {
     Cost        startup_cost;
     Cost        total_cost;
+    AggClauseCosts dummy_aggcosts;
+
+    /* Use all-zero per-aggregate costs if NULL is passed */
+    if (aggcosts == NULL)
+    {
+        Assert(aggstrategy == AGG_HASHED);
+        MemSet(&dummy_aggcosts, 0, sizeof(AggClauseCosts));
+        aggcosts = &dummy_aggcosts;
+    }
 
     /*
-     * We charge one cpu_operator_cost per aggregate function per input tuple,
-     * and another one per output tuple (corresponding to transfn and finalfn
-     * calls respectively).  If we are grouping, we charge an additional
-     * cpu_operator_cost per grouping column per input tuple for grouping
-     * comparisons.
+     * The transCost.per_tuple component of aggcosts should be charged once
+     * per input tuple, corresponding to the costs of evaluating the aggregate
+     * transfns and their input expressions (with any startup cost of course
+     * charged but once).  The finalCost component is charged once per output
+     * tuple, corresponding to the costs of evaluating the finalfns.
+     *
+     * If we are grouping, we charge an additional cpu_operator_cost per
+     * grouping column per input tuple for grouping comparisons.
      *
      * We will produce a single output tuple if not grouping, and a tuple per
      * group otherwise.  We charge cpu_tuple_cost for each output tuple.
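
For reference, here is a minimal sketch of the shape cost_agg now expects behind its aggcosts argument. Only the fields read above are shown; the real AggClauseCosts declaration lives in the planner headers and carries additional bookkeeping (aggregate counts, transition space, and so on), so treat the field comments as illustrative rather than authoritative.

typedef double Cost;            /* as in nodes/nodes.h */

typedef struct QualCost
{
    Cost        startup;        /* one-time cost */
    Cost        per_tuple;      /* cost per evaluation */
} QualCost;

typedef struct AggClauseCosts
{
    QualCost    transCost;      /* combined cost of the transfns and their
                                 * input expressions: startup charged once,
                                 * per_tuple charged per input tuple */
    Cost        finalCost;      /* combined cost of the finalfns, charged
                                 * once per output tuple */
    /* ... further bookkeeping fields omitted from this sketch ... */
} AggClauseCosts;
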
@@ -1366,15 +1381,13 @@ cost_agg(Path *path, PlannerInfo *root,
      * there's roundoff error we might do the wrong thing.  So be sure that
      * the computations below form the same intermediate values in the same
      * order.
-     *
-     * Note: ideally we should use the pg_proc.procost costs of each
-     * aggregate's component functions, but for now that seems like an
-     * excessive amount of work.
      */
     if (aggstrategy == AGG_PLAIN)
     {
         startup_cost = input_total_cost;
-        startup_cost += cpu_operator_cost * (input_tuples + 1) * numAggs;
+        startup_cost += aggcosts->transCost.startup;
+        startup_cost += aggcosts->transCost.per_tuple * input_tuples;
+        startup_cost += aggcosts->finalCost;
         /* we aren't grouping */
         total_cost = startup_cost + cpu_tuple_cost;
     }
@@ -1384,19 +1397,21 @@ cost_agg(Path *path, PlannerInfo *root,
         startup_cost = input_startup_cost;
         total_cost = input_total_cost;
         /* calcs phrased this way to match HASHED case, see note above */
-        total_cost += cpu_operator_cost * input_tuples * numGroupCols;
-        total_cost += cpu_operator_cost * input_tuples * numAggs;
-        total_cost += cpu_operator_cost * numGroups * numAggs;
+        total_cost += aggcosts->transCost.startup;
+        total_cost += aggcosts->transCost.per_tuple * input_tuples;
+        total_cost += (cpu_operator_cost * numGroupCols) * input_tuples;
+        total_cost += aggcosts->finalCost * numGroups;
         total_cost += cpu_tuple_cost * numGroups;
     }
     else
     {
         /* must be AGG_HASHED */
         startup_cost = input_total_cost;
-        startup_cost += cpu_operator_cost * input_tuples * numGroupCols;
-        startup_cost += cpu_operator_cost * input_tuples * numAggs;
+        startup_cost += aggcosts->transCost.startup;
+        startup_cost += aggcosts->transCost.per_tuple * input_tuples;
+        startup_cost += (cpu_operator_cost * numGroupCols) * input_tuples;
         total_cost = startup_cost;
-        total_cost += cpu_operator_cost * numGroups * numAggs;
+        total_cost += aggcosts->finalCost * numGroups;
         total_cost += cpu_tuple_cost * numGroups;
     }
 
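
As a quick sanity check on the arithmetic, the standalone calculation below walks the AGG_HASHED branch with invented inputs (the row count, group count, and per-aggregate costs are all hypothetical; cpu_operator_cost and cpu_tuple_cost are the stock defaults of 0.0025 and 0.01). The AGG_PLAIN and AGG_SORTED branches follow the same pattern, minus or plus the grouping terms.

#include <stdio.h>

int
main(void)
{
    double  cpu_operator_cost = 0.0025; /* stock default */
    double  cpu_tuple_cost = 0.01;      /* stock default */

    double  input_total_cost = 1000.0;  /* cost of the input path (invented) */
    double  input_tuples = 10000.0;     /* rows fed into the Agg node */
    double  numGroups = 100.0;          /* estimated number of groups */
    int     numGroupCols = 2;           /* GROUP BY columns */

    /* assumed costs for one simple aggregate: transfn plus one argument,
     * and a single finalfn call */
    double  trans_startup = 0.0;
    double  trans_per_tuple = 2 * cpu_operator_cost;
    double  final_cost = cpu_operator_cost;

    double  startup_cost,
            total_cost;

    /* mirrors the AGG_HASHED branch above */
    startup_cost = input_total_cost;
    startup_cost += trans_startup;
    startup_cost += trans_per_tuple * input_tuples;
    startup_cost += (cpu_operator_cost * numGroupCols) * input_tuples;
    total_cost = startup_cost;
    total_cost += final_cost * numGroups;
    total_cost += cpu_tuple_cost * numGroups;

    printf("startup=%.2f total=%.2f\n", startup_cost, total_cost);
    /* prints startup=1100.00 total=1101.25 with these numbers */
    return 0;
}
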
@@ -1413,25 +1428,53 @@ cost_agg(Path *path, PlannerInfo *root,
  */
 void
 cost_windowagg(Path *path, PlannerInfo *root,
-               int numWindowFuncs, int numPartCols, int numOrderCols,
+               List *windowFuncs, int numPartCols, int numOrderCols,
                Cost input_startup_cost, Cost input_total_cost,
                double input_tuples)
 {
     Cost        startup_cost;
     Cost        total_cost;
+    ListCell   *lc;
 
     startup_cost = input_startup_cost;
     total_cost = input_total_cost;
 
     /*
-     * We charge one cpu_operator_cost per window function per tuple (often a
-     * drastic underestimate, but without a way to gauge how many tuples the
-     * window function will fetch, it's hard to do better).  We also charge
-     * cpu_operator_cost per grouping column per tuple for grouping
-     * comparisons, plus cpu_tuple_cost per tuple for general overhead.
-     */
-    total_cost += cpu_operator_cost * input_tuples * numWindowFuncs;
-    total_cost += cpu_operator_cost * input_tuples * (numPartCols + numOrderCols);
+     * Window functions are assumed to cost their stated execution cost, plus
+     * the cost of evaluating their input expressions, per tuple.  Since they
+     * may in fact evaluate their inputs at multiple rows during each cycle,
+     * this could be a drastic underestimate; but without a way to know how
+     * many rows the window function will fetch, it's hard to do better.  In
+     * any case, it's a good estimate for all the built-in window functions,
+     * so we'll just do this for now.
+     */
+    foreach(lc, windowFuncs)
+    {
+        WindowFunc *wfunc = (WindowFunc *) lfirst(lc);
+        Cost        wfunccost;
+        QualCost    argcosts;
+
+        Assert(IsA(wfunc, WindowFunc));
+
+        wfunccost = get_func_cost(wfunc->winfnoid) * cpu_operator_cost;
+
+        /* also add the input expressions' cost to per-input-row costs */
+        cost_qual_eval_node(&argcosts, (Node *) wfunc->args, root);
+        startup_cost += argcosts.startup;
+        wfunccost += argcosts.per_tuple;
+
+        total_cost += wfunccost * input_tuples;
+    }
+
+    /*
+     * We also charge cpu_operator_cost per grouping column per tuple for
+     * grouping comparisons, plus cpu_tuple_cost per tuple for general
+     * overhead.
+     *
+     * XXX this neglects costs of spooling the data to disk when it overflows
+     * work_mem.  Sooner or later that should get accounted for.
+     */
+    total_cost += cpu_operator_cost * (numPartCols + numOrderCols) * input_tuples;
     total_cost += cpu_tuple_cost * input_tuples;
 
     path->startup_cost = startup_cost;
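
To make the loop above concrete: each window function is charged its pg_proc.procost times cpu_operator_cost per input row, plus whatever cost_qual_eval_node assigns to its argument expressions. A back-of-the-envelope comparison with invented numbers, contrasting a no-argument window function having the default procost of 1 with a window aggregate whose argument expression costs one extra operator evaluation per row:

#include <stdio.h>

int
main(void)
{
    double  cpu_operator_cost = 0.0025; /* stock default */
    double  input_tuples = 10000.0;     /* invented input size */

    /* no-argument window function, default procost of 1 */
    double  noarg_cost = 1.0 * cpu_operator_cost + 0.0;

    /* window aggregate whose argument costs one operator per row */
    double  witharg_cost = 1.0 * cpu_operator_cost + 1.0 * cpu_operator_cost;

    printf("charged over %.0f rows: %.2f vs %.2f\n", input_tuples,
           noarg_cost * input_tuples, witharg_cost * input_tuples);
    return 0;
}
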
@@ -2640,17 +2683,12 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
      * Vars and Consts are charged zero, and so are boolean operators (AND,
      * OR, NOT).  Simplistic, but a lot better than no model at all.
      *
-     * Note that Aggref and WindowFunc nodes are (and should be) treated like
-     * Vars --- whatever execution cost they have is absorbed into
-     * plan-node-specific costing.  As far as expression evaluation is
-     * concerned they're just like Vars.
-     *
      * Should we try to account for the possibility of short-circuit
      * evaluation of AND/OR?  Probably *not*, because that would make the
      * results depend on the clause ordering, and we are not in any position
      * to expect that the current ordering of the clauses is the one that's
-     * going to end up being used.  (Is it worth applying order_qual_clauses
-     * much earlier in the planning process to fix this?)
+     * going to end up being used.  The above per-RestrictInfo caching would
+     * not mix well with trying to re-order clauses anyway.
      */
     if (IsA(node, FuncExpr))
     {
@@ -2679,6 +2717,20 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
         context->total.per_tuple += get_func_cost(saop->opfuncid) *
             cpu_operator_cost * estimate_array_length(arraynode) * 0.5;
     }
+    else if (IsA(node, Aggref) ||
+             IsA(node, WindowFunc))
+    {
+        /*
+         * Aggref and WindowFunc nodes are (and should be) treated like Vars,
+         * ie, zero execution cost in the current model, because they behave
+         * essentially like Vars in execQual.c.  We disregard the costs of
+         * their input expressions for the same reason.  The actual execution
+         * costs of the aggregate/window functions and their arguments have to
+         * be factored into plan-node-specific costing of the Agg or WindowAgg
+         * plan node.
+         */
+        return false;           /* don't recurse into children */
+    }
     else if (IsA(node, CoerceViaIO))
     {
         CoerceViaIO *iocoerce = (CoerceViaIO *) node;
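
Because the walker now skips Aggref and WindowFunc entirely, those costs must reach cost_agg through its AggClauseCosts argument (and cost_windowagg through its windowFuncs list). The fragment below sketches how a planner-side caller would supply them; it assumes the count_agg_clauses() collector is what fills AggClauseCosts in this patch series, and the variable names and call sites are approximations rather than code copied from the tree.

    AggClauseCosts agg_costs;

    /* gather per-aggregate costs from the places aggregates can appear;
     * count_agg_clauses() and these call sites are assumptions, shown only
     * to indicate where the costs skipped by the walker get accounted for */
    MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
    count_agg_clauses(root, (Node *) tlist, &agg_costs);
    count_agg_clauses(root, parse->havingQual, &agg_costs);

    /* ... later, when costing a hashed-aggregation plan ... */
    cost_agg(&hashed_path, root, AGG_HASHED, &agg_costs,
             numGroupCols, dNumGroups,
             cheapest_startup_cost, cheapest_total_cost,
             path_rows);
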