Commit 044c99b

Use query collation, not column's collation, while examining statistics.
Commit 5e09280 changed the planner so that, instead of blindly using DEFAULT_COLLATION_OID when invoking operators for selectivity estimation, it would use the collation of the column whose statistics we're considering. This was recognized as still being not quite the right thing, but it seemed like a good incremental improvement. However, shortly thereafter we introduced nondeterministic collations, and that creates cases where operators can fail if they're passed the wrong collation. We don't want planning to fail in cases where the query itself would work, so this means that we *must* use the query's collation when invoking operators for estimation purposes.

The only real problem this creates is in ineq_histogram_selectivity, where the binary search might produce a garbage answer if we perform comparisons using a different collation than the column's histogram is ordered with. However, when the query's collation is significantly different from the column's default collation, the estimate we previously generated would be pretty irrelevant anyway; so it's not clear that this will result in noticeably worse estimates in practice. (A follow-on patch will improve this situation in HEAD, but it seems too invasive for back-patch.)

The patch requires changing the signatures of mcv_selectivity and allied functions, which are exported and very possibly are used by extensions. In HEAD, I just did that, but an API/ABI break of this sort isn't acceptable in stable branches. Therefore, in v12 the patch introduces "mcv_selectivity_ext" and so on, with signatures matching HEAD, and makes the old functions into wrappers that assume DEFAULT_COLLATION_OID should be used. That does not match the prior behavior, but it should avoid risk of failure in most cases. (In practice, I think most extension datatypes aren't collation-aware, so the change probably doesn't matter to them.)

Per report from James Lucas. Back-patch to v12 where the problem was introduced.

Discussion: https://postgr.es/m/CAAFmbbOvfi=wMM=3qRsPunBSLb8BFREno2oOzSBS=mzfLPKABw@mail.gmail.com
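
The v12 compatibility approach described in the commit message (new "_ext" functions that take a collation, with the old names kept as wrappers that pass DEFAULT_COLLATION_OID) can be sketched in isolation. This is a minimal, self-contained illustration and not PostgreSQL's code: `sketch_selectivity`, `sketch_selectivity_ext`, `SKETCH_DEFAULT_COLLATION_OID`, and the `last_collation_used` variable are hypothetical names, and the returned selectivity is only a placeholder.

```c
#include <assert.h>

typedef unsigned int Oid;       /* stand-in for PostgreSQL's Oid typedef */

/* Hypothetical placeholder for the default collation's OID. */
#define SKETCH_DEFAULT_COLLATION_OID ((Oid) 100)

/* Records which collation the estimator was last asked to use (demo only). */
static Oid last_collation_used = 0;

/*
 * New-style entry point: the caller states the collation explicitly,
 * mirroring the idea of passing the query's collation down.
 */
static double
sketch_selectivity_ext(Oid collation, double fallback)
{
    assert(fallback >= 0.0 && fallback <= 1.0);
    last_collation_used = collation;
    return fallback;            /* placeholder for a real estimate */
}

/*
 * Old-style entry point, kept so existing callers still compile and link.
 * It cannot know the query's collation, so it assumes the default one.
 */
static double
sketch_selectivity(double fallback)
{
    return sketch_selectivity_ext(SKETCH_DEFAULT_COLLATION_OID, fallback);
}
```

In this sketch, callers of the old name keep working without changes, at the cost of always estimating under the default collation; callers that know the query's collation would switch to the `_ext` variant.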
1 parent f0d2c65, commit 044c99b

5 files changed: +100 -85 lines changed


contrib/ltree/ltree_op.c

Lines changed: 1 addition & 1 deletion
@@ -582,7 +582,7 @@ ltreeparentsel(PG_FUNCTION_ARGS)
     double      selec;

     /* Use generic restriction selectivity logic, with default 0.001. */
-    selec = generic_restriction_selectivity(root, operator,
+    selec = generic_restriction_selectivity(root, operator, InvalidOid,
                                             args, varRelid,
                                             0.001);

src/backend/utils/adt/like_support.c

Lines changed: 15 additions & 25 deletions
@@ -92,6 +92,7 @@ static Pattern_Prefix_Status pattern_fixed_prefix(Const *patt,
 static Selectivity prefix_selectivity(PlannerInfo *root,
                                       VariableStatData *vardata,
                                       Oid eqopr, Oid ltopr, Oid geopr,
+                                      Oid collation,
                                       Const *prefixcon);
 static Selectivity like_selectivity(const char *patt, int pattlen,
                                     bool case_insensitive);
@@ -534,12 +535,6 @@ patternsel_common(PlannerInfo *root,
      * something binary-compatible but different.) We can use it to identify
      * the comparison operators and the required type of the comparison
      * constant, much as in match_pattern_prefix().
-     *
-     * NOTE: this logic does not consider collations. Ideally we'd force use
-     * of "C" collation, but since ANALYZE only generates statistics for the
-     * column's specified collation, we have little choice but to use those.
-     * But our results are so approximate anyway that it probably hardly
-     * matters.
      */
     vartype = vardata.vartype;

@@ -622,7 +617,7 @@ patternsel_common(PlannerInfo *root,
         /*
          * Pattern specifies an exact match, so estimate as for '='
          */
-        result = var_eq_const(&vardata, eqopr, prefix->constvalue,
+        result = var_eq_const(&vardata, eqopr, collation, prefix->constvalue,
                               false, true, false);
     }
     else
@@ -654,7 +649,8 @@ patternsel_common(PlannerInfo *root,
         opfuncid = get_opcode(oprid);
         fmgr_info(opfuncid, &opproc);

-        selec = histogram_selectivity(&vardata, &opproc, constval, true,
+        selec = histogram_selectivity(&vardata, &opproc, collation,
+                                      constval, true,
                                       10, 1, &hist_size);

         /* If not at least 100 entries, use the heuristic method */
@@ -666,6 +662,7 @@ patternsel_common(PlannerInfo *root,
             if (pstatus == Pattern_Prefix_Partial)
                 prefixsel = prefix_selectivity(root, &vardata,
                                                eqopr, ltopr, geopr,
+                                               collation,
                                                prefix);
             else
                 prefixsel = 1.0;
@@ -698,7 +695,8 @@ patternsel_common(PlannerInfo *root,
          * directly to the result selectivity. Also add up the total fraction
          * represented by MCV entries.
          */
-        mcv_selec = mcv_selectivity(&vardata, &opproc, constval, true,
+        mcv_selec = mcv_selectivity(&vardata, &opproc, collation,
+                                    constval, true,
                                     &sumcommon);

         /*
@@ -1196,7 +1194,7 @@ pattern_fixed_prefix(Const *patt, Pattern_Type ptype, Oid collation,
  * population represented by the histogram --- the caller must fold this
  * together with info about MCVs and NULLs.
  *
- * We use the specified btree comparison operators to do the estimation.
+ * We use the given comparison operators and collation to do the estimation.
  * The given variable and Const must be of the associated datatype(s).
  *
  * XXX Note: we make use of the upper bound to estimate operator selectivity
@@ -1207,11 +1205,11 @@ pattern_fixed_prefix(Const *patt, Pattern_Type ptype, Oid collation,
 static Selectivity
 prefix_selectivity(PlannerInfo *root, VariableStatData *vardata,
                    Oid eqopr, Oid ltopr, Oid geopr,
+                   Oid collation,
                    Const *prefixcon)
 {
     Selectivity prefixsel;
     FmgrInfo    opproc;
-    AttStatsSlot sslot;
     Const      *greaterstrcon;
     Selectivity eq_sel;

@@ -1220,6 +1218,7 @@ prefix_selectivity(PlannerInfo *root, VariableStatData *vardata,

     prefixsel = ineq_histogram_selectivity(root, vardata,
                                            &opproc, true, true,
+                                           collation,
                                            prefixcon->constvalue,
                                            prefixcon->consttype);

@@ -1229,27 +1228,18 @@ prefix_selectivity(PlannerInfo *root, VariableStatData *vardata,
         return DEFAULT_MATCH_SEL;
     }

-    /*-------
-     * If we can create a string larger than the prefix, say
-     * "x < greaterstr". We try to generate the string referencing the
-     * collation of the var's statistics, but if that's not available,
-     * use DEFAULT_COLLATION_OID.
-     *-------
+    /*
+     * If we can create a string larger than the prefix, say "x < greaterstr".
      */
-    if (HeapTupleIsValid(vardata->statsTuple) &&
-        get_attstatsslot(&sslot, vardata->statsTuple,
-                         STATISTIC_KIND_HISTOGRAM, InvalidOid, 0))
-         /* sslot.stacoll is set up */ ;
-    else
-        sslot.stacoll = DEFAULT_COLLATION_OID;
     fmgr_info(get_opcode(ltopr), &opproc);
-    greaterstrcon = make_greater_string(prefixcon, &opproc, sslot.stacoll);
+    greaterstrcon = make_greater_string(prefixcon, &opproc, collation);
     if (greaterstrcon)
     {
         Selectivity topsel;

         topsel = ineq_histogram_selectivity(root, vardata,
                                             &opproc, false, false,
+                                            collation,
                                             greaterstrcon->constvalue,
                                             greaterstrcon->consttype);

@@ -1278,7 +1268,7 @@ prefix_selectivity(PlannerInfo *root, VariableStatData *vardata,
      * probably off the end of the histogram, and thus we probably got a very
      * small estimate from the >= condition; so we still need to clamp.
      */
-    eq_sel = var_eq_const(vardata, eqopr, prefixcon->constvalue,
+    eq_sel = var_eq_const(vardata, eqopr, collation, prefixcon->constvalue,
                           false, true, false);

     prefixsel = Max(prefixsel, eq_sel);
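
The ineq_histogram_selectivity calls in prefix_selectivity now receive the query's collation, and the commit message warns that the binary search can give a garbage answer when that collation differs from the one the histogram was sorted with. The following is a rough, self-contained sketch of that effect, not PostgreSQL's implementation: plain function pointers stand in for operators and collations, and every name here (`cmp_bytewise`, `cmp_caseless`, `count_below_bsearch`, `count_below_linear`, `sketch_hist`) is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef int (*cmp_fn)(const char *a, const char *b);

/* Byte-wise comparison, standing in for the ordering the histogram was built with. */
static int
cmp_bytewise(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* ASCII case-insensitive comparison, standing in for a different query collation. */
static int
cmp_caseless(const char *a, const char *b)
{
    for (;; a++, b++)
    {
        int ca = tolower((unsigned char) *a);
        int cb = tolower((unsigned char) *b);

        if (ca != cb)
            return ca - cb;
        if (ca == 0)
            return 0;
    }
}

/* Binary search for the number of bounds below probe; only valid if hist is sorted under cmp. */
static int
count_below_bsearch(const char *const *hist, int n, const char *probe, cmp_fn cmp)
{
    int lo = 0;
    int hi = n;

    assert(n >= 0);
    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (cmp(hist[mid], probe) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Linear scan: the true number of bounds below probe under cmp, whatever the array order. */
static int
count_below_linear(const char *const *hist, int n, const char *probe, cmp_fn cmp)
{
    int count = 0;

    for (int i = 0; i < n; i++)
        if (cmp(hist[i], probe) < 0)
            count++;
    return count;
}

/* Histogram bounds sorted byte-wise (uppercase sorts before lowercase in ASCII). */
static const char *const sketch_hist[] = {"Banana", "Zebra", "apple", "cherry"};
```

With the probe "cat", the byte-wise comparator yields 3 from both functions. The case-insensitive comparator yields 3 from the binary search but 2 from the linear scan, because the array is not ordered under that comparator and the search skips past "Zebra" without examining it.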

src/backend/utils/adt/network_selfuncs.c

Lines changed: 2 additions & 1 deletion
@@ -137,7 +137,8 @@ networksel(PG_FUNCTION_ARGS)
          * by MCV entries.
          */
         fmgr_info(get_opcode(operator), &proc);
-        mcv_selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+        mcv_selec = mcv_selectivity(&vardata, &proc, InvalidOid,
+                                    constvalue, varonleft,
                                     &sumcommon);

         /*
