- Notifications
You must be signed in to change notification settings - Fork4.9k
Commitf4c7c41
committed
Revert "Optimize order of GROUP BY keys".
This reverts commitdb0d67d andseveral follow-on fixes. The idea of making a cost-based choiceof the order of the sorting columns is not fundamentally unsound,but it requires cost information and data statistics that we don'treally have. For example, relying on procost to distinguish therelative costs of different sort comparators is pretty pointlessso long as most such comparator functions are labeled with cost 1.0.Moreover, estimating the number of comparisons done by Quicksortrequires more than just an estimate of the number of distinct valuesin the input: you also need some idea of the sizes of the largergroups, if you want an estimate that's good to better than a factor ofthree or so. That's data that's often unknown or not very reliable.Worse, to arrive at estimates of the number of calls made to thelower-order-column comparison functions, the code needs to makeestimates of the numbers of distinct values of multiple columns,which are necessarily even less trustworthy than per-column stats.Even if all the inputs are perfectly reliable, the cost algorithmas-implemented cannot offer useful information about how to ordersorting columns beyond the point at which the average group sizeis estimated to drop to 1.Close inspection of the code added bydb0d67d shows that thereare also multiple small bugs. These could have been fixed, butthere's not much point if we don't trust the estimates to beaccurate in-principle.Finally, the changes in cost_sort's behavior made for very largechanges (often a factor of 2 or so) in the cost estimates for allsorting operations, not only those for multi-column GROUP BY.That naturally changes plan choices in many situations, and there'sprecious little evidence to show that the changes are for the better.Given the above doubts about whether the new estimates are reallytrustworthy, it's hard to summon much confidence that these changesare better on the average.Since we're hard up against the release deadline for v15, let'srevert these changes for now. We can always try again later.Note: in v15, I left T_PathKeyInfo in place in nodes.h even thoughit's unreferenced. Removing it would be an ABI break, and it seemsa bit late in the release cycle for that.Discussion:https://postgr.es/m/TYAPR01MB586665EB5FB2C3807E893941F5579@TYAPR01MB5866.jpnprd01.prod.outlook.com1 parentf60eb3f commitf4c7c41
File tree
24 files changed
+455
-1939
lines changed- contrib/postgres_fdw/expected
- doc/src/sgml
- src
- backend
- optimizer
- path
- plan
- util
- utils
- adt
- misc
- include
- nodes
- optimizer
- utils
- test/regress
- expected
- sql
24 files changed
+455
-1939
lines changedLines changed: 9 additions & 6 deletions
Original file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
2862 | 2862 |
| |
2863 | 2863 |
| |
2864 | 2864 |
| |
2865 |
| - | |
2866 |
| - | |
2867 |
| - | |
| 2865 | + | |
| 2866 | + | |
| 2867 | + | |
2868 | 2868 |
| |
2869 |
| - | |
2870 |
| - | |
2871 |
| - | |
| 2869 | + | |
| 2870 | + | |
| 2871 | + | |
| 2872 | + | |
| 2873 | + | |
| 2874 | + | |
2872 | 2875 |
| |
2873 | 2876 |
| |
2874 | 2877 |
| |
|
Lines changed: 0 additions & 14 deletions
Original file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
5062 | 5062 |
| |
5063 | 5063 |
| |
5064 | 5064 |
| |
5065 |
| - | |
5066 |
| - | |
5067 |
| - | |
5068 |
| - | |
5069 |
| - | |
5070 |
| - | |
5071 |
| - | |
5072 |
| - | |
5073 |
| - | |
5074 |
| - | |
5075 |
| - | |
5076 |
| - | |
5077 |
| - | |
5078 |
| - | |
5079 | 5065 |
| |
5080 | 5066 |
| |
5081 | 5067 |
| |
|
0 commit comments
Comments
(0)