Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commite5a11a8

Browse files
committed
Improve hash method for bitmapsets: some examination of actual outputs
shows that adding a circular shift between words greatly improves thedistribution of hash outputs.
1 parent1f01d59 commite5a11a8

File tree

1 file changed

+21
-7
lines changed

1 file changed

+21
-7
lines changed

‎src/backend/nodes/bitmapset.c

Lines changed: 21 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@
1414
* Copyright (c) 2003-2005, PostgreSQL Global Development Group
1515
*
1616
* IDENTIFICATION
17-
* $PostgreSQL: pgsql/src/backend/nodes/bitmapset.c,v 1.8 2005/06/08 23:02:04 tgl Exp $
17+
* $PostgreSQL: pgsql/src/backend/nodes/bitmapset.c,v 1.9 2005/06/15 16:24:07 tgl Exp $
1818
*
1919
*-------------------------------------------------------------------------
2020
*/
@@ -769,22 +769,36 @@ bms_first_member(Bitmapset *a)
769769
*
770770
* Note: we must ensure that any two bitmapsets that are bms_equal() will
771771
* hash to the same value; in practice this means that trailing all-zero
772-
* words cannot affect the result. Longitudinal XOR provides a reasonable
773-
* hash value that has this property.
772+
* words cannot affect the result. The circular-shift-and-XOR hash method
773+
* used here has this property, so long as we work from back to front.
774+
*
775+
* Note: you might wonder why we bother with the circular shift; at first
776+
* glance a straight longitudinal XOR seems as good and much simpler. The
777+
* reason is empirical: this gives a better distribution of hash values on
778+
* the bitmapsets actually generated by the planner. A common way to have
779+
* multiword bitmapsets is "a JOIN b JOIN c JOIN d ...", which gives rise
780+
* to rangetables in which base tables and JOIN nodes alternate; so
781+
* bitmapsets of base table RT indexes tend to use only odd-numbered or only
782+
* even-numbered bits. A straight longitudinal XOR would preserve this
783+
* property, leading to a much smaller set of possible outputs than if
784+
* we include a shift.
774785
*/
775786
uint32
776787
bms_hash_value(constBitmapset*a)
777788
{
778789
bitmapwordresult=0;
779-
intnwords;
780790
intwordnum;
781791

782-
if (a==NULL)
792+
if (a==NULL||a->nwords <=0)
783793
return0;/* All empty sets hash to 0 */
784-
nwords=a->nwords;
785-
for (wordnum=0;wordnum<nwords;wordnum++)
794+
for (wordnum=a->nwords;--wordnum>0; )
786795
{
787796
result ^=a->words[wordnum];
797+
if (result& ((bitmapword)1 << (BITS_PER_BITMAPWORD-1)))
798+
result= (result <<1) |1;
799+
else
800+
result= (result <<1);
788801
}
802+
result ^=a->words[0];
789803
return (uint32)result;
790804
}

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp