Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commitd4c62a6

Browse files
committed
Make simplehash.h grow hashtable in additional cases.
Increase the size when either the distance between actual and optimalslot grows too large, or when too many subsequent entries would haveto be moved.This addresses reports that the simplehash performed, sometimesconsiderably, worse than dynahash.The reason turned out to be that insertions into the hashtable where,due to the use of parallel query, in effect done from anotherhashtable, in hash-value order. If the target hashtable, due tomis-estimation, was sized a lot smaller than the source table(s) thatlead to very imbalanced tables; a lot of entries in many close-bybuckets from the source tables were inserted into a single, wider,bucket on the target table. As the growth factor was solely computedbased on the fillfactor, the performance of the table decreasedfurther and further.b81b5a9 was an attempt to address this problem for hashaggregates (but not for bitmap scans), but it turns out that thecurrent method of mixing hash values often actually leaves neighboringhash-values close to each other, just in different value range. Itmight be worth revisiting that independently of the performance issuesaddressed in this patch..To address that problem resize tables in two additional cases: Firstlywhen the optimal position for an entry would be far from the actualposition, secondly when many entries would have to be moved to makespace for the new entry (while satisfying the robin hood property).Due to the additional resizing threshold it seems possible, andtesting confirms that so far, that a higher fillfactor doesn't hurtperformance and saves a bit of memory. It seems better to increase itnow, before a release containing any of this code, rather than wonderin some later release.The various boundaries aren't determined in a particularly scientificmanner, they might need some fine-tuning.In all my tests the new code now, even with parallelism, performs atleast as good as the old code, in several scenarios significantlybetter.Reported-By: Dilip Kumar, Robert Haas, Kuntal GhoshDiscussion:https://postgr.es/m/CAFiTN-vagvuAydKG9VnWcoK=ADAhxmOa4ZTrmNsViBBooTnriQ@mail.gmail.comhttps://postgr.es/m/CAGz5QC+=fNTYgzMLTBUNeKt6uaWZFXJbkB5+7oWm-n9DwVxcLA@mail.gmail.com
1 parentff992c0 commitd4c62a6

File tree

1 file changed

+50
-5
lines changed

1 file changed

+50
-5
lines changed

‎src/include/lib/simplehash.h

Lines changed: 50 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -157,13 +157,24 @@ SH_SCOPE void SH_STAT(SH_TYPE *tb);
157157

158158
#include"utils/memutils.h"
159159

160-
/* conservative fillfactor for a robin hood table, might want to adjust */
161-
#defineSH_FILLFACTOR (0.8)
162-
/* increase fillfactor if we otherwise would error out */
163-
#defineSH_MAX_FILLFACTOR (0.95)
164160
/* max data array size,we allow up to PG_UINT32_MAX buckets, including 0 */
165161
#defineSH_MAX_SIZE (((uint64) PG_UINT32_MAX) + 1)
166162

163+
/* normal fillfactor, unless already close to maximum */
164+
#ifndefSH_FILLFACTOR
165+
#defineSH_FILLFACTOR (0.9)
166+
#endif
167+
/* increase fillfactor if we otherwise would error out */
168+
#defineSH_MAX_FILLFACTOR (0.98)
169+
/* grow if actual and optimal location bigger than */
170+
#ifndefSH_GROW_MAX_DIB
171+
#defineSH_GROW_MAX_DIB 25
172+
#endif
173+
/* grow if more than elements to move when inserting */
174+
#ifndefSH_GROW_MAX_MOVE
175+
#defineSH_GROW_MAX_MOVE 150
176+
#endif
177+
167178
#ifdefSH_STORE_HASH
168179
#defineSH_COMPARE_KEYS(tb,ahash,akey,b) (ahash == SH_GET_HASH(tb, b) && SH_EQUAL(tb, b->SH_KEY, akey))
169180
#else
@@ -466,12 +477,18 @@ SH_INSERT(SH_TYPE *tb, SH_KEY_TYPE key, bool *found)
466477
uint32startelem;
467478
uint32curelem;
468479
SH_ELEMENT_TYPE*data;
469-
uint32insertdist=0;
480+
uint32insertdist;
481+
482+
restart:
483+
insertdist=0;
470484

471485
/*
472486
* We do the grow check even if the key is actually present, to avoid
473487
* doing the check inside the loop. This also lets us avoid having to
474488
* re-find our position in the hashtable after resizing.
489+
*
490+
* Note that this also reached when resizing the table due to
491+
* SH_GROW_MAX_DIB / SH_GROW_MAX_MOVE.
475492
*/
476493
if (unlikely(tb->members >=tb->grow_threshold))
477494
{
@@ -536,6 +553,7 @@ SH_INSERT(SH_TYPE *tb, SH_KEY_TYPE key, bool *found)
536553
SH_ELEMENT_TYPE*lastentry=entry;
537554
uint32emptyelem=curelem;
538555
uint32moveelem;
556+
int32emptydist=0;
539557

540558
/* find next empty bucket */
541559
while (true)
@@ -550,6 +568,19 @@ SH_INSERT(SH_TYPE *tb, SH_KEY_TYPE key, bool *found)
550568
lastentry=emptyentry;
551569
break;
552570
}
571+
572+
/*
573+
* To avoid negative consequences from overly imbalanced
574+
* hashtables, grow the hashtable if collisions would require
575+
* us to move a lot of entries. The most likely cause of such
576+
* imbalance is filling a (currently) small table, from a
577+
* currently big one, in hash-table order.
578+
*/
579+
if (++emptydist>SH_GROW_MAX_MOVE)
580+
{
581+
tb->grow_threshold=0;
582+
gotorestart;
583+
}
553584
}
554585

555586
/* shift forward, starting at last occupied element */
@@ -585,6 +616,18 @@ SH_INSERT(SH_TYPE *tb, SH_KEY_TYPE key, bool *found)
585616

586617
curelem=SH_NEXT(tb,curelem,startelem);
587618
insertdist++;
619+
620+
/*
621+
* To avoid negative consequences from overly imbalanced hashtables,
622+
* grow the hashtable if collisions lead to large runs. The most
623+
* likely cause of such imbalance is filling a (currently) small
624+
* table, from a currently big one, in hash-table order.
625+
*/
626+
if (insertdist>SH_GROW_MAX_DIB)
627+
{
628+
tb->grow_threshold=0;
629+
gotorestart;
630+
}
588631
}
589632
}
590633

@@ -878,6 +921,8 @@ SH_STAT(SH_TYPE *tb)
878921
#undef SH_MAKE_NAME_
879922
#undef SH_FILLFACTOR
880923
#undef SH_MAX_FILLFACTOR
924+
#undef SH_GROW_MAX_DIB
925+
#undef SH_GROW_MAX_MOVE
881926
#undef SH_MAX_SIZE
882927

883928
/* types */

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp