Commit b30d3ea
Add a macro templatized hashtable.
dynahash.c hash tables aren't quite fast enough for some
use-cases. There are several reasons for lacking performance:
- the use of chaining for collision handling makes them cache
  inefficient; that's especially an issue when the tables get bigger.
- as the element sizes for dynahash are only determined at runtime,
  offset computations are somewhat expensive
- hash and element comparisons are indirect function calls, causing
  unnecessary pipeline stalls
- its two-level structure has some benefits (somewhat natural
  partitioning), but increases the number of indirections

To fix several of these, the hash tables have to be adjusted to the
individual use-case at compile-time. C unfortunately doesn't provide a
good way to do compile-time code generation (like e.g. C++'s templates,
for all their weaknesses, do). Thus the somewhat ugly approach taken
here is to allow for code generation using a macro-templatized header
file, which generates functions and types based on a prefix and other
parameters.

Later patches use this infrastructure to use such hash tables for
tidbitmap.c (bitmap scans) and execGrouping.c (hash aggregation, ...).
In queries where these use up a large fraction of the time, this has
been measured to lead to performance improvements of over 100%. There
are other cases where this could be useful (e.g. catcache.c).

The hash table design chosen is a variant of linear open-addressing.
The biggest disadvantages of simple linear addressing schemes are
highly variable lookup times due to clustering, and deletions leaving a
lot of tombstones around. To address these issues a variant of "robin
hood" hashing is employed. Robin hood hashing optimizes chaining
lengths by moving elements close to their optimal bucket ("rich"
elements) out of the way if a to-be-inserted element is further away
from its optimal position (i.e. it's "poor"). While that can make
insertions slower, the average lookup performance is a lot better, and
higher fill factors can be used in a still performant manner.
To avoid tombstones - which normally solve the issue that a deleted
node's presence is relevant to determine whether a lookup needs to
continue looking or is done - buckets following a deleted element are
shifted backwards, unless they're empty or already at their optimal
position.

There are further possible improvements that can be made to this
implementation. Amongst others:
- Use distance as a termination criterion during searches. This is
  generally a good idea, but I've been able to see the overhead of
  distance calculations in some cases.
- Consider combining the 'empty' status into the hashvalue, and
  enforce storing the hashvalue. That could, in some cases, increase
  memory density and remove a few instructions.
- Experiment further with the, very conservatively chosen, fillfactor.
- Make the maximum size of the hashtable configurable, to allow
  storing very very large tables. That'd require 64bit hash values to
  be more common than now, though.
- Some smaller memcpy calls could be optimized to copy larger chunks.

But since the new implementation is already considerably faster than
dynahash it seems sensible to start using it.

Reviewed-By: Tomas Vondra
Discussion: <20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de>

1 parent aa3ca5e   commit b30d3ea
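The robin hood insertion and backward-shift deletion described above can be sketched as follows. This is a simplified, hypothetical illustration of the general technique, not the committed implementation: a fixed-size power-of-two table, integer keys, no growth handling, and the caller must never fill the table completely.

```c
#include <assert.h>

#define CAP 16					/* table size; must be a power of two */

typedef struct
{
	int		key;
	int		used;
} slot_t;

static slot_t table[CAP];		/* zero-initialized: all slots empty */

static unsigned hashfn(int key)
{
	return (unsigned) key * 2654435761u;	/* multiplicative hash */
}

/* probe distance of an entry at 'pos' whose optimal bucket is 'home' */
static unsigned dist(unsigned pos, unsigned home)
{
	return (pos - home) & (CAP - 1);
}

/* Robin hood insert: displace "richer" residents (smaller probe
 * distance) so lookup chains stay short.  Caller must ensure the
 * table is not full and the key is not already present. */
static void rh_insert(int key)
{
	unsigned pos = hashfn(key) & (CAP - 1);
	slot_t	cur = {key, 1};
	unsigned d = 0;

	for (;;)
	{
		if (!table[pos].used)
		{
			table[pos] = cur;
			return;
		}
		unsigned od = dist(pos, hashfn(table[pos].key) & (CAP - 1));

		if (od < d)				/* resident is richer: swap and go on */
		{
			slot_t	tmp = table[pos];

			table[pos] = cur;
			cur = tmp;
			d = od;
		}
		pos = (pos + 1) & (CAP - 1);
		d++;
	}
}

/* No tombstones: an empty slot always terminates the probe chain. */
static int rh_lookup(int key)
{
	unsigned pos = hashfn(key) & (CAP - 1);

	while (table[pos].used)
	{
		if (table[pos].key == key)
			return 1;
		pos = (pos + 1) & (CAP - 1);
	}
	return 0;
}

/* Backward-shift delete: pull following entries back until one is
 * empty or already at its optimal position, instead of leaving a
 * tombstone behind. */
static void rh_delete(int key)
{
	unsigned pos = hashfn(key) & (CAP - 1);

	while (table[pos].used && table[pos].key != key)
		pos = (pos + 1) & (CAP - 1);
	if (!table[pos].used)
		return;					/* not present */
	for (;;)
	{
		unsigned next = (pos + 1) & (CAP - 1);

		if (!table[next].used ||
			dist(next, hashfn(table[next].key) & (CAP - 1)) == 0)
		{
			table[pos].used = 0;
			return;
		}
		table[pos] = table[next];
		pos = next;
	}
}
```

Note how `rh_lookup` stops only at an empty slot: because deletion shifts successors backwards, no per-slot "deleted" marker is needed, which is the tombstone avoidance the message describes.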
2 files changed: +881 −0 lines