boost::hash is an enhanced implementation of thehash function object specified byC++11 asstd::hash. It is the default hash function forBoost.Unordered,Boost.Intrusive's unordered associativecontainers,Boost.MultiIndex's hashindices, andBoost.Bimap'sunordered_set_of.
Out of the box,boost::hash supports
standard integral types (integers, character types, andbool);
standard floating point types (float,double,long double);
pointers (to objects and to functions, but not pointers to members)andnullptr;
enumeration types;
C arrays;
std::complex;
tuple-like types, such asstd::pair,std::tuple, and user-definedtypes that specializestd::tuple_size and provideget<I>;
sequence-like types, both standard and user-defined (sequence-like typeshavebegin() andend() member functions returning iterators);
unordered sequences, standard or user-defined (sequences for which the hashvalue does not depend on the element order, such asstd::unordered_set andstd::unordered_map);
described structs and classes — ones that have been annotated with theBOOST_DESCRIBE_STRUCT orBOOST_DESCRIBE_CLASS macros fromBoost.Describe;
std::unique_ptr,std::shared_ptr;
std::type_index;
std::error_code,std::error_condition;
std::optional;
std::variant,std::monostate.
boost::hash is extensible; it’s possible for a user-defined typeX to makeiself hashable viaboost::hash<X> by defining an appropriate overload of thefunctionhash_value. Many, if not most, Boost types already contain thenecessary support.
boost::hash meets the requirements forstd::hash specified in the C++11standard, namely, that for two different input values their corresponding hashvalues are either guaranteed to be distinct, or the probability of their beingthe same (a hash collision) is small. Standard unordered containers, and thehash-based Boost containers, are designed to work well with such hash functions.
boost::hash does not meet the stronger requirements often placed on hashfunctions in a more general context. In particular, the hash function is notcryptographic, is not collision-resistant against a determined adversary, anddoes not necessarily possess good "avalanche" properties; that is, small(single bit) perturbations in the input do not necessarily result in large(half bits changing) perturbations in the output.
In particular,boost::hash has traditionally been the identity function forall integral types that fit intostd::size_t, because this guarantees lack ofcollisions and is as fast as possible.
Added thehash_is_avalanching trait class.
C++03 is no longer supported.
Added an overload ofhash_value forstd::nullptr_t.
Addedis_tuple_like and an overload ofhash_value fortuple-like types.
Changed string hashing to usemulxp1_hash,improving both quality and speed. This changes the hash valuesfor string-like types (ranges ofchar,signed char,unsigned char,std::byte,char8_t).
Major update.
The specializations ofboost::hash have been removed; it nowalways callshash_value.
Support forBOOST_HASH_NO_EXTENSIONS has been removed. Theextensions are always enabled.
All standard containers are now supported. This includesstd::forward_list and the unordered associative containers.
User-defined containers (types that havebegin() andend()member functions that return iterators) are now supported outof the box.
Described structs and classes (those annotated withBOOST_DESCRIBE_STRUCT orBOOST_DESCRIBE_CLASS) are nowsupported out of the box.
hash_combine has been improved. This changes the hash valuesof composite (container and tuple) types and of scalar typesbigger thansize_t.
The performance (and quality, as a result of the above change)of string hashing has been improved.boost::hash for stringsnow passes SMHasher in 64 bit mode.
The documentation has been substantially revised to reflectthe changes.
Fixedhash_combine so that its behavior no longer dependson whethersize_t is the exact same type asboost::uint64_t(which wasn’t the case on macOS). This changes the hash valuesof composite (container and tuple) types on macOS.
When using a Boost container such asBoost.Unordered, you don’t need to doanything to useboost::hash as it’s the default. To find out how to usea user-defined type, read thesection on extending boost::hashfor user types.
If you wish to useboost::hash with the standard unordered associativecontainers, pass it as a template parameter:
std::unordered_multiset<int,boost::hash<int>>set_of_ints;std::unordered_set<std::pair<int,int>,boost::hash<std::pair<int,int>>>set_of_pairs;std::unordered_map<int,std::string,boost::hash<int>>map_int_to_string;To useboost::hash directly, create an instance and call it as a function:
#include<boost/container_hash/hash.hpp>intmain(){boost::hash<std::string>string_hash;std::size_th=string_hash("Hash me");}or alternatively:
#include<boost/container_hash/hash.hpp>intmain(){std::size_th=boost::hash<std::string>()("Hash me");}For an example of generic use, here is a function to generate a vectorcontaining the hashes of the elements of a container:
template<classContainer>std::vector<std::size_t>get_hashes(Containerconst&x){std::vector<std::size_t>hashes;std::transform(x.begin(),x.end(),std::back_inserter(hashes),boost::hash<typenameContainer::value_type>());returnhashes;}boost::hash is implemented by calling the functionhash_value. Thenamespace isn’t specified so that it can detect overloads via argumentdependant lookup. So if there is a free functionhash_value in the samenamespace as a user type, it will get called.
If you have a structurelibrary::book, where each book is uniquely definedby its memberid:
namespacelibrary{structbook{intid;std::stringauthor;std::stringtitle;// ....};booloperator==(bookconst&a,bookconst&b){returna.id==b.id;}}Then all you would need to do is write the functionlibrary::hash_value:
namespacelibrary{std::size_thash_value(bookconst&b){boost::hash<int>hasher;returnhasher(b.id);}}And you can now useboost::hash with book:
library::bookknife(3458,"Zane Grey","The Hash Knife Outfit");library::bookdandelion(1354,"Paul J. Shanley","Hash & Dandelion Greens");boost::hash<library::book>book_hasher;std::size_tknife_hash_value=book_hasher(knife);// If std::unordered_set is available:std::unordered_set<library::book,boost::hash<library::book>>books;books.insert(knife);books.insert(library::book(2443,"Lindgren, Torgny","Hash"));books.insert(library::book(1953,"Snyder, Bernadette M.","Heavenly Hash: A Tasty Mix of a Mother's Meditations"));assert(books.find(knife)!=books.end());assert(books.find(dandelion)==books.end());The full example can be found inexamples/books.hpp andexamples/books.cpp.
Tip | When writing a hash function, first look at how the equality functionworks. Objects that are equal must generate the same hash value. When objectsare not equal they should generate different hash values. In this objectequality was based just onid so the hash function only hashesid. If itwas based on the object’s name and author then the hash function should takethem into account (how to do this is discussed in the next section). |
Say you have a point class, representing a two dimensional location:
classpoint{intx;inty;public:point():x(0),y(0){}point(intx,inty):x(x),y(y){}booloperator==(pointconst&other)const{returnx==other.x&&y==other.y;}};and you wish to use it as the key for anunordered_map. You need tocustomise the hash for this structure. To do this we need to combine thehash values forx andy. The functionboost::hash_combine is suppliedfor this purpose:
classpoint{...friendstd::size_thash_value(pointconst&p){std::size_tseed=0;boost::hash_combine(seed,p.x);boost::hash_combine(seed,p.y);returnseed;}...};Calls tohash_combine incrementally build the hash from the differentmembers ofpoint, it can be repeatedly called for any number of elements.It callshash_value on the supplied element, and combines it with the seed.
Full code for this example is atexamples/point.cpp.
Note that when usingboost::hash_combine the order of the calls matters.
std::size_tseed=0;boost::hash_combine(seed,1);boost::hash_combine(seed,2);and
std::size_tseed=0;boost::hash_combine(seed,2);boost::hash_combine(seed,1);result in a different values inseed.
To calculate the hash of an iterator range you can useboost::hash_range:
std::vector<std::string>some_strings;std::size_thash=boost::hash_range(some_strings.begin(),some_strings.end());Sincehash_range works by repeatedly invokinghash_combine on the elementsof the range, the hash value will also be dependent on the element order.
If you are calculating a hash value for a range where the order of the datadoesn’t matter, such asunordered_set, you can useboost::hash_unordered_range instead.
std::unordered_set<std::string>set;std::size_thash=boost::hash_unordered_range(set.begin(),set.end());When writing template classes, you might not want to include the mainhash.hpp header as it’s quite an expensive include that brings in a lot ofother headers, so instead you can include the<boost/container_hash/hash_fwd.hpp> header which forward declaresboost::hash,boost::hash_combine,boost::hash_range, andboost::hash_unordered_range. You’ll need to include the main header beforeinstantiatingboost::hash. When using a container that usesboost::hash itshould do that for you, so your type will work fine with the Boost hashcontainers. There’s an example of this inexamples/template.hpp andexamples/template.cpp.
To avoid including evenhash_fwd.hpp - which still requires the contentsof Boost.ContainerHash to be physically present - you are allowed to copy thedeclarations fromhash_fwd.hpp (and only those) directly into your ownheader. This is a special exception guaranteed by the library; in general,you can’t declare library functions, Boost or otherwise, without risk ofbreakage in a subsequent release.
Let’s look at ourpoint class again:
classpoint{intx;inty;public:point():x(0),y(0){}point(intx,inty):x(x),y(y){}};If you’re using C++14 or above, a much easier way to addsupport forboost::hash topoint is by usingBoost.Describe (andget an automatic definition ofoperator== for free):
#include<boost/describe/class.hpp>#include<boost/describe/operators.hpp>classpoint{intx;inty;BOOST_DESCRIBE_CLASS(point,(),(),(),(x,y))public:point():x(0),y(0){}point(intx,inty):x(x),y(y){}};usingboost::describe::operators::operator==;usingboost::describe::operators::operator!=;(Full code for this example is atexamples/point2.cpp.)
Since thepoint class has been annotated withBOOST_DESCRIBE_CLASS,the library can enumerate its members (and base classes) and automaticallysynthesize the appropriatehash_value overload for it, without us needingto do so.
This header contains forward declarations for the library primitives.These declarations are guaranteed to be relatively stable, that is,best effort will be expended on their not changing from release torelease, allowing their verbatim copy into user headers that do notwish to physically depend on Boost.ContainerHash.
namespaceboost{namespacecontainer_hash{template<classT>structis_range;template<classT>structis_contiguous_range;template<classT>structis_unordered_range;template<classT>structis_described_class;template<classT>structis_tuple_like;}// namespace container_hashtemplate<classT>structhash;template<classT>voidhash_combine(std::size_t&seed,Tconst&v);template<classIt>voidhash_range(std::size_t&seed,Itfirst,Itlast);template<classIt>std::size_thash_range(Itfirst,Itlast);template<classIt>voidhash_unordered_range(std::size_t&seed,Itfirst,Itlast);template<classIt>std::size_thash_unordered_range(Itfirst,Itlast);template<classHash>structhash_is_avalanching;}// namespace boostDefinesboost::hash, and helper functions.
namespaceboost{template<classT>structhash;template<classT>voidhash_combine(std::size_t&seed,Tconst&v);template<classIt>voidhash_range(std::size_t&seed,Itfirst,Itlast);template<classIt>std::size_thash_range(Itfirst,Itlast);template<classIt>voidhash_unordered_range(std::size_t&seed,Itfirst,Itlast);template<classIt>std::size_thash_unordered_range(Itfirst,Itlast);// Enabled only when T is an integral typetemplate<classT>std::size_thash_value(Tv);// Enabled only when T is an enumeration typetemplate<classT>std::size_thash_value(Tv);// Enabled only when T is a floating point typetemplate<classT>std::size_thash_value(Tv);template<classT>std::size_thash_value(T*const&v);template<classT,std::size_tN>std::size_thash_value(Tconst(&v)[N]);template<classT>std::size_thash_value(std::complex<T>const&v);template<classA,classB>std::size_thash_value(std::pair<A,B>const&v);// Enabled only when container_hash::is_tuple_like<T>::value is truetemplate<classT>std::size_thash_value(Tconst&v);// Enabled only when container_hash::is_range<T>::value is truetemplate<classT>std::size_thash_value(Tconst&v);// Enabled only when container_hash::is_contiguous_range<T>::value is truetemplate<classT>std::size_thash_value(Tconst&v);// Enabled only when container_hash::is_unordered_range<T>::value is truetemplate<classT>std::size_thash_value(Tconst&v);// Enabled only when container_hash::is_described_class<T>::value is truetemplate<classT>std::size_thash_value(Tconst&v);template<classT>std::size_thash_value(std::shared_ptr<T>const&v);template<classT,classD>std::size_thash_value(std::unique_ptr<T,D>const&v);std::size_thash_value(std::type_indexconst&v);std::size_thash_value(std::error_codeconst&v);std::size_thash_value(std::error_conditionconst&v);template<classT>std::size_thash_value(std::optional<T>const&v);std::size_thash_value(std::monostatev);template<class...T>std::size_thash_value(std::variant<T...>const&v);}// namespace boosttemplate<classT>structhash{std::size_toperator()(Tconst&v)const;};std::size_toperator()(Tconst&v)const;hash_value(v).
Only throws ifhash_value(v) throws.
The call tohash_value is unqualified, so that user-suppliedoverloads will be found via argument dependent lookup.
template<classT>voidhash_combine(std::size_t&seed,Tconst&v);Called repeatedly to incrementally create a hash value from several variables.
Updatesseed with a new hash value generated bydeterministically combining it with the result ofboost::hash<T>()(v).
Only throws ifboost::hash<T>()(v) throws. On exception,seed is not updated.
Equivalent toseed = combine(seed, boost::hash<T>()(v)),wherecombine(s, v) is a mixing function that takes two arguments oftypestd::size_t and returnsstd::size_t, with the following desirableproperties:
For a constants, whenv takes all possiblesize_t values,combine(s, v) should also take all possiblesize_t values, producinga sequence that is close to random; that is, it should be a randompermutation.
This guarantees that for a givenseed,combine does not introducehash collisions when none were produced byboost::hash<T>(v); that is,it does not lose information from the input. It also implies thatcombine(s, v), as a function ofv, has good avalanche properties;that is, small (e.g. single bit) perturbations in the inputv lead tolarge perturbations in the return value (half of the output bits changing,on average).
For two different seedss1 ands2,combine(s1, v) andcombine(s2, v), treated as functions ofv, should produce twodifferent random permutations.
combine(0, 0) should not be 0. Since a common initial value ofseedis zero,combine(0, 0) == 0 would imply that applyinghash_combine onany sequence of zeroes, regardless of length, will produce zero. This isundesirable, as it would lead to e.g.std::vector<int>() andstd::vector<int>(4) to have the same hash value.
The current implementation uses the functionmix(s + 0x9e3779b9 + v) ascombine(s, v), wheremix(x) is a high quality mixing function that is abijection over thestd::size_t values, of the form
x^=x>>k1;x*=m1;x^=x>>k2;x*=m2;x^=x>>k3;where the constantsk1,k2,k3,m1,m2 are suitably chosen.
Note thatmix(0) is 0. This is why we add the arbitrary constant0x9e3779b9 to meet the third requirement above.
template<classIt>voidhash_range(std::size_t&seed,Itfirst,Itlast);Whentypename std::iterator_traits<It>::value_type is notchar,signed char,unsigned char,std::byte, orchar8_t,
for(;first!=last;++first){boost::hash_combine<typenamestd::iterator_traits<It>::value_type>(seed,*first);}Otherwise, bytes from[first, last) are coalesced and hashed in anunspecified manner. This is done in order to improve performance when hashingstrings.
For chars, the current implementation usesmulxp1_hash whenstd::size_t is64 bit, andmulxp1_hash32 when it’s 32 bit.
template<classIt>std::size_thash_range(Itfirst,Itlast);size_tseed=0;boost::hash_range(seed,first,last);returnseed;template<classIt>voidhash_unordered_range(std::size_t&seed,Itfirst,Itlast);Updatesseed with the values ofboost::hash<typename std::iterator_traits<It>::value_type>()(*i)for eachi in[first, last), such that the order of elements doesnot affect the final result.
template<classIt>std::size_thash_unordered_range(Itfirst,Itlast);size_tseed=0;boost::hash_unordered_range(seed,first,last);returnseed;// Enabled only when T is an integral typetemplate<classT>std::size_thash_value(Tv);When the value ofv fits intostd::size_t, whenT is an unsigned type,or intossize_t, whenT is a signed type,static_cast<std::size_t>(v).
Otherwise, an unspecified value obtained by mixing the value bits ofv.
// Enabled only when T is an enumeration typetemplate<classT>std::size_thash_value(Tv);static_cast<std::size_t>(v).
hash_value(std::to_underlying(v)) would be better, but C++03compatibility mandates the current implementation.
// Enabled only when T is a floating point typetemplate<classT>std::size_thash_value(Tv);An unspecified value obtained by mixing the value bits ofv.
Whensizeof(v) <= sizeof(std::size_t), the bits ofv are returnedas-is (except in the case of -0.0, which is treated as +0.0).
template<classT>std::size_thash_value(T*const&v);An unspecified value derived fromreinterpret_cast<std::uintptr_t>(v).
template<classT,std::size_tN>std::size_thash_value(Tconst(&v)[N]);boost::hash_range( v, v + N ).
template<classT>std::size_thash_value(std::complex<T>const&v);An unspecified value derived fromboost::hash<T>()(v.real()) andboost::hash<T>()(v.imag()) such that, ifv.imag() == 0, the valueis equal toboost::hash<T>()(v.real()).
A more straightforward implementation would just have usedhash_combineonv.real() andv.imag(), but the historical guarantee that real-valuedcomplex numbers should match the hash value of their real part precludes it.
This guarantee may be dropped in a future release, as it’s of questionableutility.
template<classA,classB>std::size_thash_value(std::pair<A,B>const&v);std::size_tseed=0;boost::hash_combine(seed,v.first);boost::hash_combine(seed,v.second);returnseed;// Enabled only when container_hash::is_tuple_like<T>::value is truetemplate<classT>std::size_thash_value(Tconst&v);std::size_tseed=0;usingstd::get;boost::hash_combine(seed,get<0>(v));boost::hash_combine(seed,get<1>(v));// ...boost::hash_combine(seed,get<N-1>(v));returnseed;whereN isstd::tuple_size<T>::value.
This overload is only enabled whencontainer_hash::is_range<T>::value isfalse.
// Enabled only when container_hash::is_range<T>::value is truetemplate<classT>std::size_thash_value(Tconst&v);boost::hash_range( v.begin(), v.end() ).
This overload is only enabled whencontainer_hash::is_contiguous_range<T>::value andcontainer_hash::is_unordered_range<T>::value are bothfalse.
It handles all standard containers that aren’t contiguous or unordered, suchasstd::deque,std::list,std::set,std::map.
// Enabled only when container_hash::is_contiguous_range<T>::value is truetemplate<classT>std::size_thash_value(Tconst&v);boost::hash_range( v.data(), v.data() + v.size() ).
This overload handles all standard contiguous containers, such asstd::string,std::vector,std::array,std::string_view.
// Enabled only when container_hash::is_unordered_range<T>::value is truetemplate<classT>std::size_thash_value(Tconst&v);boost::hash_unordered_range( v.begin(), v.end() ).
This overload handles the standard unordered containers, such asstd::unordered_set andstd::unordered_map.
// Enabled only when container_hash::is_described_class<T>::value is truetemplate<classT>std::size_thash_value(Tconst&v);std::size_tseed=0;boost::hash_combine(seed,b1);boost::hash_combine(seed,b2);// ...boost::hash_combine(seed,bM);boost::hash_combine(seed,m1);boost::hash_combine(seed,m2);// ...boost::hash_combine(seed,mN);returnseed;wherebi are the bases ofv andmi are its members.
template<classT>std::size_thash_value(std::shared_ptr<T>const&v);template<classT,classD>std::size_thash_value(std::unique_ptr<T,D>const&v);boost::hash<T*>( v.get() ).
std::size_thash_value(std::type_indexconst&v);v.hash_code().
std::size_thash_value(std::error_codeconst&v);std::size_thash_value(std::error_conditionconst&v);std::size_tseed=0;boost::hash_combine(seed,v.value());boost::hash_combine(seed,&v.category());returnseed;template<classT>std::size_thash_value(std::optional<T>const&v);For a disengagedv, an unspecified constant value; otherwise,boost::hash<T>()( *v ).
std::size_thash_value(std::monostatev);An unspecified constant value.
template<class...T>std::size_thash_value(std::variant<T...>const&v);std::size_tseed=0;boost::hash_combine(seed,v.index());boost::hash_combine(seed,x);returnseed;wherex is the currently contained value inv.
std::bad_variant_access whenv.valueless_by_exception() istrue.
Defines the traitboost::hash_is_avalanching.
namespaceboost{template<classHash>structhash_is_avalanching;}// namespace boosttemplate<classHash>structhash_is_avalanching{staticconstexprboolvalue=/* see below */;};hash_is_avalanching<Hash>::value is:
false ifHash::is_avalanching is not present,
Hash::is_avalanching::value if this is present and convertible at compile time to abool,
true ifHash::is_avalanching isvoid (this usage is deprecated),
ill-formed otherwise.
A hash function is said to have theavalanching property if small changesin the input translate to large changes in the returned hash code—ideally, flipping one bit in the representation of the input value resultsin each bit of the hash code flipping with probability 50%. Librariessuch asBoost.Unordered consult this traitto determine if the supplied hash function is of high quality.boost::hash forstd::basic_string<Ch> andstd::basic_string_view<Ch>has this trait set totrue whenCh is an integral type (this includesstd::string andstd::string_view, among others).Users can set this trait for a particularHash type by:
Inserting the nestedis_avalanching typedef in the class definitionif they have access to its source code.
Writing a specialization ofboost::hash_is_avalanchingforHash.
Note that usage of this trait is not restricted to hash functions producedwith Boost.ContainerHash.
Defines the traitboost::container_hash::is_range.
namespaceboost{namespacecontainer_hash{template<classT>structis_range;}// namespace container_hash}// namespace boosttemplate<classT>structis_range{staticconstexprboolvalue=/* see below */;};is_range<T>::value istrue when, for a const valuex of typeT,x.begin() andx.end() return iterators of the same typeIt (such thatstd::iterator_traits<It> is a valid specialization.)
Users are allowed to specializeis_range for their types if thedefault behavior does not deduce the correct value.
Defines the traitboost::container_hash::is_contiguous_range.
namespaceboost{namespacecontainer_hash{template<classT>structis_contiguous_range;}// namespace container_hash}// namespace boosttemplate<classT>structis_contiguous_range{staticconstexprboolvalue=/* see below */;};is_contiguous_range<T>::value istrue whenis_range<T>::value istrue and when, for a const valuex of typeT,x.data() returnsa pointer to a type that matches thevalue_type of the iterator returnedbyx.begin() andx.end(), andx.size() returns a value of an integraltype.
Users are allowed to specializeis_contiguous_range for their typesif the default behavior does not deduce the correct value.
Defines the traitboost::container_hash::is_unordered_range.
namespaceboost{namespacecontainer_hash{template<classT>structis_unordered_range;}// namespace container_hash}// namespace boosttemplate<classT>structis_unordered_range{staticconstexprboolvalue=/* see below */;};is_unordered_range<T>::value istrue whenis_range<T>::value istrue and whenT::hasher is a valid type.
Users are allowed to specializeis_unordered_range for their typesif the default behavior does not deduce the correct value.
Defines the traitboost::container_hash::is_described_class.
namespaceboost{namespacecontainer_hash{template<classT>structis_described_class;}// namespace container_hash}// namespace boosttemplate<classT>structis_described_class{staticconstexprboolvalue=/* see below */;};is_described_class<T>::value istrue whenboost::describe::has_describe_bases<T>::value istrue,boost::describe::has_describe_members<T>::value istrue, andT is not a union.
Users are allowed to specializeis_described_class for their typesif the default behavior does not deduce the correct value.
Defines the traitboost::container_hash::is_tuple_like.
namespaceboost{namespacecontainer_hash{template<classT>structis_tuple_like;}// namespace container_hash}// namespace boosttemplate<classT>structis_tuple_like{staticconstexprboolvalue=/* see below */;};is_tuple_like<T>::value istrue whenstd::tuple_size<T>::valueis valid.
Users are allowed to specializeis_tuple_like for their typesif the default behavior does not deduce the correct value.
Many hash functions strive to have little correlation between the input andoutput values. They attempt to uniformally distribute the output values forvery similar inputs. This hash function makes no such attempt. In fact, forintegers, the result of the hash function is often just the input value. Sosimilar but different input values will often result in similar but differentoutput values. This means that it is not appropriate as a general hashfunction. For example, a hash table may discard bits from the hash functionresulting in likely collisions, or might have poor collision resolution whenhash values are clustered together. In such cases this hash function willperform poorly.
But the standard has no such requirement for the hash function, it justrequires that the hashes of two different values are unlikely to collide.Containers or algorithms designed to work with the standard hash function willhave to be implemented to work well when the hash function’s output iscorrelated to its input. Since they are paying that cost a higher quality hashfunction would be wasteful.
The way one customizes the standardstd::hash function object for usertypes is via a specialization.boost::hash chooses a different mechanism — an overload of a free functionhash_value in the user namespace that isfound via argument-dependent lookup.
Both approaches have their pros and cons. Specializing the function objectis stricter in that it only applies to the exact type, and not to derivedor convertible types. Defining a function, on the other hand, is easierand more convenient, as it can be done directly in the type definition asaninlinefriend.
The fact that overloads can be invoked via conversions did cause issues inan earlier iteration of the library that definedhash_value for allintegral types separately, includingbool. Especially under C++03,which doesn’t haveexplicit conversion operators, some types wereconvertible tobool to allow their being tested in e.g.if statements,which caused them to hash to 0 or 1, rarely what one expects or wants.
This, however, was fixed by declaring the built-inhash_value overloadsto be templates constrained on e.g.std::is_integral or its moralequivalent. This causes types convertible to an integral to no longermatch, avoiding the problem.
In general, the library does not promise that the hash values will staythe same from release to release (otherwise improvements would beimpossible). However, historically values have been quite stable. Beforerelease 1.81, the previous changes have been in 1.56 (a betterhash_combine) and 1.78 (macOS-specific change tohash_combine.)
Code should generally not depend on specific hash values, but for thosewilling to take the risk of occasional breaks due to hash value changes,the library now has a test that checks hash values for a number of typesagainst reference values (test/hash_reference_values.cpp),whoseversion historycan be used as a rough guide to when hash values have changed, and for whattypes.
The initial implementation of the library was based on Issue 6.18 of theLibrary Extension Technical Report Issues List(pages 63-67) which proposed the following implementation ofhash_combine:
template<classT>voidhash_combine(size_t&seed,Tconst&v){seed^=hash_value(v)+(seed<<6)+(seed>>2);}taken from the paper"Methods for Identifying Versioned and Plagiarised Documents"by Timothy C. Hoad and Justin Zobel.
During the Boost formal review, Dave Harris pointed out that this suffersfrom the so-called "zero trap"; ifseed is initially 0, and all theinputs are 0 (or hash to 0),seed remains 0 no matter how many inputvalues are combined.
This is an undesirable property, because it causes containers of zeroesto have a zero hash value regardless of their sizes.
To fix this, the arbitrary constant0x9e3779b9 (the golden ratio in a32 bit fixed point representation) was added to the computation, yielding
template<classT>voidhash_combine(size_t&seed,Tconst&v){seed^=hash_value(v)+0x9e3779b9+(seed<<6)+(seed>>2);}This is what shipped in Boost 1.33, the first release containing the library.
This function was a reasonable compromise between quality and speed for itstime, when the input consisted ofchars, but it’s less suitable forcombining arbitrarysize_t inputs.
In Boost 1.56, it was replaced by functions derived from Austin Appleby’sMurmurHash2 hash function round.
In Boost 1.81, it was changed again — to the equivalent ofmix(seed + 0x9e3779b9 + hash_value(v)), wheremix(x) is a high qualitymixing function that is a bijection over thesize_t values, of the form
x^=x>>k1;x*=m1;x^=x>>k2;x*=m2;x^=x>>k3;This type of mixing function was originally devised by Austin Appleby asthe "final mix" part of his MurmurHash3 hash function. He used
x^=x>>33;x*=0xff51afd7ed558ccd;x^=x>>33;x*=0xc4ceb9fe1a85ec53;x^=x>>33;as the64 bit functionfmix64 and
x^=x>>16;x*=0x85ebca6b;x^=x>>13;x*=0xc2b2ae35;x^=x>>16;as the32 bit functionfmix32.
Several improvements of the 64 bit function have been subsequently proposed,byDavid Stafford,Pelle Evensen,andJon Maiga. We currently use JonMaiga’s function
x^=x>>32;x*=0xe9846af9b1a615d;x^=x>>32;x*=0xe9846af9b1a615d;x^=x>>28;Under 32 bit, we use a mixing function proposed by "TheIronBorn" in aGithub issue intherepository ofHash Prospector by Chris Wellons:
x^=x>>16;x*=0x21f0aaad;x^=x>>15;x*=0x735a2d97;x^=x>>15;With this improvedhash_combine,boost::hash for strings now passes theSMHasher test suite by Austin Appleby(for a 64 bitsize_t).
The traditional implementation ofhash_range(seed, first, last) has been
for(;first!=last;++first){boost::hash_combine<typenamestd::iterator_traits<It>::value_type>(seed,*first);}(the explicit template parameter is needed to support iterators with proxyreturn types such asstd::vector<bool>::iterator.)
This is logical, consistent and straightforward. In the common case wheretypename std::iterator_traits<It>::value_type ischar — which it isin the common case ofboost::hash<std::string> — this however leaves alot of performance on the table, because processing eachchar individuallyis much less efficient than processing several in bulk.
In Boost 1.81,hash_range was changed to process elements of typechar,signed char,unsigned char,std::byte, orchar8_t, four of a time.Auint32_t is composed fromfirst[0] tofirst[3], and thatuint32_tis fed tohash_combine.
In Boost 1.82,hash_range for these types was changed to usemulxp1_hash. This improves bothquality and speed of string hashing.
Note thathash_range has also traditionally guaranteed that the same elementsequence yields the same hash value regardless of the iterator type. Thisproperty remains valid after the changes tochar range hashing.hash_range,applied to thechar sequence{ 'a', 'b', 'c' }, results in the same valuewhether the sequence comes fromchar[3],std::string,std::deque<char>,orstd::list<char>.
A Proposal to Add Hash Tables to the Standard Library
http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2003/n1456.html
The hash table proposal explains much of the design. The hash function object is discussed in Section D.
The C++ Standard Library Technical Report
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1836.pdf
Contains the hash function specification in section 6.3.2.
Library Extension Technical Report Issues List
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1837.pdf
The library implements the extension described in Issue 6.18, pages 63-67.
Methods for Identifying Versioned and Plagiarised Documents
Timothy C. Hoad, Justin Zobel
https://people.eng.unimelb.edu.au/jzobel/fulltext/jasist03thz.pdf
Contains the hash function that the initial implementation ofboost::hash_combine was based on.
Performance in Practice of String Hashing Functions
M.V. Ramakrishna, J. Zobel
In Proc. Int. Conf. on Database Systems for Advanced Applications, pages 215-223, Melbourne, Australia, April 1997.
https://www.comp.nus.edu.sg/~lingtw/dasfaa_proceedings/DASFAA97/P215.pdf
Referenced in the above paper as the source of the hash function.
MurmurHash3 hash function source
Austin Appleby
https://github.com/aappleby/smhasher/blob/61a0530f28277f2e850bfc39600ce61d02b518de/src/MurmurHash3.cpp#L65-L90
Austin Appleby’s 32 and 64 bit finalization mixing functions thatintroduced the "xmxmx" general form of a high quality bijectivetransformation that approximates a random permutation.
SMHasher hash function test suite
Austin Appleby
https://github.com/aappleby/smhasher
Contains a battery of tests for evaluating hash functions. The current64 bit implementation ofboost::hash for strings passes SMHasher.Previous iterations did not.
Better Bit Mixing - Improving on MurmurHash3’s 64-bit Finalizer
David Stafford
https://zimbry.blogspot.com/2011/09/better-bit-mixing-improving-on.html
Describes the so-called "variant 13" mixing function, an improvementoverfmix64 from MurmurHash3, made famous by its adoption by thesplitmix64random number generator.
Stronger, better, morer, Moremur; a better Murmur3-type mixer
Pelle Evensen
https://mostlymangling.blogspot.com/2019/12/stronger-better-morer-moremur-better.html
Describes Moremur, an improvement over MurmurHash3fmix64 and Stafford"variant 13".
Improved mx3 and the RRC test
Jon Maiga
http://jonkagstrom.com/mx3/mx3_rev2.html
Contains another improvement over MurmurHash3fmix64 and "variant 13". Thisis what the current implementation ofboost::hash_combine uses whenstd::size_t is 64 bits.
Prospecting for Hash Functions
Chris Wellons
https://nullprogram.com/blog/2018/07/31/
DescribesHash Prospector,a utility for discovering and evaluating mixing functions.
New best known functions
"TheIronBorn"
https://github.com/skeeto/hash-prospector/issues/19
Describes a good 32 bit mixing function, used by the current implementationofboost::hash_combine whenstd::size_t is 32 bits.
This library is based on the design by Peter Dimov. During the initial development Joaquín M López Muñoz made many useful suggestions and contributed fixes.
The formal review was managed by Thorsten Ottosen, and the library reviewed by: David Abrahams, Alberto Barbati, Topher Cooper, Caleb Epstein, Dave Harris, Chris Jefferson, Bronek Kozicki, John Maddock, Tobias Swinger, Jaap Suter, Rob Stewart and Pavel Vozenilek. Since then, further constructive criticism has been made by Daniel Krügler, Alexander Nasonov and 沈慧峰.
The implementation of the hash function for pointers is based on suggestions made by Alberto Barbati and Dave Harris. Dave Harris also suggested an important improvement toboost::hash_combine that was taken up.
Some useful improvements to the floating point hash algorithm were suggested by Daniel Krügler.
The original implementation came from Jeremy B. Maitin-Shepard’s hash table library, although this is a complete rewrite.
The documentation was converted from Quickbook to AsciiDoc by Christian Mazakas.
Moved library into its own module,container_hash.
Moved headers for new module name, now at:<boost/container_hash/hash.hpp>,<boost/container_hash/hash_fwd.hpp>,<boost/container_hash/extensions.hpp>.
Added forwarding headers to support the old headers locations.
Supportstd::string_view,std::error_code,std::error_condition,std::optional,std::variant,std::monostate where available.
Update include paths from other Boost libraries.
Manually write out tuple overloads, rather than using the preprocessor to generate them. Should improve usability, due to better error messages, and easier debugging.
Fix tutorial example (#11017).
Quick fix for hashingvector<bool> when using libc++. Will try to introduce a more general fix in the next release.
Avoid float comparison warning when using Clang - this workaround was already in place for GCC, and was used when Clang pretends to be GCC, but the warning was appearing when running Clang in other contexts.
Support forchar16_t,char32_t,u16string,u32string
Fix for recent versions of Visual C++ which have removedstd::unary_function andstd::binary_function (#12353).
Fixed some warnings.
Only define hash forstd::wstring when we know we have awchar_t. Otherwise there’s a compile error as there’s no overload for hashing the characters in wide strings (#8552).
Fixed strict aliasing violation (GitHub #3).
Removed some Visual C++ 6 workarounds.
Ongoing work on improvinghash_combine. This changes the combine function which was previously defined in the reference documentation.
Ticket 7957: Fixed a typo.
Add support forboost::int128_type andboost::uint128_type where available - currently only__int128 andunsigned __int128 on some versions of gcc.
On platforms that are known to have the standard floating point functions, don’t use automatic detection - which can break if there are ambiguous overloads.
Fix undefined behaviour when using the binaryfloat hash (Thomas Heller).
Restoreenum support, which was accidentally removed in the last version.
New floating point hasher - will hash the binary representation on more platforms, which should be faster.
Support the standard smart pointers.
hash_value now implemented using SFINAE to avoid implicit casts to built in types when calling it.
Updated to use the new config macros.
Ticket 6771: Avoid gcc’s-Wfloat-equal warning.
Ticket 6806: Supportstd::array andstd::tuple when available.
Add deprecation warning to the long deprecatedboost/container_hash/detail/container_fwd.hpp.
Avoid warning due with gcc’s-Wconversion flag.
Add option to prevent implicit conversions when callinghash_value by definingBOOST_HASH_NO_IMPLICIT_CASTS. When usingboost::hash for a type that does not havehash_value declared but does have an implicit conversion to a type that does, it would use that implicit conversion to hash it. Which can sometimes go very wrong, e.g. using a conversion tobool and only hashing to 2 possible values. Since fixing this is a breaking change and was only approached quite late in the release cycle with little discussion it’s opt-in for now. This, or something like it, will become the default in a future version.
Ticket 3866: Don’t foward declare containers when using gcc’s parallel library, allow user to stop forward declaration by defining theBOOST_DETAIL_NO_CONTAINER_FWD macro.
Ticket 4038: Avoid hashing0.5 and0 to the same number.
Stop using deprecatedBOOST_HAS_* macros.
Reduce the number of warnings for Visual C++ warning level 4.
Some code formatting changes to fit lines into 80 characters.
Rename an internal namespace.
Automatically configure thefloat functions using template metaprogramming instead of trying to configure every possibility manually.
Workaround for when STLport doesn’t support long double.
Move thehash_fwd.hpp implementation into the hash subdirectory, leaving a forwarding header in the old location. You should still use the old location, the new location is mainly for implementation and possible modularization.
Ticket 2412: Removed deprecated headers.
Ticket 2957: Fix configuration for vxworks.
Changed the warnings in the deprecated headers from 1.34.0 to errors. These will be removed in a future version of Boost.
Moved detail headers out ofboost/container_hash/detail, since they are part offunctional/hash, notcontainer_hash.boost/container_hash/detail/container_fwd.hpp has been moved toboost/detail/container_fwd.hpp as it’s used outside of this library, the others have been moved toboost/functional/hash/detail.
Ticket 2264: In Visual C++, always use C99 float functions for long double and float as the C++ overloads aren’t always availables.
Stop using OpenBSD’s dodgystd::numeric_limits.
Using the boost typedefs forlong long andunsigned long long.
Move the extensions into their own header.
Support forlong long,std::complex.
Improved algorithm for hashing floating point numbers:
Improved portablity, as described by Daniel Krügler ina post to the boost users list.
Fits more information into each combine loop, which can reduce the the number of times combine is called and hopefully give a better quality hash function.
Improved the algorithm for hashing floating point numbers.
On Cygwin use a binary hash function for floating point numbers, as Cygwin doesn’t have decent floating point functions forlong double.
Never usesfpclass which doesn’t supportlong double.
Ticket 1064: Removed unnecessary use of errno.
Explicitly overload for more built in types.
Minor improvements to the documentation.
A few bug and warning fixes:
Ticket 1509: Suppress another Visual C++ warning.
Some workarounds for the Sun compilers.
Ticket 952: Suppress incorrect 64-bit warning on Visual C++.
Use declarations for standard classes, so that the library doesn’t need to include all of their headers
Deprecated the<boost/functional/hash/*.hpp> headers. Now a single header,<boost/functional/hash.hpp> is used.
Add support for theBOOST_HASH_NO_EXTENSIONS macro, which disables the extensions to TR1.
Minor improvements to the hash functions for floating point numbers.
Update the portable example to hopefully be more generally portable.
Fixed the points example, as pointed out by 沈慧峰.
Initial Release
This documentation is
Copyright 2005-2008 Daniel James
Copyright 2022 Peter Dimov
and is distributed under theBoost Software License, Version 1.0.