Hash Function(s) in C++

Class std::hash<> and its Member std::hash<>::hash()*

(From Geeks for Geeks)

https://www.geeksforgeeks.org/cpp/stdhash-class-in-c-stl/

In C++, the hash class is default constructible function class (functor) that provides the default hash function used by STL. It is used to get the hash value of the argument that is being passed to it. If the argument doesn't change, the value doesn't change either.

Let's take a quick look at an example:

#include <bits/stdc++.h>
using namespace std;

int main() {
    string s = "geeksforgeeks";

    hash<string> h;

    // Get the hash value of character
    cout << h(s);

    return 0;

Output

5146686323530302118

Syntax of hash Function Class

The hash function class is defined as std::hash class template in <functional> header file.

hash<T> h;      // Create hash function for type T
h(val);               // Generate the hash value

where, h is the name of object by which we can access the member function of hash class.

Parameters:

  • val: Data whose hash value to be generate.
  • T: Type of val.

Return Value: a hash value of given arguments.

Examples of hash Function Class

The following examples demonstrates the use of hash class in different scenarios:

Generate the Hash Value of a Bitset

#include <bits/stdc++.h>
using namespace std; ​

int main() {

    bitset<5> b("10101"); ​

    // Create an object of hash class
    hash<bitset<5>> h; ​

    // Generate the hash value of given bitset
    cout << h(b);

    return 0;

}

Output

17123654466142159564

Generate the Hash Value of Vector

#include <bits/stdc++.h>
using namespace std; ​

int main() {

    vector<bool> v = {true, false}; ​

    // Create an object of hash class
    hash<vector<bool>> h; ​

    // Generate the hash value of given vector
    cout << h(v) << endl;

    return 0;
}

Output:

4334672815104069193

Generate the Hash Value of Custom Data Type

To generate hash value for custom data type, a hash function for that type has to be defined manually. It can be done by specializing the hash class template to the given data type.

#include <bits/stdc++.h>
using namespace std; ​

struct A {
  int a;
  A(int x = 0): a(x) {}
}; ​

// Specialization of hash for A type
namespace std {
  template <>
  struct hash<A> {
    size_t operator()(const A& p) const {
      return hash<int>{}(p.a);
    }
  };
} ​

int main() {

  A obj(11);

  // Creating hash function class instance for
  // A type:
  hash<A> h;

  cout << h(obj);

  return 0;

}

Output

11


            


            


            


            


            


          


          


          


          


          


          


          


          

Custom Hash Functions for C++ Unordered Containers

Motivation and Background

C++ unordered containers (e.g. unordered_map, unordered_set, etc.) uses hashing to store objects. The STL provides hash functions for commonly used types, like string and numeric values such as int, meaning that we won’t have to provide any hash functions explicitly when creating an unordered container instance.

However, when we want to hold a more complex type, or even a user-defined class, C++ unordered containers would fail to find a matching hash function. We will have to explicitly define a hash function for the type (e.g. pair<string, int>) that we want to use as a template argument for some unordered container.

Implementation Idea: Exclusive-Or

According to the book A Tour of C++ by Bjarne Stroustrup, A hash function is often provided as a function object. And creating a new hash function by combining existing functions using exclusive-or (^) is simple and often very effective.

Intuitively, this means that if we want to hash a pair of string and integer, we could use the following new hash function:

// p is a pair<string, int>
hash<string>{}(p.first) ^ hash<int>{}(p.second);

The next thing we need to do is to pass that as a template argument when creating our unordered container. For an unordered_set, that will be the second argument. For an unordered_map, that'll be the third argument after the key type and the value type. Note that we only need the hashing function for a map's key, not the value.

Method 1: Function Object with Struct

Perhaps the most commonly used method is to create a function object (functor) using a struct, and implement its operator() function. This function is to be qualified as const, and takes a const reference to the custom type and returns a size_t. Defining a function object outside as a standalone struct also gives it the flexibility of generic programming with templates. Meaning that it can work with pairs of different template arguments, not just strings and ints.

Putting this all together, we have:

struct PairHash {
  template <typename T1, typename T2>
  auto operator()(const pair<T1, T2> &p) const -> size_t {
    return hash<T1>{}(p.first) ^ hash<T2>{}(p.second);
  }
};

class Example {
 public:
  void func() {
    unordered_set<pair<string, int>, PairHash> uset;
    unordered_map<pair<double, string>, int , PairHash> umap;
  }
};

Method 2: Custom Lambda Hash Function

Alternatively, we can also implement our custom hash function as a lambda. Some may find this inline implementation more elegant. A drawback is that C++11 lambdas do not support templates. Fortunately, we have “generic lambdas” since C++14 (as seen below).

Another drawback is that we need to use decltype of the lambda as the extra custom hash function argument for our unordered containers, and also provide constructor arguments in parentheses after the declared variable name. Both unordered_set and unordered_map have a constructor that takes an initial number of buckets and a hashing object as inputs. Here, I just arbitrarily let the initial buckets be 10.

class Example {
 public:
  void func() {
    auto pairHash = []<typename T1, typename T2>(const pair<T1, T2> &p) {
      return hash<T1>{}(p.first) ^ hash<T2>{}(p.second);
    };
    unordered_set<pair<int, int>, decltype(pairHash)> seen(10, pairHash);
    unordered_map<pair<double, string>, int , decltype(pairHash)> umap(10, pairHash);
  }
};

Method 3: Defining a Specialization of std::hash()

In this method, we add a specialization of standard-library's hash function to the namespace std. This is quite similar to the first method, except now we must use hash as the name of our function object. We also have to specify the template class for which we are defining the specialization. The immediate advantage is that we do not need to pass in any extra template arguments to our unordered container when declaring an instance.

Here is an example of this method, suppose we want to be able to hash a user-defined class XYZ, which has a member called value:

namespace std {
template <>
struct hash<XYZ> {
  auto operator()(const XYZ &xyz) const -> size_t {
    return hash<XYZ>{}(xyz.value);
  }
};
}  // namespace std

unordered_set<XYZ> xset;

Overloading operator==()

Note that the unordered containers also need the ability to compare two different keys using ==. In previous examples, I have been using std::pair, and the STL already defines operator==() for pairs, so I didn't have to define my own equality logic. However, if we are using our own user-defined class, then we need to define the equality operator.