C++ Hashmap Benchmark

The latest hashmap performance comparison results have been released, courtesy of Martin Leitner-Ankerl, the developer of ankerl::unordered_dense::map. A previous round of testing was conducted in 2019, and this update provides comprehensive test data, serving as a valuable reference for selecting high-performance hashmaps.

I. Hashmap Performance Comparison

Test Dimensions

Data Types: Divided into two categories: Numbers (numeric types) and Strings (string types).
Test Operations: Covers common operations such as Insert, Random Insert, Iterate, Find, and Copy.
Specialized Find Tests: Evaluates performance under different selection rates and insert-find frequency ratios.
Additional Comparison: Includes memory usage analysis, with overall results calculated using arithmetic mean.

Core Test Results

ankerl::unordered_dense::map: Delivers the best overall performance (top arithmetic mean across all operations). It boasts unrivaled performance in Copy and Iterate operations, with excellent Find performance. It adapts well to different hash functions with minimal performance variance. Inherited from robin_hood (by the same author), it outperforms the robin_hood series.
gtl::flat_hash_map / gtl::parallel_flat_hash_map: Commonly known in the open-source community as phmap::flat_hash_map and phmap::parallel_flat_hash_map (for detailed introduction to phmap, refer to: https://byronhe.com/post/2020/11/10/parallel-hashmap-btree-fast-multi-thread-intro/). gtl::parallel_flat_hash_map is the parallel variant, derived from Google Abseil’s absl::flat_hash_map, offering excellent performance. The repository includes both btree and hash container implementations, provided as header-only for easy integration. gtl::flat_hash_map achieved the best performance in the "Find 1 – 500k uint64_t" test set.
emhash8::HashMap / emhash7::HashMap: Rank second only to the ankerl series, demonstrating outstanding performance.
folly::F14NodeMap / folly::F14ValueMap: Delivers excellent performance but comes with heavy dependencies on the folly library. Recommended if folly is already integrated into the project.
absl::flat_hash_map: Implemented by Google Abseil, it belongs to the top performance tier. It excels in handling large maps and string lookup scenarios but performs relatively slower in Copy and Iterate operations.
boost::unordered_map: Offers lookup performance twice as fast as std::unordered_map, making it a direct replacement for the standard library version. However, it has higher memory usage and average insert performance.
std::unordered_map: Ranks the lowest in performance, serving only as a baseline reference.

II. Hash Function Comparison

Hash function tests cover two scenarios: Integral types and String types, with core conclusions as follows:

Integral Types Scenario:
- std::hash and boost::hash directly use the original value as the hash result, leading to highly variable performance dependent on input data—not recommended.
- ankerl::unordered_dense::hash: Similar in logic to absl::hash (both use wyhash), but differs in integral type processing (applies XOR to input), delivering excellent performance.
- robin_hood::hash: Based on murmurhash3, offering solid performance.
- mumx: Adopts the same integral type processing as ankerl::unordered_dense::hash, ensuring stable performance.
String Types Scenario:
- std::hash: Delivers good performance—recommended.
- boost::hash: Performs poorly—not recommended.
- ankerl::unordered_dense::hash/absl::hash: Both based on wyhash, achieving excellent performance.
- robin_hood::hash: Based on MurmurHash2, with good performance.
- mumx: Uses the same string processing as std::hash, ensuring strong compatibility.

All conclusions above are referenced from: https://martin.ankerl.com/2022/08/27/hashmap-bench-01/#ankerl__unordered_dense__map

III. In-depth Analysis of Top-Performing Containers

Both containers below are provided as header-only, enabling direct integration as third-party headers with strong adaptability.

(I) Ankerl::unordered_dense::map

GitHub Repository: https://github.com/martinus/unordered_dense
Core Implementation: Based on Robin hood probing.

1. Design Architecture

Hash Function: Uses ankerl::unordered_dense::hash; string types default to wyhash for high execution efficiency.
Core Structures: Consists of two key arrays—
- Vector Array: Stores pair{key, value} key-value pairs.
- Bucket Array: A collision resolution solution tailored for Robin hood hashing, with the following structure:
```
struct Bucket {
    uint32_t dist_and_fingerprint; // Top 3 bytes store distance, bottom 1 byte stores fingerprint
    uint32_t value_idx; // Stores the index of data in the vector array
};
```
Key Optimization: The Bucket array is partially sorted (in descending order) and features a contiguous memory layout, ensuring excellent cache-locality.

2. Core Operation Implementations

Insert:
1. Compute the hash value of the key and shift to get the bucket_index.
2. Insert directly if the target bucket is empty.
3. If a collision occurs, compare dist_and_fingerprint and place the element with the larger distance first.
4. Move bucket_index backward sequentially and update dist_and_fingerprint ( dist_and_fingerprint + 1UL<<8 ).
5. Insert data into the vector array first, then update value_idx in the Bucket array (Bucket indices are relative positions).
Find:
1. Compute the hash value of the key and shift according to the Bucket array length to get the bucket_index.
2. Compare dist_and_fingerprint of the target bucket; if matched, retrieve the key from the vector via value_idx and compare.
3. Return the corresponding value if keys match; otherwise, move backward to continue searching until dist_and_fingerprint > bucket.dist_and_fingerprint.
4. Core Advantage: Find operations only require a limited sequential search of the Bucket array, fully leveraging cache locality for excellent performance.
Iterate: Simply traverse the vector array—logically simple with exceptional performance.

(II) emhash8::HashMap

GitHub Repository: https://github.com/ktprime/emhash
Core Positioning: The "masterpiece" of linear probing.

1. Design Architecture

Hash Function: Uses murmurhash3mixer for integral types and wyhash for string types.
Key Optimization: Traditional open addressing is prone to Primary Clustering and Secondary Clustering. emhash8 resolves this via "3-way combined probing" technology, maintaining high performance even with a load factor > 0.9.

2. Key Design Details

Core Data Structures:

struct Index
{
    size_type bucket; // Stores collision position
    size_type slot;   // Stores actual data position
};
typedef std::pair<KeyT, ValueT> value_type; // Key-value pair storage structure

3-way Combined Probing:
1. Within 2-3 CPU cache lines: Use linear probing.
2. Beyond cache lines: Use limited quadratic probing.
3. Record the last empty position _last, and calculate the empty slot via the formula (_mask / 2 + _last++) & _mask.
Smart Collision Resolution: Collision nodes are linked via the bucket field in Index ( bucket stores the index of the next collision node), fundamentally avoiding performance degradation caused by clustering.

I. Hashmap Performance Comparison​

Test Dimensions​

Core Test Results​

II. Hash Function Comparison​

III. In-depth Analysis of Top-Performing Containers​

(I) Ankerl::unordered_dense::map​

1. Design Architecture​

2. Core Operation Implementations​

(II) emhash8::HashMap​

1. Design Architecture​

2. Key Design Details​

I. Hashmap Performance Comparison

Test Dimensions

Core Test Results

II. Hash Function Comparison

III. In-depth Analysis of Top-Performing Containers

(I) Ankerl::unordered_dense::map

1. Design Architecture

2. Core Operation Implementations

(II) emhash8::HashMap

1. Design Architecture

2. Key Design Details