Skip to main content

Turbo Hash Containers

Turbo provides a number of containers as alternatives to STL containers. These containers generally adhere to the properties of STL containers, though there are typically some associated API differences and/or implementation details that diverge from the standard library.

Turbo containers are designed to be more efficient in general cases; however, in certain scenarios, STL containers may perform better. Unlike some other abstractions provided by Turbo, these containers should not be considered direct drop-in replacements for their STL counterparts due to API and/or contract differences between the two sets of containers. For example, Turbo containers generally do not guarantee pointer stability1 after insertion or deletion.

The Turbo container library defines the following set of containers:

  • Swiss table unordered containers

See below for more information on each container type.

Hash Tables

The Turbo container library includes a number of useful hash tables that comply with the STL container API contract:

  • turbo::flat_hash_map
  • turbo::flat_hash_set
  • turbo::node_hash_map
  • turbo::node_hash_set

Collectively, these hash tables are known as "Swiss tables" and are designed to be replacements for std::unordered_map and std::unordered_set. They provide several advantages over the std::unordered_* containers:

  • Support for heterogeneous lookup.
  • Optimized emplace({key, value}) to avoid allocating a pair in most common cases.
  • Support for heterogeneous std::initializer_list to avoid extra copy construction and insertion.
  • Guaranteed "O(1)" erase methods by returning void instead of an iterator.

Construction

The Swiss table container family supports the same construction and assignment patterns as std::unordered_map:

// Examples using node_hash_set and node_hash_map are equivalent

// Default constructor
// No allocation for the table's elements is made.
turbo::flat_hash_set<std::string> set1;

turbo::flat_hash_map<int, std::string> map1;

// Initializer List constructor
turbo::flat_hash_set<std::string> set2 = {{"huey"}, {"dewey"}, {"louie"},};

turbo::flat_hash_map<int, std::string> map2 =
{{1, "huey"}, {2, "dewey"}, {3, "louie"},};

// Copy constructor
turbo::flat_hash_set<std::string> set3(set2);

turbo::flat_hash_map<int, std::string> map3(map2);

// Copy assignment operator
// Hash functor and Comparator are copied as well
turbo::flat_hash_set<std::string> set4;
set4 = set3;

turbo::flat_hash_map<int, std::string> map4;
map4 = map3;

// Move constructor
// Move is guaranteed efficient
turbo::flat_hash_set<std::string> set5(std::move(set4));

turbo::flat_hash_map<int, std::string> map5(std::move(map4));

// Move assignment operator
// May be efficient if allocators are compatible
turbo::flat_hash_set<std::string> set6;
set6 = std::move(set5);

turbo::flat_hash_map<int, std::string> map6;
map6 = std::move(map5);

// Range constructor
std::vector<std::string> v = {"a", "b"};
turbo::flat_hash_set<std::string> set7(v.begin(), v.end());

std::vector<std::pair<int, std::string>> v = {{1, "a"}, {2, "b"}};
turbo::flat_hash_map<int, std::string> map7(v.begin(), v.end());

turbo::flat_hash_map and turbo::flat_hash_set

turbo::flat_hash_map and turbo::flat_hash_set are the recommended unordered containers for general-purpose use. These are flat data structures that store their value_type directly within a slot array.

Guarantees

  • Keys and values are stored inline.
  • Iterators, references, and pointers to elements are invalidated on rehashing.
  • Move operations do not invalidate iterators or pointers.

Memory Usage

The container uses O((sizeof(std::pair<const K, V>) + 1) * bucket_count()) bytes. The maximum load factor is 87.5%, after which the table's size is doubled (reducing the load factor by a factor of 2). Thus size() typically ranges between 0.4375*bucket_count() and 0.875*bucket_count(). For tables that have never been rehashed, the load factor may be lower, but these numbers are sufficient for our estimates.

Usage Recommendations

Use turbo::flat_hash_map for most use cases. If pointer stability for values (but not keys) is needed, use turbo::flat_hash_map<Key, std::unique_ptr<Value>>.

turbo::node_hash_map and turbo::node_hash_set

These are nearly direct drop-in replacements for std::unordered_map and std::unordered_set. They are useful:

  • When pointer stability for both keys and values is required.
  • For automated migration from std::unordered_map, std::unordered_set, hash_map, or hash_set where it is difficult to determine if the code relies on pointer stability.

These are node-based data structures in the STL standard sense: each value_type is allocated in a separate node, and the main table contains pointers to these nodes.

Guarantees

  • Nodes have stable addresses.
  • Iterators are invalidated on rehashing.
  • Move operations do not invalidate iterators.

Memory Usage

The slot array requires (sizeof(void*) + 1) * bucket_count() bytes, and the nodes themselves require sizeof(value_type) * size() bytes. Combined, this is O(9*bucket_count() + sizeof(std::pair<const K, V>)*size()) on most platforms.

Usage Recommendations

Prefer turbo::flat_hash_map or turbo::flat_hash_set in most new code (see above).

Use turbo::node_hash_map or turbo::node_hash_set when pointer stability for both keys and values is required (rare), or for migrating code from other containers that have this property. Note: Do not use popularity as a guide. You will see node containers used extensively, but this is only because they are a safe target for migrating code from other containers.

Usage

turbo::flat_hash_map<int, string> numbers =
{{1, "one"}, {2, "two"}, {3, "three"}};
numbers.try_emplace(4, "four");

turbo::flat_hash_map<string, std::unique_ptr<string>> strings;
strings.try_emplace("foo", turbo::make_unique<string>("bar"));

Heterogeneous Lookup

Inserting or looking up elements in an associative container requires a key. Generally, containers require the key to be of a specific type, which can lead to inefficient call sites that need to convert between nearly equivalent types (e.g., std::string and turbo::string_view).

std::map<std::string, int> m = ...;
turbo::string_view some_key = ...;
// Construct a temporary `std::string` to do the query.
// The allocation + copy + deallocation might dominate the find() call.
auto it = m.find(std::string(some_key));

To avoid this unnecessary work, Swiss tables provide heterogeneous lookup for conversions to string types (allowing the use of turbo::string_view in lookups, for example), as well as conversions to smart pointer types (std::unique_ptr, std::shared_ptr), via the turbo::Hash hashing framework. (A matching comparator is built into turbo::Hash.)

turbo::flat_hash_map<std::string, int> m = ...;
turbo::string_view some_key = ...;
// We can use string_view directly as the key for the search.
auto it = m.find(some_key);

Unstable Iteration Order

While std::unordered_map does not guarantee iteration order, many implementations happen to have a deterministic order based on keys and their insertion order. This is not the case for turbo::flat_hash_map and turbo::node_hash_map. As a result, converting from std::unordered_map to turbo::flat_hash_map may expose latent bugs where code incorrectly relies on iteration order.

A specific case that can produce subtle bugs is summing float values in an unordered container. While mathematical summation is order-independent, floating-point summation is not, and cases may arise where the sum is deterministic with std::unordered_set but non-deterministic with turbo::flat_hash_set.

Notes

Footnotes

  1. "Pointer stability" means that a pointer to an element remains valid (does not become invalidated) as long as the element exists, allowing code to cache pointers to elements even as the underlying container changes. To say a container has pointer stability is equivalent to saying it does not move elements in memory; their addresses do not change. Pointer stability/invalidation is the same as reference stability/invalidation.