Introduction
Associative containers in C++ are containers that allows fast retrieval of elements based on keys, rather than positions.
- Construction:
std::set
, andstd::map
also have sorting using comparison.std::unordered_set
, andunordered_map
uses a hash function. (O(log n)) - Look up:
std::set
, andstd::map
uses a red-black tree.std::unordered_set
, andunordered_map
uses a hash table (O(1)) - Duplicates: No duplicates are allowed in associative containers except for
std::multiset
,std::unordered_multiset
, andstd::unordered_multimap
In this article, we will examine the below behaviours:
- Insert, assign, emplace
- Hashing
- Remove
- Element Traversal and access
Finally, we will see how to define hash functions for these associative container datatypes.
std::unordered_map
- Insert Or Assign (C++17)
1
2
3
4
5
6
7
8
9
10
#include <iostream>
#include <unordered_map>
int main() {
std::unordered_map<int, std::string> my_map;
my_map.insert_or_assign(1, "apple"); // Insert key 1
my_map.insert_or_assign(2, "banana"); // Insert key 2
my_map.insert_or_assign(1, "avocado"); // Assign new value to key 1
std::cout << "After: " << my_map[1] << "\n"; // Outputs: avocado
}
std::unordered_set
Under the hood, unordered set
is a hash table, it can only store objects with a hash function and equality operator. Their values will be stored as keys. unordered_map
is a hash-table plus a look up.
- Insert: In set and unordered_set, if an element already exists, insert does not do anything:
1
2
3
4
5
6
7
8
9
#include <iostream>
#include <set>
int main() {
std::set<int> my_set;
auto result1 = my_set.insert(42); // First insert
auto result2 = my_set.insert(42); // Second insert
std::cout << "Set size: " << my_set.size() << "\n"; // 1
}
std::set
-
- Insert: In set and unordered_set, if an element already exists, insert does not do anything. See
std::unordered_set
for code snippet.
- Insert: In set and unordered_set, if an element already exists, insert does not do anything. See
std::map
- Erase:
map
,unordered_map
,set
,unordered_set
have:container.erase(value)
to erase all occurences ofvalue
container.erase(it)
to erase the single occurence of it
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
//
std::map<int, std::string> m = {{1, "one"}, {2, "two"}};
m.erase(1); // Removes key , o(log n)
std::unordered_map<int, std::string> umap = {{1, "one"}, {2, "two"}};
umap.erase(1); // Removes key , o(1)
std::set<int> s = {1, 2, 3};
s.erase(2); // Removes element with value 2, o(log n)
std::unordered_set<int> uset = {1, 2, 3};
uset.erase(2); // Removes element with value 2 o(1)
// Alternative 2: universal erase with iterator:
auto it = my_map.find(key);
if (it != my_map.end()) {
my_map.erase(it); // faster than key lookup + erase
}
std::unordered_map
Emplace
unordered_map::emplace() -> pair [iterator, bool]
vstry_emplace() -> pair [iterator, bool]
:- If a key already exists, neither will replace the associated value.
insert_or_assign()
would.
- However,
try_emplace()
is always preferred overemplace
, because:try_emplace()
Only constructs the value if the key doesn’t exist. (C++17)emplace(key, value)
Always constructs the key and value (even if the key already exists — it won’t be inserted though).
- Another subtle difference is:
piecewise_construct
is to construct either value of a pair using a tuple of args.try_emplace
does not take in args for key (so key must be fully constructed before the call)- but it can take in args of the value, so value is constructed in-place, but key will have to be move-constructed / copy constructed in the container.
1 2 3 4 5 6 7 8 9 10 11
// emplace supports piecewise construct itr_lookup_.emplace( std::piecewise_construct, std::forward_as_tuple(cache_.begin()->first), std::forward_as_tuple(cache_.begin())); // try emplace only supports construct + move? itr_lookup_.try_emplace( list_it->first, // key list_it // mapped_type must be constructible from this iterator );
- If a key already exists, neither will replace the associated value.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
#include <iostream>
#include <unordered_map>
#include <string>
int main()
{
std::unordered_map<std::string, std::string> my_map;
std::string expensive = "expensive";
// emplace will always create the pair, even if key exists
auto it = my_map.emplace("dog", expensive);
// try_emplace avoids creating the value if "dog" is already in map
my_map.try_emplace("dog", "dog2"); // Only constructs the value if the key doesn’t exist.
// always construct the key-value pair, but if the object exists, the pair won't be inserted
my_map.emplace("dog", "expensivo");
std::cout<<my_map["dog"]<<std::endl; // see expensive
std::cout<<(it.first->second)<<std::endl; // see expensive
return 0;
}
Hashing
By default, keys of types std::pair<int, int>
are not hashable. One needs to define a callable for that.
- The first alternative is defining a functor
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
#include <iostream>
#include <unordered_map>
#include <utility>
struct MyHash{
size_t operator()(const std::pair<size_t, size_t>& p) const {
return std::hash<size_t>{}(p.first) ^ (std::hash<size_t>{}(p.second) << 1);
}
};
int main() {
std::unordered_map<std::pair<size_t, size_t>, std::string, MyHash> my_map;
my_map[{1, 2}] = "one-two";
my_map[{3, 4}] = "three-four";
std::cout << "Value at (1, 2): " << my_map[{1, 2}] << "\n";
std::cout << "Value at (3, 4): " << my_map[{3, 4}] << "\n";
return 0;
}
- Another alternative is defining a global hash function. It’s a bit intrusive
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
#include <unordered_map>
#include <utility>
#include <iostream>
namespace std{
template<>
struct hash <std::pair<size_t, size_t>>{
size_t operator()(const std::pair<size_t, size_t>& p) const {
return std::hash<size_t>{}(p.first) ^ (std::hash<size_t>{}(p.second) << 1);
}
};
}
int main() {
std::unordered_map<std::pair<size_t, size_t>, std::string> my_map;
my_map[{7, 8}] = "seven-eight";
std::cout << my_map[{7, 8}] << std::endl;
my_map[{1, 2}] = "one-two";
my_map[{3, 4}] = "three-four";
std::cout << "Value at (1, 2): " << my_map[{1, 2}] << "\n";
std::cout << "Value at (3, 4): " << my_map[{3, 4}] << "\n";
return 0;
}
PMR: Polymorphic Memory Resources (C++17)
- Custom Allocators: Instead of having containers decide how memory is allocated, PMR allows you to supply a custom memory resource (allocator) that the container uses.
- Reusable Strategies: You can create memory resources that implement different allocation strategies, such as pooling, monotonic allocation, or synchronized (thread-safe) allocation, and then reuse them across multiple containers.
- Memory Efficiency: PMR can help you pre-allocate a large chunk of memory (from a buffer) and then use it for many small allocations, which is useful in real-time or embedded systems.
- Multiple PMR-based containers can share the same memory resource.
- memory allocation strategies by using user-supplied memory resources.
Thread-Safety of Associative Containers
Can I do std::for_each(parallel, )
, grab each deque, do stuff, then put in grid_? Will there be memory issues?
Parallel access to map
, unordered_map
is not inherantly thread safe.
1
2
3
4
5
6
7
8
9
std::mutex mtx
std::for_each(
std::execution::par,
my_map.begin(), my_map.end(), [&](auto& e){
...
std::lock_guard<std::mutex> lg(mtx);
my_map[entry] = ... // potential memory re-allocation, etc.
}
);
Read-only access is fine
If there is no memory re-allocation, AND one element is only accessed by a single thread, lockless is fine. These operators could trigger memory re-allocation:
reserve()
insert()
-
operator[]
- Alternatives:
- TBB’s (Thread-Building-Blocks)
concurrent_unordered_map
. TODO: test it
- TBB’s (Thread-Building-Blocks)