---
id: faster-hashmap
title: A Faster Hash Table in C++
author: Benjamin Qi
description: "Introduces gp_hash_table."
frequency: 1
prerequisites:
 - unordered
---

import { Problem } from "../models";

export const metadata = {
  problems: {
    three: [
      new Problem("Gold", "3SUM", "994", "Normal", false, [], ""),
    ],
  }
};

<Resources>
<Resource source="CF" url="https://codeforces.com/blog/entry/60737" title="Chilli - Order of magnitude faster hash tables" starred>Introduces gp_hash_table</Resource>
<Resource source="GCC" url="https://gcc.gnu.org/onlinedocs/libstdc++/ext/pb_ds/gp_hash_table.html#Resize_Policy566860465" title="gp_hash_table Interface">documentation</Resource>
<Resource source="Benq (from KACTL)" title="HashMap" url="https://github.com/bqi343/USACO/blob/master/Implementations/content/data-structures/STL%20(5)/HashMap.h" starred> </Resource>
</Resources>

<br />

Reads and writes with `gp_hash_table` are much faster than with `unordered_map`. Its actual size (the number of slots allocated, as opposed to the number of elements stored) is always a power of 2. The documentation is rather confusing, so I'll just summarize the most useful functions here.

```cpp
#include <ext/pb_ds/assoc_container.hpp>
using namespace __gnu_pbds;
```

## Unordered Set

`gp_hash_table<K,null_type>` functions similarly to `unordered_set<K>`.
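
As a rough illustration of the set-like interface, here is a minimal sketch using `insert`, `find`, and `size`. The helper names `count_distinct` and `contains` are made up for this example and are not part of pb_ds.

```cpp
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
using namespace std;
using namespace __gnu_pbds;

// Hypothetical helper: counts distinct values, using gp_hash_table as a set.
int count_distinct(const vector<int>& v) {
	gp_hash_table<int,null_type> seen;
	for (int x : v) seen.insert(x); // inserting a duplicate leaves the set unchanged
	return (int)seen.size();
}

// Hypothetical helper: membership test via find(), as with unordered_set.
bool contains(gp_hash_table<int,null_type>& s, int x) {
	return s.find(x) != s.end();
}
```

For instance, `count_distinct({1,2,2,3,3,3})` should return `3`. Elements can be removed with `erase(key)`, as with `unordered_set`.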

## Hacking

`gp_hash_table` is also vulnerable to hacking.

(example of hash function that fails)

To avoid this, we can use one of the custom hash functions mentioned in the Bronze module.

```cpp
template<class K,class V> using ht = gp_hash_table<K,V,chash>;
```
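
The alias above uses `chash` without defining it. One possible definition is sketched below, assuming integer keys: it mixes the key with the widely used splitmix64 function plus an offset chosen at run time. This is an assumption for illustration and not necessarily the exact function from the Bronze module.

```cpp
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
using namespace std;
using namespace __gnu_pbds;

// Assumed definition of chash (a sketch for integer keys).
struct chash {
	// splitmix64 mixing step
	static uint64_t splitmix64(uint64_t x) {
		x += 0x9e3779b97f4a7c15;
		x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9;
		x = (x ^ (x >> 27)) * 0x94d049bb133111eb;
		return x ^ (x >> 31);
	}
	size_t operator()(uint64_t x) const {
		// offset fixed once per run, so the hash cannot be predicted in advance
		static const uint64_t FIXED_RANDOM =
			chrono::steady_clock::now().time_since_epoch().count();
		return splitmix64(x + FIXED_RANDOM);
	}
};

template<class K,class V> using ht = gp_hash_table<K,V,chash>;
```

With this in place, `ht<int,int> m; m[5] = 7;` behaves like the corresponding `unordered_map<int,int>` code.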

## Resizing

`unordered_map` has [`reserve`](http://www.cplusplus.com/reference/unordered_map/unordered_map/reserve/). Calling this function before inserting any elements can result in a constant factor speedup.
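
A minimal sketch of calling `reserve` up front follows. The helper `count_values` is a made-up name for this example; the actual speedup depends on the input and is not measured here.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Hypothetical helper: builds a frequency map, reserving buckets before inserting.
unordered_map<int,int> count_values(const vector<int>& v) {
	unordered_map<int,int> freq;
	freq.reserve(v.size()); // enough buckets for v.size() elements, so no rehash while inserting
	for (int x : v) ++freq[x];
	return freq;
}
```

Since at most `v.size()` distinct keys are inserted, the table should not need to rehash during the loop.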

We can modify the declaration of `gp_hash_table` so that it supports the `resize` function, which operates similarly.

```cpp
template<class K,class V> using ht = gp_hash_table<
	K,
	V, // mapped type (null_type gives a set)
	hash<K>,
	equal_to<K>,
	direct_mask_range_hashing<>,
	linear_probe_fn<>,
	hash_standard_resize_policy<
		hash_exponential_size_policy<>,
		hash_load_check_resize_trigger<>,
		true // enables external size access (resize, get_actual_size)
	>
>;
```

These are the same template arguments as the default `gp_hash_table`, except `false` has been changed to `true`. This modification allows us to change the actual size of the hash table.

```cpp
int main() {
	ht<int,null_type> g; g.resize(5);
	cout << g.get_actual_size() << "\n"; // 8
	cout << g.size() << "\n"; // 0
}
```

When calling `g.resize(x)`, `x` is rounded up to the nearest power of 2. Then the actual size of `g` is changed to be equal to `x` (unless `x < g.size()`, in which case an error is thrown).

<Resources>
<Resource source="GCC" url="https://gcc.gnu.org/onlinedocs/libstdc++/ext/pb_ds/hash_standard_resize_policy.html" title="Resize Policy">documentation</Resource>
</Resources>

Furthermore, if we construct `g` with the following arguments:

```cpp
ht<int,null_type> g({},{},{},{},{1<<16});
```

then the actual size of `g` is always at least `1<<16` (regardless of calls to `resize`). The last argument **must** be a power of 2 (or else errors will be thrown).
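
To check this behavior, a small sketch like the following can be used. The helper `actual_size_after` is a made-up name for this example, and the alias assumes the resizable declaration of `ht` with `V` as the mapped type.

```cpp
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
using namespace std;
using namespace __gnu_pbds;

// Resizable declaration: external size access enabled via the final `true`.
template<class K,class V> using ht = gp_hash_table<
	K, V, hash<K>, equal_to<K>,
	direct_mask_range_hashing<>, linear_probe_fn<>,
	hash_standard_resize_policy<
		hash_exponential_size_policy<>,
		hash_load_check_resize_trigger<>,
		true
	>
>;

// Hypothetical helper: builds a table with start size 1<<16, calls resize(x),
// and reports the resulting actual size.
size_t actual_size_after(size_t x) {
	ht<int,null_type> g({},{},{},{},{1<<16});
	g.resize(x);
	return g.get_actual_size();
}
```

Here `actual_size_after(5)` should still be at least `1<<16`, while `actual_size_after(1<<17)` should be at least `1<<17`.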

### Solving ThreeSum

<Problems problems={metadata.problems.three} />

You're supposed to use an array since the values are small :|

```cpp
#include <bits/stdc++.h>
using namespace std;

void setIO(string name) {
	ios_base::sync_with_stdio(0); cin.tie(0);
	freopen((name+".in").c_str(),"r",stdin);
	freopen((name+".out").c_str(),"w",stdout);
}

#include <ext/pb_ds/assoc_container.hpp>
using namespace __gnu_pbds;

int N,Q;
long long ans[5000][5000];
vector<int> A;

int main() {
	setIO("threesum");
	cin >> N >> Q;
	A.resize(N); for (int i = 0; i < N; ++i) cin >> A[i];
	for (int i = 0; i < N; ++i) {
		gp_hash_table<int,int> g({},{},{},{},{1<<13});
		// initialize with certain capacity, must be power of 2
		for (int j = i+1; j < N; ++j) {
			int res = -A[i]-A[j];
			auto it = g.find(res);
			if (it != end(g)) ans[i][j] = it->second;
			g[A[j]] ++;
		}
	}
	for (int i = N-1; i >= 0; --i) for (int j = i+1; j < N; ++j)
		ans[i][j] += ans[i+1][j]+ans[i][j-1]-ans[i+1][j-1];
	for (int i = 0; i < Q; ++i) {
		int a,b; cin >> a >> b;
		cout << ans[a-1][b-1] << "\n";
	}
	// you should actually read the stuff at the bottom
}
```
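
For comparison, the array-based counting mentioned before the solution could look roughly like the sketch below. It assumes every value satisfies `|A[i]| <= 10^6` (the bound `MX` is an assumption to check against the problem statement), and `count_middle` is a made-up helper name. It only replaces the hash table step; the prefix-sum step over `ans` stays the same.

```cpp
#include <bits/stdc++.h>
using namespace std;

const int MX = 1000000; // assumed bound on |A[i]|

// Hypothetical helper: res[i][j] = number of k with i < k < j and
// A[i]+A[k]+A[j] == 0, using a plain count array instead of a hash table.
vector<vector<int>> count_middle(const vector<int>& A) {
	int N = (int)A.size();
	vector<vector<int>> res(N, vector<int>(N));
	vector<int> cnt(2*MX+1); // cnt[v+MX] = occurrences of value v among A[i+1..j-1]
	for (int i = 0; i < N; ++i) {
		for (int j = i+1; j < N; ++j) {
			int need = -A[i]-A[j];
			if (-MX <= need && need <= MX) res[i][j] = cnt[need+MX];
			++cnt[A[j]+MX];
		}
		// undo the increments so the array is all zeros for the next i
		for (int j = i+1; j < N; ++j) --cnt[A[j]+MX];
	}
	return res;
}
```

Resetting only the touched entries keeps the work per `i` at O(N) rather than O(MX).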

<IncompleteSection />