usaco-guide/content/3_Bronze/Unordered.mdx
2020-07-10 17:56:07 -04:00


---
id: unordered
title: Unordered Maps & Sets
author: Darren Yao
description: "?"
frequency: 2
---
import { Problem } from "../models";
export const metadata = {
problems: {
ex: [
new Problem("YS", "Associative Array", "associative_array", "Easy"),
],
standard: [
new Problem("CSES", "Distinct Numbers", "1621", "Easy"),
new Problem("CSES", "Sum of Two Values", "1640", "Easy", false, [], "Can be solved without sets."),
new Problem("Bronze", "Where Am I?", "964", "Easy", false, [], ""),
new Problem("Silver", "Cities & States", "667", "Hard", false, [], ""),
],
}
};
## Sets and Maps
A **set** is a collection of objects that contains no duplicates. A **map** is a set of ordered pairs, each containing a key and a value. In a map, all keys are required to be unique, but values can be repeated. Maps have three primary methods: one to add a specified key-value pairing, one to retrieve the value for a given key, and one to remove a key-value pairing from the map. Like sets, maps can be unordered or ordered.
Both Java and C++ provide two versions of sets and maps: one in which the keys are stored in sorted order, and one that uses **hashing**. Bronze problems shouldn't distinguish between the two, so we'll cover only the latter in this module.
## Hashing
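**Hashing** refers to computing a numeric code (a hash) for every variable/object and using that code to decide where the object is stored. This allows insertions, deletions, and searches in $O(1)$ time on average, albeit with a high constant factor, as hashing requires a large constant number of operations. Two different objects can occasionally receive the same code (a collision), so the code is not guaranteed to be unique. Because elements are placed according to their hashes, they are not ordered in any meaningful way, so traversals of an unordered set will return elements in some arbitrary order.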
(more in-depth explanation?)
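To make the idea concrete, here is a minimal sketch of a hash set that uses separate chaining: each value is hashed to a bucket index, and values that land in the same bucket are kept in a short list. `ToyHashSet` and its member names are hypothetical and exist only for illustration; this is not how `unordered_set` or `HashSet` is actually written, and real implementations also grow the table as more elements are added.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <vector>

// Toy hash set of ints using separate chaining (illustration only).
struct ToyHashSet {
    std::vector<std::list<int>> buckets;

    explicit ToyHashSet(std::size_t bucketCount = 16) : buckets(bucketCount) {}

    // Map a value to a bucket index via its hash code.
    std::size_t bucketOf(int x) const {
        return std::hash<int>{}(x) % buckets.size();
    }

    // Only the one bucket that x hashes to needs to be scanned.
    bool contains(int x) const {
        for (int y : buckets[bucketOf(x)]) {
            if (y == x) return true;
        }
        return false;
    }

    // Returns true if x was newly inserted, false if it was already present.
    bool insert(int x) {
        if (contains(x)) return false;
        buckets[bucketOf(x)].push_back(x);
        return true;
    }

    // Returns true if x was present and has been removed.
    bool erase(int x) {
        std::list<int>& chain = buckets[bucketOf(x)];
        for (auto it = chain.begin(); it != chain.end(); ++it) {
            if (*it == x) {
                chain.erase(it);
                return true;
            }
        }
        return false;
    }
};
```

As long as the lists stay short, each operation touches only a handful of elements, which is where the average $O(1)$ bound comes from. If many values fall into the same bucket, the operations degrade toward scanning a long list.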
## HashSets
### C++
The operations on an unordered set are `insert`, which adds an element to the set if not already present, `erase`, which deletes an element if it exists, and `count`, which returns `1` if the set contains the element and `0` if it doesn't.
```cpp
unordered_set<int> s;
s.insert(1); // [1]
s.insert(4); // [1, 4] in arbitrary order
s.insert(2); // [1, 4, 2] in arbitrary order
s.insert(1); // [1, 4, 2] in arbitrary order
// the insert method did nothing because 1 was already in the set
cout << s.count(1) << endl; // 1
s.erase(1); // [2, 4] in arbitrary order
cout << s.count(5) << endl; // 0
s.erase(0); // [2, 4] in arbitrary order
// if the element to be removed does not exist, nothing happens
for(int element : s){
    cout << element << " ";
}
cout << endl;
// You can iterate through an unordered set, but it will do so in arbitrary order
```
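As a small illustration of how these operations are typically combined, the sketch below uses an `unordered_set` to count distinct values and to detect a repeated value. The function names `countDistinct` and `hasDuplicate` are hypothetical helpers, not part of the standard library.

```cpp
#include <unordered_set>
#include <vector>

// Number of distinct values in `a`.
int countDistinct(const std::vector<int>& a) {
    std::unordered_set<int> seen;
    for (int x : a) seen.insert(x); // inserting a duplicate changes nothing
    return (int)seen.size();
}

// Returns true if some value appears more than once in `a`.
bool hasDuplicate(const std::vector<int>& a) {
    std::unordered_set<int> seen;
    for (int x : a) {
        // insert returns a pair; .second is false when x was already present
        if (!seen.insert(x).second) return true;
    }
    return false;
}
```

Both functions perform one set operation per element, so they run in $O(n)$ time on average for an input of size $n$.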
### Java
The operations on an unordered set are `add`, which adds an element to the set if not already present, `remove`, which deletes an element if it exists, and `contains`, which checks whether the set contains that element.
```java
HashSet<Integer> set = new HashSet<Integer>();
set.add(1); // [1]
set.add(4); // [1, 4] in arbitrary order
set.add(2); // [1, 4, 2] in arbitrary order
set.add(1); // [1, 4, 2] in arbitrary order
// the add method did nothing because 1 was already in the set
System.out.println(set.contains(1)); // true
set.remove(1); // [2, 4] in arbitrary order
System.out.println(set.contains(5)); // false
set.remove(0); // [2, 4] in arbitrary order
// if the element to be removed does not exist, nothing happens
for(int element : set){
    System.out.println(element);
}
// You can iterate through an unordered set, but it will do so in arbitrary order
```
## HashMaps
<problems-list problems={metadata.problems.ex} />
### C++
In an unordered map `m`, the statement `m[key] = value` assigns a value to a key and places the key and value pair into the map. The expression `m[key]` returns the value associated with the key. If the key is not present in the map, then `m[key]` inserts it with a default value (0 for integer types). The `count(key)` method returns the number of times the key is in the map (which is either one or zero), and therefore checks whether a key exists in the map. Lastly, `erase(key)` and `erase(it)` remove the map entry associated with the specified key or iterator. All of these operations are $O(1)$ on average, but again, due to the hashing, this has a high constant factor.
```cpp
unordered_map<int, int> m;
m[1] = 5; // [(1, 5)]
m[3] = 14; // [(1, 5); (3, 14)]
m[2] = 7; // [(1, 5); (3, 14); (2, 7)]
m.erase(2); // [(1, 5); (3, 14)]
cout << m[1] << '\n'; // 5
cout << m.count(7) << '\n'; // 0
cout << m.count(1) << '\n'; // 1
```
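One consequence of the behavior described above is that `m[key]` can be used to count occurrences directly, but it also adds the key to the map when it is only read. The sketch below shows both sides; `countWords` and `lookup` are hypothetical helper names chosen for this example.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Count how many times each word occurs.
std::unordered_map<std::string, int> countWords(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> freq;
    for (const std::string& w : words) {
        ++freq[w]; // a missing key is inserted with value 0 before the increment
    }
    return freq;
}

// Look up a count without inserting: find returns end() if the key is absent.
int lookup(const std::unordered_map<std::string, int>& freq, const std::string& word) {
    auto it = freq.find(word);
    return it == freq.end() ? 0 : it->second;
}
```

Using `find` or `count` for read-only queries keeps the map from filling up with keys that were never actually added.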
### Java
In a `HashMap`, the `put(key, value)` method assigns a value to a key and places the key and value pair into the map. The `get(key)` method returns the value associated with the key, or `null` if the key is not present. The `containsKey(key)` method checks whether a key exists in the map. Lastly, `remove(key)` removes the map entry associated with the specified key. All of these operations are $O(1)$ on average, but again, due to the hashing, this has a high constant factor.
```java
HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
map.put(1, 5); // [(1, 5)]
map.put(3, 14); // [(1, 5); (3, 14)]
map.put(2, 7); // [(1, 5); (3, 14); (2, 7)]
map.remove(2); // [(1, 5); (3, 14)]
System.out.println(map.get(1)); // 5
System.out.println(map.containsKey(7)); // false
System.out.println(map.containsKey(1)); // true
```
## Hacking
In USACO contests, unordered sets and maps are generally fine, but the built-in hashing algorithm for C++ is vulnerable to pathological data sets that cause abnormally slow runtimes. Apparently [Java](https://codeforces.com/blog/entry/62393?#comment-464875) is not vulnerable to this, however.
<resources>
<resource source="Mark Nelson" title="Hash Functions for C++ Unordered Containers" url="https://marknelson.us/posts/2011/09/03/hash-functions-for-c-unordered-containers.html" starred>How to create a user-defined hash function for `unordered_map`.</resource>
<resource title="Blowing up Unordered Map" source="CF" url="blog/entry/62393" starred>Explanation of this problem and how to fix it.</resource>
</resources>
### Implementation
<resources>
<resource source="Benq (from KACTL)" title="HashMap" url="https://github.com/bqi343/USACO/blob/master/Implementations/content/data-structures/STL%20(5)/HashMap.h" starred> </resource>
</resources>
```cpp
// Note: ll, PI, and rng are not defined in this snippet;
// they presumably come from the surrounding contest template.
struct chash { /// use most bits rather than just the lowest ones
    const uint64_t C = ll(2e18*PI)+71; // large odd number
    const int RANDOM = rng(); // random 32-bit int
    ll operator()(ll x) const { /// https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
        return __builtin_bswap64((x^RANDOM)*C); }
};
unordered_map<int,int,chash> U;
```
(explain assumptions that are required for this to work)
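The snippet above relies on names it does not define (`ll`, `PI`, `rng`). A self-contained sketch of the same idea might look like the following, under these assumptions: `ll` means `long long`, `rng` is a clock-seeded `mt19937`, the compiler is GCC or Clang (for `__builtin_bswap64`), and keys are integers. The names `RandomizedHash` and `makeCounts` are hypothetical and are not part of the referenced implementation.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_map>
#include <vector>

// Assumed stand-in for the template's `rng`: seeded from the clock,
// so the hash function differs from run to run.
static std::mt19937 rng(
    (unsigned)std::chrono::steady_clock::now().time_since_epoch().count());

struct RandomizedHash {
    // Large odd multiplier, mirroring ll(2e18*PI)+71 from the snippet above.
    std::uint64_t C = std::uint64_t(2e18 * 3.141592653589793) + 71;
    // Random mask chosen when the hash object is created.
    std::uint64_t RANDOM = rng();

    std::size_t operator()(long long x) const {
        // The multiplication mixes the key's bits; bswap64 then moves the
        // well-mixed high-order bytes into the low-order positions.
        return (std::size_t)__builtin_bswap64(((std::uint64_t)x ^ RANDOM) * C);
    }
};

// Frequency table that uses the custom hash in place of std::hash.
std::unordered_map<int, int, RandomizedHash> makeCounts(const std::vector<int>& a) {
    std::unordered_map<int, int, RandomizedHash> counts;
    for (int x : a) ++counts[x];
    return counts;
}
```

Because `RANDOM` is not known in advance, an adversary cannot easily precompute a set of keys that all collide, which is the weakness described in the resources above.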
## Problems
<problems-list problems={metadata.problems.standard} />