---
id: unordered
title: Unordered Maps & Sets
author: Darren Yao
description: "?"
frequency: 2
---

import { Problem } from "../models";

export const metadata = {
  problems: {
    ex: [
      new Problem("YS", "Associative Array", "associative_array", "Easy"),
    ],
    standard: [
      new Problem("CSES", "Distinct Numbers", "1621", "Easy"),
      new Problem("CSES", "Sum of Two Values", "1640", "Easy", false, [], "Can be solved without sets."),
      new Problem("Silver", "Cities & States", "667", "Hard", false, [], ""),
    ],
  }
};

## Sets and Maps

A **set** is a collection of objects that contains no duplicates.

A **map** is a set of ordered pairs, each containing a key and a value. In a map, all keys are required to be unique, but values can be repeated. Maps have three primary methods: one to add a specified key-value pairing, one to retrieve the value for a given key, and one to remove a key-value pairing from the map. Like sets, maps can be unordered or ordered.

Both Java and C++ contain two versions of sets and maps: one in which the keys are stored in sorted order, and one in which **hashing** is used. Bronze problems shouldn't require you to distinguish between the two, so we'll cover only the latter in this module.

## Hashing

**Hashing** refers to assigning a code (a hash) to every variable/object, which allows insertions, deletions, and searches in $O(1)$ time on average, albeit with a high constant factor, as hashing requires a large constant number of operations. However, as the name "unordered" implies, elements are not ordered in any meaningful way, so traversals of an unordered set will return elements in some arbitrary order.

(more in-depth explanation?)

## HashSets

### C++

The operations on an `unordered_set` are `insert`, which adds an element to the set if not already present, `erase`, which deletes an element if it exists, and `count`, which returns `1` if the set contains the element and `0` if it doesn't.
```cpp
unordered_set<int> s;
s.insert(1); // [1]
s.insert(4); // [1, 4] in arbitrary order
s.insert(2); // [1, 4, 2] in arbitrary order
s.insert(1); // [1, 4, 2] in arbitrary order
// the insert call did nothing because 1 was already in the set
cout << s.count(1) << endl; // 1
s.erase(1); // [2, 4] in arbitrary order
cout << s.count(5) << endl; // 0
s.erase(0); // [2, 4] in arbitrary order
// if the element to be removed does not exist, nothing happens

for(int element : s){
    cout << element << " ";
}
cout << endl;
// You can iterate through an unordered set, but it will do so in arbitrary order
```

### Java

The operations on a `HashSet` are `add`, which adds an element to the set if not already present, `remove`, which deletes an element if it exists, and `contains`, which checks whether the set contains that element.

```java
HashSet<Integer> set = new HashSet<Integer>();
set.add(1); // [1]
set.add(4); // [1, 4] in arbitrary order
set.add(2); // [1, 4, 2] in arbitrary order
set.add(1); // [1, 4, 2] in arbitrary order
// the add method did nothing because 1 was already in the set
System.out.println(set.contains(1)); // true
set.remove(1); // [2, 4] in arbitrary order
System.out.println(set.contains(5)); // false
set.remove(0); // [2, 4] in arbitrary order
// if the element to be removed does not exist, nothing happens

for(int element : set){
    System.out.println(element);
}
// You can iterate through an unordered set, but it will do so in arbitrary order
```

## HashMaps

### C++

In an unordered map `m`, the assignment `m[key] = value` assigns a value to a key and places the key and value pair into the map. The expression `m[key]` returns the value associated with the key. If the key is not present in the map, then `m[key]` inserts the key with a default value (0 for integer values) and returns that. The `count(key)` method returns the number of times the key is in the map (which is either one or zero), and therefore checks whether a key exists in the map.
Lastly, `erase(key)` and `erase(it)` remove the map entry associated with the specified key or iterator. All of these operations are $O(1)$ on average, but again, due to the hashing, this has a high constant factor.

```cpp
unordered_map<int, int> m;
m[1] = 5; // [(1, 5)]
m[3] = 14; // [(1, 5); (3, 14)]
m[2] = 7; // [(1, 5); (3, 14); (2, 7)]
m.erase(2); // [(1, 5); (3, 14)]
cout << m[1] << '\n'; // 5
cout << m.count(7) << '\n'; // 0
cout << m.count(1) << '\n'; // 1
```

### Java

In a `HashMap`, the `put(key, value)` method assigns a value to a key and places the key and value pair into the map. The `get(key)` method returns the value associated with the key. The `containsKey(key)` method checks whether a key exists in the map. Lastly, `remove(key)` removes the map entry associated with the specified key. All of these operations are $O(1)$ on average, but again, due to the hashing, this has a high constant factor.

```java
HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
map.put(1, 5); // [(1, 5)]
map.put(3, 14); // [(1, 5); (3, 14)]
map.put(2, 7); // [(1, 5); (3, 14); (2, 7)]
map.remove(2); // [(1, 5); (3, 14)]
System.out.println(map.get(1)); // 5
System.out.println(map.containsKey(7)); // false
System.out.println(map.containsKey(1)); // true
```

## Hacking

In USACO contests, unordered sets and maps are generally fine, but the built-in hashing algorithm for C++ is vulnerable to pathological data sets that cause abnormally slow runtimes. Apparently [Java](https://codeforces.com/blog/entry/62393?#comment-464875) is not vulnerable to this, however.

How to create a user-defined hash function for `unordered_map`.

Explanation of this problem and how to fix it.

## Problems