---
id: intro-sorting
title: "Introduction to Sorting"
author: Siyong Huang, Michael Cao, Nathan Chen
description: Introduces sorting and binary searching on a sorted array.
frequency: 0
---

import { Problem } from "../models";

export const metadata = {
  problems: {
    bubble: [
      new Problem("HR", "Bubble Sort", "https://www.hackerrank.com/challenges/ctci-bubble-sort/problem", "Easy", false, [], "O(N^2)"),
      new Problem("Silver", "Out of Sorts", "834", "Very Hard", false, []),
    ],
    count: [
      new Problem("Silver", "Counting Haybales", "666", "Normal", false, []),
    ],
    cses: [
      new Problem("CSES", "Apartments", "1084", "Normal", false, [], "Sort applicants and apartments, then greedily assign applicants"),
      new Problem("CSES", "Ferris Wheel", "1090", "Normal", false, [], "Sort children, keep a left pointer and a right pointer. Each gondola either is one child from the right pointer or two children, one left and one right."),
      new Problem("CSES", "Restaurant Customers", "1619", "Normal", false, [], ""),
      new Problem("CSES", "Stick Lengths", "1074", "Normal", false, [], "Spoiler: Optimal length is median"),
    ],
  }
};

**Sorting** is exactly what it sounds like: arranging items in some particular order. No bronze problem should require sorting, but it can be an alternate solution that is sometimes much easier to implement.

## Additional Resources

types of sorting, binary search, C++ code

## Sorting Algorithms

(why are these important?)

There are many sorting algorithms; here are some sources to learn about the popular ones:

### Tutorial

- [HackerEarth Quicksort](https://www.hackerearth.com/practice/algorithms/sorting/quick-sort/tutorial/) - expected $O(N\log N)$
- [HackerEarth Mergesort](https://www.hackerearth.com/practice/algorithms/sorting/merge-sort/tutorial/) - $O(N\log N)$

## Library Functions - Sorting

- C++
  - [std::sort](https://en.cppreference.com/w/cpp/algorithm/sort)
  - [std::stable\_sort](http://www.cplusplus.com/reference/algorithm/stable_sort/)
  - [Golovanov399 - C++ Tricks](https://codeforces.com/blog/entry/74684) - first two related to sorting
- Java
  - [Arrays.sort](https://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#sort(java.lang.Object[]))
  - [Collections.sort](https://docs.oracle.com/javase/7/docs/api/java/util/Collections.html#sort(java.util.List))
- Python
  - [Sorting Basics](https://docs.python.org/3/howto/sorting.html)

The `Arrays.sort()` function uses quicksort on arrays of primitive data types such as `long`. This is fine for USACO, but in other contests such as Codeforces, it may time out on test cases specifically engineered to trigger worst-case $O(N^2)$ behavior in quicksort. See [here](https://codeforces.com/contest/1324/hacks/625031/) for an example of a solution that was hacked on CF. Two ways to avoid this:

- Declare the underlying array as an array of objects, for example `Long` instead of `long`. This forces the `Arrays.sort()` function to use mergesort, which is always $O(N \log N)$.
- [Shuffle](https://pastebin.com/k6gCRJDv) the array beforehand.

## Binary Search

[Binary search](https://en.wikipedia.org/wiki/Binary_search_algorithm) can be used on monotonic functions for a logarithmic runtime. A function is monotonic if it is *nondecreasing* or *nonincreasing*.

Here is a very basic form of binary search:

> Find an element in a sorted array of size $N$ in $O(\log N)$ time.

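
The basic form stated above can be sketched in a few lines of Python. This is a minimal illustration, assuming the list is sorted in nondecreasing order; the function name `binary_search` and the sample list are made up for this example and do not come from any library.

```python
def binary_search(arr, target):
    """Return an index of target in the sorted list arr, or -1 if it is absent."""
    lo, hi = 0, len(arr) - 1          # the answer, if any, lies in arr[lo..hi]
    while lo <= hi:
        mid = (lo + hi) // 2          # middle of the current interval
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            lo = mid + 1              # target can only be to the right of mid
        else:
            hi = mid - 1              # target can only be to the left of mid
    return -1                         # interval became empty: not found


positions = [1, 4, 5, 9]              # illustrative sorted data
print(binary_search(positions, 5))    # prints 2
print(binary_search(positions, 6))    # prints -1
```

Each iteration discards half of the remaining interval, so the loop runs $O(\log N)$ times. In practice, Python's standard `bisect` module (`bisect_left`, `bisect_right`) covers most uses without a hand-written loop.
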
Other variations are similar, such as the following:

> Given $K$, find the largest element less than $K$ in a sorted array.

### Tutorial

animations! animation!

### Library Functions - Binary Search

#### C++

- [lower_bound](http://www.cplusplus.com/reference/algorithm/lower_bound/)
- [upper_bound](http://www.cplusplus.com/reference/algorithm/upper_bound/)

#### Java

- [Arrays.binarySearch](https://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html)
- [Collections.binarySearch](https://docs.oracle.com/javase/7/docs/api/java/util/Collections.html)

## Coordinate Compression

Another useful application of sorting is **coordinate compression**, which takes some points and reassigns them to remove wasted space.

> Farmer John has just arranged his $N$ haybales $(1\le N \le 100,000)$ at various points along the one-dimensional road running across his farm. To make sure they are spaced out appropriately, please help him answer $Q$ queries ($1 \le Q \le 100,000$), each asking for the number of haybales within a specific interval along the road.

However, each of the points is in the range $0 \ldots 1,000,000,000$, meaning you can't store locations of haybales in, for instance, a boolean array.

### Solution

Let's place all of the haybale locations and query endpoints into a list and sort it.

(fix part below so transform to range $1\ldots N$)

Now, we can map distinct points to smaller integers without gaps. For example, if the haybales existed at positions $[1, 4, 5, 9]$ and queries were $(1, 2)$ and $(4, 6)$, we can place the integers together and map them from $[1, 2, 4, 5, 6, 9] \rightarrow [1, 2, 3, 4, 5, 6]$. This effectively transforms the haybale positions into $[1, 3, 4, 6]$ and the queries into $(1, 2)$ and $(3, 5)$.

By compressing queries and haybale positions, we've transformed the range of points to $0 \ldots N + 2Q$, allowing us to store prefix sums and efficiently query for the number of haybales in a range.

## Problems