This repository has been archived on 2022-06-22. You can view files and clone it, but cannot push or open issues or pull requests.
usaco-guide/content/4_Silver/Sorting_Old.mdx
2020-07-06 20:36:30 -04:00

449 lines
15 KiB
Text

---
id: sorting-old
title: "Sorting with Custom Comparators Pt 1"
author: Darren Yao, Siyong Huang, Michael Cao, Benjamin Qi
prerequisites:
- Bronze - Pairs & Tuples in C++
- Bronze - Introduction to Data Structures
description: Both Java and C++ have built-in functions for sorting. However, if we use custom objects, or if we want to sort elements in a different order, then we'll need to use a custom comparator.
---
import { Problem } from "../models";
export const metadata = {
problems: {
sample: [
new Problem("Silver", "Wormhole Sort", "992", "Normal", false, [], ""),
],
general: [
new Problem("CSES", "Restaurant Customers", "1619", "Easy", false, [], "sort endpoints of intervals"),
new Problem("Silver", "Lifeguards", "786", "Easy", false, [], "sort endpoints of intervals"),
new Problem("Silver", "Rental Service", "787", "Easy", false, [], ""),
new Problem("Silver", "Mountains", "896", "Easy", false, [], ""),
new Problem("Silver", "Mooyo Mooyo", "860", "Easy", false, [], "Not a sorting problem, but you can use sorting to simulate gravity. - Write a custom comparator which puts zeroes at the front and use `stable_sort` to keep the relative order of other elements the same."),
new Problem("Silver", "Meetings", "967", "Very Hard", false, [], ""),
],
}
};
## Example: Wormhole Sort
<problems-list problems={metadata.problems.sample} />
There are multiple ways to solve this problem. We won't discuss the full solution here, but all of them start by sorting the edges in nondecreasing order of weight.
With C++, the easiest method is to use nested pairs.
```cpp
#include <bits/stdc++.h>
using namespace std;
#define f first
#define s second
int main() {
int M = 4;
vector<pair<int,pair<int,int>>> v;
for (int i = 0; i < M; ++i) {
int a,b,w; cin >> a >> b >> w;
v.push_back({w,{a,b}});
}
sort(begin(v),end(v));
for (auto e: v) cout << e.s.f << " " << e.s.s << " " << e.f << "\n";
}
```
But what if the built-in comparison function for pairs didn't exist?
- If we only stored the edge weights and sorted them, we would have a sorted list of edge weights, but it would be impossible to tell which weights corresponded to which edges.
- However, if we create a **class** (or struct) representing the edges and define a **custom comparator** to sort them by weight, we can sort the edges in ascending order while also keeping track of their endpoints.
## Classes
First, we need to define a **class** that represents what we want to sort (or a [`struct`](http://www.cplusplus.com/doc/tutorial/structures/) in C++, which is the same as a `class` in C++ but all members are public by default).
In our example we will define a class `Person` that contains a person's height and weight, and sort in ascending order by height.
```cpp
struct Person {
int height, weight;
Person (int h, int w) { height = h; weight = w; }
};
int main() {
Person p;
p.height = 60; // assigns 60 to the height of p
p.weight = 100; // assigns 100 to the weight of p
}
```
```java
static class Person {
int height, weight;
public Person (int h, int w) { height = h; weight = w; }
}
```
## Comparators
Normally, sorting functions rely on moving objects with a lower value in front of objects with a higher value if sorting in ascending order, and vice versa if in descending order. This is done through comparing two objects at a time.
### C++
<resources>
<resource source="CPH" title="3.2 - User-Defined Structs, Comparison Functions" starred></resource>
</resources>
What a comparator does is compare two objects as follows, based on our comparison criteria:
- If object $x$ is less than object $y$, return `true`
- If object $x$ is greater than or equal to object $y$, return `false`
Essentially, the comparator determines whether object $x$ belongs to the left of object $y$ in a sorted ordering. A comparator **must** return false for two identical objects (not doing so results in undefined behavior and potentially a runtime error).
In addition to returning the correct answer, comparators should also satisfy the following conditions:
- The function must be consistent with respect to reversing the order of the arguments: if $x \neq y$ and `compare(x, y)`is `true`, then `compare(y, x)` should be `false` and vice versa.
- The function must be transitive. If `compare(x, y)` is true and `compare(y, z)` is true, then `compare(x, z)` should also be true. If the first two compare functions both return `false`, the third must also return `false`.
### Java
What a `Comparator` does is compare two objects as follows, based on our comparison criteria:
- If object $x$ is less than object $y$, return a negative number.
- If object $x$ is greater than object $y$, return a positive number.
- If object $x$ is equal to object $y$, return 0.
In addition to returning the correct number, comparators should also satisfy the following conditions:
- The function must be consistent with respect to reversing the order of the arguments: if `compare(x, y)` is positive, then `compare(y, x)` should be negative and vice versa.
- The function must be transitive. If `compare(x, y) > 0` and `compare(y, z) > 0`, then `compare(x, z) > 0`. Same applies if the compare functions return negative numbers.
- Equality must be consistent. If `compare(x, y) = 0`, then `compare(x, z)` and `compare(y, z)` must both be positive, both negative, or both zero. Note that they don't have to be equal, they just need to have the same sign.
Java has default functions for comparing `int`, `long`, `double` types. The `Integer.compare()`, `Long.compare()`, and `Double.compare()` functions take two arguments $x$ and $y$ and compare them as described above.
## C++
<resources>
<resource source="fushar" title="Comparison Functions in C++" url="http://fusharblog.com/3-ways-to-define-comparison-functions-in-cpp/"></resource>
</resources>
### Method 1: Operator <
[StackOverflow: Why const T&?](https://stackoverflow.com/questions/11805322/why-should-i-use-const-t-instead-of-const-t-or-t)
- Pro:
- This is the easiest to implement
- Easy to work with STL
- Con:
- Only works for objects (not primitives)
- Only supports two types of comparisons (less than (<) and greater than (>))
<!-- Tested -->
```cpp
#include <bits/stdc++.h>
using namespace std;
int randint(int low, int high) {return low+rand()%(high-low);}
struct Foo
{
int Bar;
Foo(int _Bar=-1):Bar(_Bar){}
bool operator < (const Foo& foo2) const {return Bar < foo2.Bar;}
};
const int N = 8;
int main()
{
srand(69);
Foo a[N];
for(int i=0;i<N;++i) a[i] = Foo(randint(0, 20));
for(int i=0;i<N;++i) printf("(Foo: %2d) ", a[i].Bar); printf("\n");
sort(a, a+N);
for(int i=0;i<N;++i) printf("(Foo: %2d) ", a[i].Bar); printf("\n");
}
/**
Output:
(Foo: 3) (Foo: 18) (Foo: 13) (Foo: 5) (Foo: 18) (Foo: 3) (Foo: 15) (Foo: 0)
(Foo: 0) (Foo: 3) (Foo: 3) (Foo: 5) (Foo: 13) (Foo: 15) (Foo: 18) (Foo: 18)
*/
```
In the context of Wormhole Sort (uses [friend](https://www.geeksforgeeks.org/friend-class-function-cpp/)):
```cpp
#include <bits/stdc++.h>
using namespace std;
struct Edge {
int a,b,w;
friend bool operator<(const Edge& x, const Edge& y) { return x.w < y.w; }
}; // a different way to write less than
int main() {
int M = 4;
vector<Edge> v;
for (int i = 0; i < M; ++i) {
int a,b,w; cin >> a >> b >> w;
v.push_back({a,b,w});
}
sort(begin(v),end(v));
for (Edge e: v) cout << e.a << " " << e.b << " " << e.w << "\n";
}
/*
Input:
1 2 9
1 3 7
2 3 10
2 4 3
*/
/*
Output:
2 4 3
1 3 7
1 2 9
2 3 10
*/
```
### Method 2: Function Outside Class
Let's say we have an array `Person arr[N]`. To sort the array, we need to make custom comparator which will be a function, and then pass the function as a parameter into the build-in sort function:
```cpp
bool cmp(Person a, Person b) {
return a.height < b.height;
}
int main() {
sort(arr, arr+N, cmp); // sorts the array in ascending order by height
}
```
If we instead wanted to sort in descending order, this is also very simple. Instead of the `cmp` function returning `return a.height < b.height;`, it should do `return a.height > b.height;`.
- Pro:
- Works for both objects and primitives
- Supports many different comparators for the same object
- Con:
- More difficult to implement
- Extra care needs to be taken to support STL
We can also use [lambda expressions](https://www.geeksforgeeks.org/lambda-expression-in-c/) in C++11 or above.
<!-- Tested -->
```cpp
#include <bits/stdc++.h>
using namespace std;
int randint(int low, int high) {return low+rand()%(high-low);}
struct Foo
{
int Bar;
Foo(int _Bar=-1):Bar(_Bar){}
};
const int N = 8;
Foo a[N];
bool cmp1(Foo foo1, Foo foo2) {return foo1.Bar < foo2.Bar;}
function<bool(Foo,Foo)> cmp2 = [](Foo foo1, Foo foo2) {return foo1.Bar < foo2.Bar;}; // lambda expression
// bool(Foo,Foo) means that the function takes in two parameters of type Foo and returns bool
// "function<bool(Foo,Foo)>"" can be replaced with "auto"
int main()
{
srand(69);
printf("--- Method 1 ---\n");
for(int i=0;i<N;++i) a[i] = Foo(randint(0, 20));
for(int i=0;i<N;++i) printf("(Foo: %2d) ", a[i].Bar); printf("\n");
sort(a, a+N, [](Foo foo1, Foo foo2){return foo1.Bar<foo2.Bar;}); // lambda expression
for(int i=0;i<N;++i) printf("(Foo: %2d) ", a[i].Bar); printf("\n");
printf("--- Method 2 ---\n");
for(int i=0;i<N;++i) a[i] = Foo(randint(0, 20));
for(int i=0;i<N;++i) printf("(Foo: %2d) ", a[i].Bar); printf("\n");
sort(a, a+N, cmp1);
for(int i=0;i<N;++i) printf("(Foo: %2d) ", a[i].Bar); printf("\n");
printf("--- Method 3 ---\n");
for(int i=0;i<N;++i) a[i] = Foo(randint(0, 20));
for(int i=0;i<N;++i) printf("(Foo: %2d) ", a[i].Bar); printf("\n");
sort(a, a+N, cmp2);
for(int i=0;i<N;++i) printf("(Foo: %2d) ", a[i].Bar); printf("\n");
}
```
Output:
```
--- Method 1 ---
(Foo: 3) (Foo: 18) (Foo: 13) (Foo: 5) (Foo: 18) (Foo: 3) (Foo: 15) (Foo: 0)
(Foo: 0) (Foo: 3) (Foo: 3) (Foo: 5) (Foo: 13) (Foo: 15) (Foo: 18) (Foo: 18)
--- Method 2 ---
(Foo: 5) (Foo: 13) (Foo: 18) (Foo: 0) (Foo: 9) (Foo: 4) (Foo: 2) (Foo: 15)
(Foo: 0) (Foo: 2) (Foo: 4) (Foo: 5) (Foo: 9) (Foo: 13) (Foo: 15) (Foo: 18)
--- Method 3 ---
(Foo: 1) (Foo: 1) (Foo: 18) (Foo: 0) (Foo: 11) (Foo: 12) (Foo: 1) (Foo: 5)
(Foo: 0) (Foo: 1) (Foo: 1) (Foo: 1) (Foo: 5) (Foo: 11) (Foo: 12) (Foo: 18)
```
In the context of Wormhole Sort:
```cpp
#include <bits/stdc++.h>
using namespace std;
struct Edge { int a,b,w; };
int main() {
int M = 4;
vector<Edge> v;
for (int i = 0; i < M; ++i) {
int a,b,w; cin >> a >> b >> w;
v.push_back({a,b,w});
}
sort(begin(v),end(v),[](const Edge& x, const Edge& y) { return x.w < y.w; });
for (Edge e: v) cout << e.a << " " << e.b << " " << e.w << "\n";
}
```
## Java
Now, there are two ways of implementing this in Java: `Comparable`, and `Comparator`. They essentially serve the same purpose, but `Comparable` is generally easier and shorter to code. `Comparable` is a function implemented within the class containing the custom object, while `Comparator` is its own class. For our example, we'll use a `Person` class that contains a person's height and weight, and sort in ascending order by height.
### Comparable
If we use `Comparable`, we'll need to put `implements Comparable<Person>` into the heading of the class. Furthermore, we'll need to implement the `compareTo` method. Essentially, `compareTo(x)` is the `compare` function that we described above, with the object itself as the first argument, or `compare(self, x)`.
```java
static class Person implements Comparable<Person>{
int height, weight;
public Person(int h, int w){
height = h; weight = w;
}
public int compareTo(Person p){
return Integer.compare(height, p.height);
}
}
```
When using Comparable, we can just call `Arrays.sort(arr)` or `Collections.sort(list)` on the array or list as usual.
### Comparator
If instead we choose to use `Comparator`, we'll need to declare a second `Comparator` class, and then implement that:
```java
static class Person{
int height, weight;
public Person(int h, int w){
height = h; weight = w;
}
}
static class Comp implements Comparator<Person>{
public int compare(Person a, Person b){
return Integer.compare(a.height, b.height);
}
}
```
When using `Comparator`, the syntax for using the built-in sorting function requires a second argument: `Arrays.sort(arr, new Comp())`, or `Collections.sort(list, new Comp())`.
If we instead wanted to sort in descending order, this is also very simple. Instead of the comparing function returning `Integer.compare(x, y)` of the arguments, it should instead return `-Integer.compare(x, y)`.
## Python
### Defining Operator
<!-- Tested -->
```py
import random
class Foo:
def __init__(self, _Bar): self.Bar = _Bar
def __str__(self): return "Foo({})".format(self.Bar)
def __lt__(self, o): # lt means less than
return self.Bar < o.Bar
a = []
for i in range(8):
a.append(Foo(random.randint(1, 10)))
print(*a)
print(*sorted(a))
```
Output:
```
Foo(0) Foo(1) Foo(2) Foo(1) Foo(9) Foo(5) Foo(5) Foo(8)
Foo(0) Foo(1) Foo(1) Foo(2) Foo(5) Foo(5) Foo(8) Foo(9)
```
### Remapping Key
- This method maps an object to another comparable datatype with which to be sorted. In this case, `Foo` is sorted by the sum of its members `x` and `y`.
<!-- Tested -->
```py
import random
class Foo:
def __init__(self, _Bar, _Baz): self.Bar,self.Baz = _Bar,_Baz
def __str__(self): return "Foo({},{})".format(self.Bar, self.Baz)
a = []
for i in range(8):
a.append(Foo(random.randint(1, 9)*10, random.randint(1, 9)))
print(*a)
print(*sorted(a, key=lambda foo: foo.Bar+foo.Baz))
def key(foo):
return foo.Bar + foo.Baz
print(*sorted(a, key=key))
```
Output:
```
Foo(10,2) Foo(30,2) Foo(60,6) Foo(90,7) Foo(80,7) Foo(80,9) Foo(60,9) Foo(90,8)
Foo(10,2) Foo(30,2) Foo(60,6) Foo(60,9) Foo(80,7) Foo(80,9) Foo(90,7) Foo(90,8)
Foo(10,2) Foo(30,2) Foo(60,6) Foo(60,9) Foo(80,7) Foo(80,9) Foo(90,7) Foo(90,8)
```
#### Function / Lambda
- This method defines how to compare two elements represented by an integer
- Positive: First term is greater than the second term
- Zero: First term and second term are equal
- Negative: First term is less than the second term
Note how the comparator must be converted to a `key`.
<!-- Tested -->
```py
import random
from functools import cmp_to_key
class Foo:
def __init__(self, _Bar): self.Bar = _Bar
def __str__(self): return "Foo({})".format(self.Bar)
a = []
for i in range(8):
a.append(Foo(random.randint(0, 9)))
print(*a)
print(*sorted(a, key=cmp_to_key(lambda foo1, foo2: foo1.Bar - foo2.Bar)))
def cmp(foo1, foo2):
return foo1.Bar - foo2.Bar
print(*sorted(a, key=cmp_to_key(cmp)))
```
Output:
```
Foo(0) Foo(1) Foo(2) Foo(1) Foo(9) Foo(5) Foo(5) Foo(8)
Foo(0) Foo(1) Foo(1) Foo(2) Foo(5) Foo(5) Foo(8) Foo(9)
Foo(0) Foo(1) Foo(1) Foo(2) Foo(5) Foo(5) Foo(8) Foo(9)
```