This repository has been archived on 2022-06-22. You can view files and clone it, but cannot push or open issues or pull requests.
usaco-guide/content/6_Plat/Bitsets.mdx

---
id: bitsets
title: "Bitsets"
author: Benjamin Qi
description: "Several examples of how bitsets give some unintended solutions on recent USACO problems."
prerequisites:
- Errichto - Bitwise Operations Pt 1
frequency: 2
---
import { Problem } from "../models";
export const metadata = {
problems: {
school: [
new Problem("CSES", "School Excursion", "1706", "Easy", false, ["Knapsack", "Bitset"], ""),
],
cow: [
new Problem("Gold", "Cowpatibility", "862", "Normal", false, ["PIE", "Bitset"], ""),
],
lots: [
new Problem("Plat", "Lots of Triangles", "672", "Normal", false, ["Geometry", "Bitset"], ""),
],
bfs: [
new Problem("CSA", "Substring Restrictions", "substring-restrictions", "Hard", false, ["DSU"], "")
],
ad: [
new Problem("Plat", "Equilateral Triangles", "1021", "Normal", false, ["Bitset", "Sliding Window"], "Again, the intended solution runs in $O(N^3)$. Of course, it is still possible to pass $O(N^4)$ solutions with bitset! See the analysis [here](http://www.usaco.org/current/data/sol_triangles_platinum_feb20.html)."),
new Problem("CSES", "BOI Nautilus", "https://cses.fi/247/submit/B", "Normal", false, ["Bitset"], ""),
],
}
};
## Tutorial
tl;dr: a bitset packs 32 or 64 booleans into each machine word, so some operations are 32x-64x faster than on a boolean array. See the [C++ Reference](http://www.cplusplus.com/reference/bitset/bitset/) for the operations you can perform.
<resources>
<resource source="CF" title="Errichto - Bitwise Operations Pt 2" url="blog/entry/73558"></resource>
</resources>
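As a rough illustration of where the speedup comes from, the sketch below counts the indices that are set in both of two collections, once with a boolean array and once with `bitset`. The names `commonSlow` and `commonFast` and the size `SZ` are made up for this example.

```cpp
#include <bitset>
#include <vector>
using namespace std;

const int SZ = 1000; // hypothetical size, chosen only for this example

// Boolean-array version: one index is examined per loop iteration.
int commonSlow(const vector<bool>& a, const vector<bool>& b) {
	int cnt = 0;
	for (int i = 0; i < SZ; ++i) if (a[i] && b[i]) ++cnt;
	return cnt;
}

// Bitset version: the AND and the count() each work on whole machine words.
int commonFast(const bitset<SZ>& a, const bitset<SZ>& b) {
	return (int)(a & b).count();
}
```

Both functions return the same value; the second simply does 32 or 64 indices' worth of work per word operation.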
## Knapsack
<problems-list problems={metadata.problems.school} />
Of course, the first step is to compute the size of each connected component.
<spoiler title="Input">
```cpp
#include <bits/stdc++.h>
using namespace std;
struct DSU {
vector<int> e; void init(int N) { e = vector<int>(N,-1); }
int get(int x) { return e[x] < 0 ? x : e[x] = get(e[x]); }
bool sameSet(int a, int b) { return get(a) == get(b); }
int size(int x) { return -e[get(x)]; }
bool unite(int x, int y) { // union by size
x = get(x), y = get(y); if (x == y) return 0;
if (e[x] > e[y]) swap(x,y);
e[x] += e[y]; e[y] = x; return 1;
}
};
DSU D;
int n,m;
vector<int> comps;
void init() {
cin >> n >> m; D.init(n);
for (int i = 0; i < m; ++i) {
int a,b; cin >> a >> b;
D.unite(a-1,b-1);
}
for (int i = 0; i < n; ++i) if (D.get(i) == i)
comps.push_back(D.size(i));
}
```
</spoiler>
A naive knapsack solution would be as follows. For each $0\le i\le \texttt{comps.size()}$, let $\texttt{dp}[i][j]=1$ if there exists a subset of the first $i$ components whose sizes sum to $j$. Then the answer will be stored in $\texttt{dp}[\texttt{comps.size()}]$. This runs in $O(N^2)$ time and is too slow if implemented naively, but we can use bitset to speed it up!
Note: you can't store all $N$ bitsets in memory at the same time (more on that below).
<spoiler title="Full Solution">
```cpp
int main() {
init();
bitset<100001> posi; posi[0] = 1;
for (int t: comps) posi |= posi<<t;
for (int i = 1; i <= n; ++i) cout << posi[i];
cout << "\n";
}
```
</spoiler>
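To see why a single bitset is enough, note that row $i$ of the DP depends only on row $i-1$: either component $i$ is skipped (the row is unchanged) or it is taken (the row is shifted left by its size). Here is a minimal sketch of that transition on a tiny input; the function name `reachableSums` and the bound `MX` are assumptions made for this example.

```cpp
#include <bitset>
#include <vector>
using namespace std;

const int MX = 16; // hypothetical bound on the total size, for this example only

// Returns a bitset whose j-th bit is 1 iff some subset of comps sums to j.
bitset<MX> reachableSums(const vector<int>& comps) {
	bitset<MX> posi; posi[0] = 1;          // the empty subset has sum 0
	for (int t: comps) posi |= posi << t;  // skip t, or add t to every reachable sum
	return posi;
}
```

For example, with sizes 2 and 3 the reachable sums are 0, 2, 3, and 5.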
**Challenge**: This solution runs in $\approx 0.3\text{s}$ when $N=10^5$ and there are no edges. Find a faster solution which can also be sped up with bitset (my solution runs in 0.03s).
## Cowpatibility (Gold)
<problems-list problems={metadata.problems.cow} />
Label the cows from $0\ldots N-1$. For two cows $x$ and $y$ set `adj[x][y]=1` if they share a common flavor. Then the number of pairs of cows that are compatible (counting each pair where $x$ and $y$ are distinct twice, and each cow paired with itself once) is equal to the sum of `adj[x].count()` over all $x$. It remains to compute `adj[x]` for all $x$.
Unfortunately, storing $N$ bitsets each with $N$ bits takes up $\frac{50000^2}{32}\cdot 4=312.5\cdot 10^6$ bytes of memory, which is greater than USACO's $256$ megabyte limit. We can reduce the memory usage by half in exchange for a slight increase in time by first computing the adjacency bitsets for all $x\in [0,N/2)$, and then for all $x\in [N/2,N)$ afterwards.
First, we read in all of the flavors.
<spoiler title="Input">
```cpp
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef bitset<50000> B;
const int HALF = 25000;
int N;
B adj[HALF];
vector<int> flav[1000001];
ll ans;
void input() {
ios_base::sync_with_stdio(0); cin.tie(0);
freopen("cowpatibility.in","r",stdin);
freopen("cowpatibility.out","w",stdout);
cin >> N;
for (int i = 0; i < N; ++i)
for (int j = 0; j < 5; ++j) {
int x; cin >> x;
flav[x].push_back(i);
}
}
```
</spoiler>
Then for each flavor, we can look at all pairs of cows that share that flavor and update the adjacency bitsets for those $x\in [0,\texttt{HALF})$.
```cpp
int main() {
input();
for (int i = 1; i <= 1000000; ++i)
for (int x: flav[i]) if (x < HALF) for (int y: flav[i]) adj[x][y] = 1;
for (int i = 0; i < HALF; ++i) ans += adj[i].count();
}
```
`adj[i].count()` runs quickly enough since its runtime is divided by the bitset constant. However, looping over all pairs of cows in `flav[i]` is too slow if, say, `flav[i]` contains all cows: the nested loop would then take $\Theta(N^2)$ time! Instead, we can rewrite the nested loop to take advantage of fast bitset operations once again.
```cpp
for (int i = 1; i <= 1000000; ++i) if (flav[i].size() > 0) {
B b; for (int x: flav[i]) b[x] = 1;
for (int x: flav[i]) if (x < HALF) adj[x] |= b;
}
```
The full main function is as follows:
<spoiler title="Full Solution">
```cpp
int main() {
input();
for (int i = 1; i <= 1000000; ++i) if (flav[i].size() > 0) {
B b; for (int x: flav[i]) b[x] = 1;
for (int x: flav[i]) if (x < HALF) adj[x] |= b;
}
for (int i = 0; i < HALF; ++i) ans += adj[i].count();
for (int i = 0; i < HALF; ++i) adj[i].reset();
for (int i = 1; i <= 1000000; ++i) if (flav[i].size() > 0) {
B b; for (int x: flav[i]) b[x] = 1;
for (int x: flav[i]) if (x >= HALF) adj[x-HALF] |= b;
}
for (int i = 0; i < HALF; ++i) ans += adj[i].count();
cout << ((ll)N*N-ans)/2 << "\n";
}
```
</spoiler>
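The final formula works because `ans` counts every ordered compatible pair, including each cow paired with itself, so $N^2-\texttt{ans}$ is the number of ordered incompatible pairs and halving it gives the unordered count. Below is a small self-contained sketch of the same idea without the file I/O or the two-halves trick; `countIncompatible` and `MAXCOWS` are names invented for this example, and it assumes every cow lists at least one flavor.

```cpp
#include <bitset>
#include <map>
#include <vector>
using namespace std;

const int MAXCOWS = 8; // hypothetical small bound on the number of cows

// cows[i] lists the flavors liked by cow i (assumed non-empty).
long long countIncompatible(const vector<vector<int>>& cows) {
	int N = cows.size();
	map<int, vector<int>> flav; // flavor -> cows that like it
	for (int i = 0; i < N; ++i)
		for (int f: cows[i]) flav[f].push_back(i);
	vector<bitset<MAXCOWS>> adj(N);
	for (auto& kv: flav) {
		bitset<MAXCOWS> b;
		for (int x: kv.second) b[x] = 1;       // everyone who likes this flavor
		for (int x: kv.second) adj[x] |= b;    // all of them are compatible with x
	}
	long long ans = 0; // ordered compatible pairs, including (x,x)
	for (int i = 0; i < N; ++i) ans += adj[i].count();
	return ((long long)N*N-ans)/2;
}
```

With four cows liking `1 2 3 4 5`, `1 2 3 10 8`, `10 9 8 7 6`, and `50 60 70 80 90`, the row counts are 2, 3, 2, and 1, so the result is $(16-8)/2=4$.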
Apparently no test case contains more than $25000$ distinct flavors, so we don't actually need to split the calculation into two halves.
## Lots of Triangles
<problems-list problems={metadata.problems.lots} />
First, we read in the input data. `cross(a,b,c)` is positive iff `c` lies to the left of the line from `a` to `b`.
<spoiler title="Input">
```cpp
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef pair<ll,ll> P;
#define f first
#define s second
ll cross(P a, P b, P c) {
b.f -= a.f, b.s -= a.s;
c.f -= a.f, c.s -= a.s;
return b.f*c.s-b.s*c.f;
}
vector<P> v;
int N;
void input() {
ios_base::sync_with_stdio(0); cin.tie(0);
freopen("triangles.in","r",stdin);
freopen("triangles.out","w",stdout);
cin >> N; v.resize(N);
for (P& p: v) cin >> p.f >> p.s;
}
```
</spoiler>
There are $O(N^3)$ possible lots. Trying all possible lots and counting the number of trees that lie within each in $O(N)$ time, for a total time complexity of $O(N^4)$, should solve somewhere between 2 and 5 test cases. Given a triangle `t[0], t[1], t[2]` with positive signed area (that is, with its vertices listed in counterclockwise order), tree `x` lies within it iff `x` is to the left of each of the sides `(t[0],t[1])`, `(t[1],t[2])`, and `(t[2],t[0])`.
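The containment test can be isolated into a small predicate. The following is only a sketch: `inside` is a helper name introduced here, and it first reorders the vertices so that the triangle is counterclockwise before applying the three left-of checks.

```cpp
#include <utility>
using namespace std;
typedef long long ll;
typedef pair<ll,ll> P;

// Positive iff c lies strictly to the left of the directed line from a to b.
ll cross(P a, P b, P c) {
	b.first -= a.first, b.second -= a.second;
	c.first -= a.first, c.second -= a.second;
	return b.first*c.second-b.second*c.first;
}

// True iff x lies strictly inside the non-degenerate triangle (t0,t1,t2).
bool inside(P t0, P t1, P t2, P x) {
	if (cross(t0,t1,t2) < 0) swap(t1,t2); // make the order counterclockwise
	return cross(t0,t1,x) > 0 && cross(t1,t2,x) > 0 && cross(t2,t0,x) > 0;
}
```

Points on an edge give a cross product of zero and are therefore not counted as inside.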
<spoiler title="Slow Solution">
```cpp
int main() {
input();
vector<int> res(N-2);
for (int i = 0; i < N; ++i)
for (int j = i+1; j < N; ++j)
for (int k = j+1; k < N; ++k) {
vector<int> t = {i,j,k};
if (cross(v[t[0]],v[t[1]],v[t[2]]) < 0) swap(t[1],t[2]);
int cnt = 0;
for (int x = 0; x < N; ++x) {
if (cross(v[t[0]],v[t[1]],v[x]) <= 0) continue;
if (cross(v[t[1]],v[t[2]],v[x]) <= 0) continue;
if (cross(v[t[2]],v[t[0]],v[x]) <= 0) continue;
cnt ++;
}
res[cnt] ++;
}
for (int i = 0; i < N-2; ++i) cout << res[i] << "\n";
}
```
</spoiler>
The analysis describes how to count the number of trees within a lot in $O(1)$, which is sufficient to solve the problem. However, $O(N)$ per lot is actually fast enough as long as we divide by the bitset constant. Let `b[i][j][k]=1` if `k` lies to the left of side `(i,j)`. Then `x` lies within triangle `(t[0],t[1],t[2])` as long as `b[t[0]][t[1]][x]=b[t[1]][t[2]][x]=b[t[2]][t[0]][x]=1`. We can count the number of `x` for which this holds by taking the bitwise AND of the bitsets for all three sides and then counting the number of set bits in the result.
<spoiler title="Fast Solution">
```cpp
bitset<300> b[300][300];
int main() {
input();
for (int i = 0; i < N; ++i)
for (int j = 0; j < N; ++j) if (j != i)
for (int k = 0; k < N; ++k) if (cross(v[i],v[j],v[k]) > 0)
b[i][j][k] = 1;
vector<int> res(N-2);
for (int i = 0; i < N; ++i)
for (int j = i+1; j < N; ++j)
for (int k = j+1; k < N; ++k) {
vector<int> t = {i,j,k};
if (cross(v[t[0]],v[t[1]],v[t[2]]) < 0) swap(t[1],t[2]);
auto z = b[t[0]][t[1]]&b[t[1]][t[2]]&b[t[2]][t[0]];
res[z.count()] ++;
}
for (int i = 0; i < N-2; ++i) cout << res[i] << "\n";
}
```
</spoiler>
## Knapsack Again (GP of Bytedance 2020 F)
> Given $n$ ($n\le 2\cdot 10^4$) positive integers $a_1,\ldots,a_n$ ($a_i\le 2\cdot 10^4$), find the max possible sum of a subset of $a_1,\ldots,a_n$ whose sum does not exceed $c$.
If $\sum a_i\le c$, the answer is simply $\sum a_i$, so consider the case when $\sum a_i> c$. The intended solution runs in $O(n\cdot \max(a_i))$; see [here](https://github.com/bqi343/USACO/blob/master/Implementations/content/various/Knapsack.h) for more information. However, we'll solve it with bitset instead.
As with the first problem in this module, let $\texttt{dp}[i][j]=1$ if there exists a subset of the first $i$ numbers that sums to $j$. This solution runs in $O(n\cdot \sum a_i)$ time, which is too slow even if we use bitset.
Taking inspiration from [this](https://codeforces.com/blog/entry/67664) CF blog post, we'll first shuffle the integers randomly and perform the DP with the following modification:
- If $\left|\frac{ci}{n}-j\right| \ge X$ for some $X$ that we choose, then set $\texttt{dp}[i][j]=0$.
Since we only need to keep track of roughly $2X$ values of $j$ for each $i$, this solution runs in $O(nX)$ time, which is fast enough with $X=5\cdot 10^5$ using bitset.
Intuitively, the random shuffle turns the prefix sums of the optimal subset into something like a random walk around $\frac{ci}{n}$, whose deviation should be on the order of $\max a_i\cdot \sqrt n$, so it suffices to take $X\approx \max a_i\cdot \sqrt n$. (Though I'm not completely convinced that this works, does anyone know how to bound the failure probability of this algorithm precisely?)
<spoiler title="Solution">
```cpp
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
int n,c;
const int Z = 1000000;
mt19937 rng;
int solve() {
cin >> n >> c;
vector<int> a(n); int sum = 0;
for (int& x: a) {
cin >> x;
sum += x;
}
if (sum <= c) return sum;
shuffle(begin(a),end(a),rng);
bitset<Z> B; B[Z/2] = 1;
ll lst = 0;
for (int i = 0; i < n; ++i) {
ll cur = (ll)(i+1)*c/n;
int dif = cur-lst; lst = cur;
auto tmp = B>>dif;
ll wut = a[i]-dif;
if (wut >= 0) B = tmp|(B<<wut);
else B = tmp|(B>>(-wut));
}
for (int i = Z/2; i >= 0; --i) if (B[i] == 1) return c-(Z/2-i);
return 0;
}
int main() {
int T; cin >> T;
for (int i = 0; i < T; ++i) cout << solve() << "\n";
}
```
</spoiler>
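The shifts in the solution follow from re-centering the window at every step: bit `Z/2` always stands for the sum $\lfloor c\cdot i/n\rfloor$. Skipping $a_i$ leaves the sum unchanged while the center moves up by `dif`, so the bits shift right by `dif`; taking $a_i$ moves the sum by $a_i$ relative to the old center, so the bits shift by $a_i-\texttt{dif}$. Here is a hedged restatement as a function with annotations. The name `windowKnapsack`, the template parameter, and the fixed seed are choices made for this sketch, not part of the original solution.

```cpp
#include <algorithm>
#include <bitset>
#include <random>
#include <vector>
using namespace std;
typedef long long ll;

// Z is the window width; bit Z/2 represents the current center floor(c*i/n).
template <int Z>
ll windowKnapsack(vector<int> a, ll c) {
	int n = a.size();
	ll sum = 0;
	for (int x: a) sum += x;
	if (sum <= c) return sum;              // everything fits
	mt19937 rng(12345);                    // fixed seed: an arbitrary choice for reproducibility
	shuffle(a.begin(), a.end(), rng);
	bitset<Z> B; B[Z/2] = 1;               // sum 0 sits at center 0
	ll lst = 0;
	for (int i = 0; i < n; ++i) {
		ll cur = (ll)(i+1)*c/n;            // new center
		ll dif = cur-lst; lst = cur;
		bitset<Z> skip = B >> dif;         // same sum, center moved up by dif
		ll wut = a[i]-dif;                 // net movement when a[i] is taken
		bitset<Z> take = wut >= 0 ? (B << wut) : (B >> (-wut));
		B = skip|take;                     // bits shifted past either end are dropped
	}
	// The final center equals c, so scan downward for the largest sum <= c.
	for (int i = Z/2; i >= 0; --i) if (B[i]) return c-(Z/2-i);
	return 0; // nothing survived inside the window
}
```

On small inputs the window never truncates anything, so the result is exact; for instance, `windowKnapsack<64>` on the values 3, 5, 7 with $c=11$ returns 10.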
## Other Applications
Bitsets can also speed up the following (each complexity is divided by the bitset constant):
- Gaussian Elimination in $O(N^3)$
- Bipartite matching in $O(N^3)$
- BFS in $O(N^2)$
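For the first item, the bitset speedup applies when the arithmetic is modulo 2, since eliminating a row is then a single XOR of two bitsets. A minimal sketch that computes the rank of a 0/1 matrix follows; `rankGF2` and `COLS` are hypothetical names for this example.

```cpp
#include <bitset>
#include <utility>
#include <vector>
using namespace std;

const int COLS = 64; // hypothetical upper bound on the number of columns

// Rank over GF(2) of the matrix whose rows are given as bitsets; m = columns used.
int rankGF2(vector<bitset<COLS>> rows, int m) {
	int n = rows.size(), rank = 0;
	for (int col = 0; col < m && rank < n; ++col) {
		int piv = -1;
		for (int r = rank; r < n; ++r) if (rows[r][col]) { piv = r; break; }
		if (piv == -1) continue;           // no pivot in this column
		swap(rows[rank], rows[piv]);
		for (int r = 0; r < n; ++r)        // clear this column in every other row
			if (r != rank && rows[r][col]) rows[r] ^= rows[rank];
		++rank;
	}
	return rank;
}
```

Each row XOR touches `COLS/64` words instead of `COLS` entries, which is where the constant-factor gain comes from.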
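Operations such as `_Find_first()` and `_Find_next()` mentioned in Errichto's blog are helpful. Note that these are GCC (libstdc++) extensions rather than part of the C++ standard, so they may not be available with other standard libraries.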
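One way these operations might be used for BFS on a dense graph is sketched below: keep a bitset of unvisited vertices, AND it with the current vertex's adjacency row, and iterate over the set bits of the result. This relies on the libstdc++-specific `_Find_first()` and `_Find_next()`, so it assumes GCC; `bfsDist` and `BFS_N` are names invented for this example.

```cpp
#include <bitset>
#include <queue>
#include <vector>
using namespace std;

const int BFS_N = 64; // hypothetical bound on the number of vertices

// adj[u][v] = 1 iff there is an edge u-v. Returns distances from src (-1 if unreachable).
vector<int> bfsDist(int n, const vector<bitset<BFS_N>>& adj, int src) {
	vector<int> dist(n, -1);
	bitset<BFS_N> unvisited;
	for (int i = 0; i < n; ++i) unvisited[i] = 1;
	queue<int> q;
	dist[src] = 0; unvisited[src] = 0; q.push(src);
	while (!q.empty()) {
		int u = q.front(); q.pop();
		bitset<BFS_N> cand = adj[u] & unvisited; // unvisited neighbors of u
		// _Find_first / _Find_next return cand.size() when no further bit is set
		for (size_t v = cand._Find_first(); v < cand.size(); v = cand._Find_next(v)) {
			dist[v] = dist[u]+1;
			unvisited[v] = 0;
			q.push((int)v);
		}
	}
	return dist;
}
```

Each vertex is dequeued once and costs one AND over its row, plus a constant amount of work per newly discovered vertex.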
Regarding the last application:
<problems-list problems={metadata.problems.bfs} />
In USACO Camp, this problem appeared with $N\le 10^5$ and a large time limit ...
## Additional Problems
<problems-list problems={metadata.problems.ad} />