Post

CCO 2020

CCO 2020

My Experience

Note: I am writing this blog in June 2021, after the 2021 (not 2020) CCO.

2020 was my first time at the Canadian Computing Olympiad (CCO). The top 25 or so high school students from the (open) Canadian Computing Contest Senior Division (CCC Senior) are invited to the CCO, which serves as a Team Selection Test (TST) for the Canadian International Olympiad in Informatics (IOI) team of four. It normally runs for a week in May at the University of Waterloo. It was held online this year, which meant that I missed out on a lot of fun.

I had come out of the CCC with a nice lead of 15 points over many contestants. I believe that I had gotten lucky by not overestimating the difficulty of the last problem on the CCC. I did not expect to make the IOI team since there were so many contestants better than me. If I recall correctly, I expected to place around 10th.

My mock contest sessions had taught me how challenging and mentally draining the CCO would be. I did not expect the two contest days to be pleasant. In fact, I was surprised when the day 1 contest turned out to be easier than I expected. After day 1, I saw that I was tied with Andrew Dong at 4th place (scoreboard). I couldn’t believe that I had a real chance at making the IOI team!

This hope was lost on day 2 when I scored a meager 3/75. I spent most of my time grinding data structures on P5 and did nothing to get any partial marks. This was a big mistake, seeing that Andrew Dong tied for 4th place by getting a few partial marks on every problem. On day 2, I suffered from the phenomenon of “tunnel vision,” which meant that I was so fixated on one idea that I ignored everything else. In addition, my contest preparation consisted of mostly Codeforces rounds and not enough OI (informatics olympiad) material. This problem-solving focus worked well for a simpler contest like the CCC, but not for the CCO.

I ended up in 9th place, which I was happy with. I did not regret missing an opportunity to qualify for the IOI team since I had no such expectation in the first place. Plus, I still had one more opportunity the following year. I took a one-week break from programming following the CCO. Afterwards, I practised a lot of implementation-heavy OI problems. I was already getting ready for next year’s contest.

Solutions

Here’s what you have all been waiting for. My code is available here.

Day 1 Problem 1 - A Game with Grundy

Find each friend’s vision’s intersections with the line $y = Y$. If the left intersection point is at $(x_1, Y)$ and the right one is at $(x_2, Y)$, we can add two update operations: +1 at $\lceil x_1 \rceil$ and -1 at $\lfloor x_2 \rfloor +1$. Finally, line sweep over the updates from left to right.

Time complexity: $\mathcal{O}(N \log N)$.

Number of solves1: 18/20.

Day 1 Problem 2 - Exercise Deadlines

The key to solving this problem is coming up with a greedy strategy to order the exercises. First, we see that on day $N$, we can only complete the exercises that are due on day $N$. If any such exercises exist, we can greedily take the rightmost one since it requires the least amount of swaps to move to day $N$. On day $N-1$, we can complete the exercises with deadlines of day $N-1$ or $N$. We can once again greedily take the rightmost such exercise. Our greedy algorithm is to loop the days from $N$ to $1$ and on each day, take the rightmost task that is due on or after the current day. To get the total number of swaps, also known as inversions, we will use a data structure to assist us. This is guaranteed to give the minimum number of swaps. I will prove by contradiction that this is the case.

On day $i$, our greedy solution takes the exercise at index $r$. Suppose that there is an exercise at index $l$ ($l < r$) which we should take instead to get a better answer. Let $j$ be the day on which index $l$ is taken in our greedy strategy. Now let $f(x, y) = |x-y|$, which represents the cost of moving the exercise currently at day $x$ to day $y$. We find that this function satisfies the quadrangle inequality, which is $f(l, j) + f(r, i) \le f(l, i) + f(r, j)$ given that $l \le r \le j \le i$ (we know that $r \le j$ since we can not take an exercise past its due date). The quadrangle inequality is enough to show that our algorithm is correct since it contradicts our assumption that there was a better option than our greedy algorithm.

If on any day $i$, there is no exercise we can complete, then we will need to complete the remaining $i$ exercises in $i-1$ days, which is impossible.

Time Complexity: $\mathcal{O}(N \log N)$.

Number of solves: 9/20.

Day 1 Problem 3 - Mountains and Valleys (Half Editorial)

I found this to be the hardest problem of the contest.

If there are no mountainous trails, then we are left with a tree and our answer is $2(N-1) - (\text{the diameter of the tree})$.

In subtask 3, where the mountainous paths weights are restricted to $w_i \ge \lceil \frac{N}{2} \rceil$, we can show that the optimal path does not take any mountainous edge. Taking a mountainous edge from $a$ to $b$ with distance $w$ improves our answer by at most $w - \text{distance}(a, b) \ge \left\lceil \frac{N}{2} \right\rceil - \text{distance}(a, b)$. We need $\text{distance}(a, b) \ge \left\lceil \frac{N}{2} \right\rceil$ for the mountainous path to be worth taking. If this were true, it means that the diameter is at least $\left\lceil \frac{N}{2} \right\rceil$, which in turn means that our original answer of $2(N-1) - (\text{the diameter of the tree})$ was at most $2(N-1) - \left\lceil \frac{N}{2} \right\rceil$. Our new answer is at least $N-2 + \left\lceil \frac{N}{2} \right\rceil$, since we need to take at least $N-1$ edges in order to visit $n$ nodes and one of those edges is a mountainous one. If the new answer is indeed better than the old one, then we should have:

\[\newcommand\floor[1]{\left\lfloor#1\right\rfloor} \newcommand\ceil[1]{\left\lceil#1\right\rceil} \begin{align*} N-2 + \ceil{\frac{N}{2}} &< 2(N-1) - \ceil{\frac{N}{2}} \\ N-2 + \ceil{\frac{N}{2}} &< 2N-2 - \left(N + \ceil{\frac{N}{2} - N} \right) \\ N-2 + \ceil{\frac{N}{2}} &< N-2 - \ceil{\frac{N}{2} - N} \\ \ceil{\frac{N}{2}} &< - \ceil{\frac{N}{2} - N} \\ \ceil{\frac{N}{2}} &< - \left( -\floor{\frac{N}{2}} \right) \\ \ceil{\frac{N}{2}} &< \floor{\frac{N}{2}} .\\ \end{align*}\]

This is impossible, which contradicts our assumption that there was a case where a mountainous edge was needed in the optimal path. Therefore the answer in subtask 3 is always $2(N-1) - (\text{the diameter of the tree})$.

To solve the full problem, we can use similar reasoning to prove that we will need to take at most one mountainous edge. The dynamic programming involved is very cancerous and I had to do a decent amount of constant optimization to get my solution to pass. My AC solution on DM::OJ is 287 lines long and passes right under the time limit. I wish you luck if you are attempting to solve this problem.

Time Complexity: $\mathcal{O}((N + M) \log N)$ with a large constant factor. The $\log N$ comes from binary lifting.

Number of solves: 1/20.

P.S. There also exists a (slower?) centroid decomposition solution that I do not understand at all.

Day 2 Problem 1 (Problem 4) - Travelling Salesperson

Here is a solution that visits exactly $2N$ buildings. We can arbitrarily choose a starting building and, for every building connected to it with a red road, travel there and back. Then, for every building that is connected to the starting building with a blue road, we also go there and back. How many points will this yield? We can submit this code to find that we get exactly 8 points, which means that $2N$ is twice the length of the optimal route and that the optimal route’s length is exactly $N$.

Consider a valid path, $b$, of length $k$ ($ b_0, b_1, …, b_{k-2}, b_{k-1}$). Without loss of generality, the path starts off with some red roads, then switches to using blue roads. When we want to add a new building $c$ to the path, we will try connecting it to either $b_0$ or $b_{k-1}$. If the road $(b_0, c)$ is red, then we can add $c$ to the start of the path. If the road ($b_{k-1}, c)$ is blue, then we can add $c$ to the end of the path. If neither of these are true, we consider the colour of road $(b_0, b_{k-1})$. If it is blue, then we can change our path into $b_1, …, b_{k-2}, b_{k-1}, b_0, c$. If it is red, then we can change our path to be $c, b_{k-1}, b_0, b_1, …, b_{k-2}$. We can keep doing this until we get a path of length $N$ containing every building exactly once.

Time Complexity: $\mathcal{O}(N^2)$ (the slowest part is reading the input).

Number of solves: 1/20.

Day 2 Problem 2 (Problem 5) - Interval Collection

Let $minr_i$ be equal to the leftmost right endpoint of a segment whose left endpoint is at $i$. If no such right endpoint exists, then $minr_i = \infty$. Likewise, let $maxl_i$ be equal to the rightmost left endpoint of a segment whose right endpoint is at $i$. If no such left endpoint exists, then $maxl_i = -\infty$. We will make a segment tree over each of these lists. Let $L(l, r)$ equal the maximum $maxl_i$ in the range $l \le i \le r$ and let $R(l, r)$ equal the minimum $minr_i$ in the range $l \le i \le r$.

We will be solving this problem online and updating the segment tree as we add or remove new segments. At each point in time, we will first try to take a set of segments that do not intersect. We can check if any such segments exist by querying the global $minr_g$ and $maxl_g$. When the minimum right endpoint is less than or equal to the highest left endpoint, there exists at least one way to take a set of non-overlapping/disjoint segments: any segment with its right endpoint at $minr_g$ and any segment with its left endpoint at $maxl_g$ will work (I do not yet claim that this is optimal).

How do we find the optimal set of disjoint segments to take? In a set of two disjoint segments ${[l_1, r_1], [l_2, r_2]}, r_1 \le l_2$, the length is $r_2-l_1$. Adding another segment will never decrease the length of the least enclosing interval, so we don’t need to consider sets with more than two segments. First we will try to solve the problem if we never need to remove segments. In this case, after adding a segment $[l, r]$, the only new possible sets we need to consider to make our answer better are the ones with least enclosing intervals of $[l, R(r, \infty)]$ and $[L(-\infty, l), r]$. We will update our current answer if either of these is better. Our answer will only get better as we add more segments. To extend our solution to allow for removal of segments, we can “combine” pairs of intervals as we merge segment tree nodes. At a leaf node corresponding to location $i$, our answer is $ans_\text{node} = minr_i - maxl_i$. At other segment tree nodes, the answer for the whole node is $ans_\text{node} = \min(ans_\text{left child}, ans_\text{right child}, minr_\text{right child} - maxl_\text{left child})$. Our final answer is $ans_\text{root node}$. This merging works since we are forcing left segments to end before right segments start.

When there are no sets of disjoint segments, we are told to minimize the greatest common interval. This common interval will be $[maxl_g, minr_g]$. This means that we take a segment with the least right endpoint and a segment with the highest left endpoint. These two intervals are $[maxl_{minr_g}, minr_g]$ and $[maxl_g, minr_{maxl_g}]$. Note that we only take one segment when these are the same because the greatest common interval is just the single segment.

Time Complexity: $\mathcal{O}(N \log N)$.

Number of solves: 1/20.

Day 2 Problem 3 (Problem 6) - Shopping Plans

This problem uses a technique called fracturing search. It’s very similar to using a modified Dijkstra’s algorithm to find the $K^\text{th}$ shortest path, but the problem with a naive implementation is that the number of states and transitions we need might be huge. We want an $\mathcal{O}(1)$ transition and $\mathcal{O}(K)$ states.

First let’s assume that there is only one type of item. We sort all the items in non-decreasing order (of cost) and add the first $x_0$ of them to our base cost. We will call this the base state: the one that costs the least, with a cost of our “base cost” variable. Imagine that we have pointers (like arrows) pointing at each of the first $x_0$ items, indicating which ones we are taking. The way we will get to an arbitrary state is by first moving the rightmost pointer to the right, one step at a time until it reaches the rightmost item we need. We will never touch this pointer again. Let’s say that we “glue it down.” Now we begin moving the second rightmost pointer up until it reaches the second rightmost item we plan on taking and glue it down. Repeat this for every pointer we have. Notice that we never have two pointers point at the same item and we never “jump” a pointer over another one as it moves right. We may also add new pointers if we need to add more pointers to reach this arbitrary state. We will add additional pointers pointing to the leftmost item only when it is not already being pointed at.

We can let our state be $(j, cnt, maxr)$. $j$ means that the rightmost pointer that hasn’t been glued down is at index $j$ from the left. When $cnt < x_0$, $cnt$ means how many pointers have been moved from their original spots. When $cnt \ge x_0$, $cnt$ means how many pointers we currently have in our state. $maxr$ is the index of the leftmost pointer that has been glued down. This is needed so that we don’t move future pointers at or beyond this point.

The only transitions we need are:

  1. Moving $j$ right by one.
  2. Gluing down the current $j$ and moving the next pointer (that is among one of the original $x_0$ ones) to the right by one.
  3. Creating a new pointer that points to the first item.

Notice that every state corresponds to a unique set of choices (a state may correctly appear multiple times, since there may be multiple ways to reach a state). We have an $\mathcal{O}(1)$ transition and we only need to visit exactly $K$ (not necessarily unique) states.

To extend our solution to work with multiple rows, we can add an $i$ to our state, representing which row we’re currently processing. We will process the rows from top to bottom and add transitions to go from the current state $(i, j, cnt, maxr)$ to a state in the following row. When we do this, we need to make sure that the new state represents a different set of pointers, otherwise our algorithm would not work. We can do this by forcing ourselves to, in addition to going to the next row, moving the rightmost pointer on the next row to the right by one. Let $base(i)$ denote the “base state” of row $i$, that which has the last cost. Let $right(i)$ denote the base state but with the rightmost pointer moved right by one. What if we want to skip a row? We can create a transition from the $right(i)$ to $right(i+1)$, but also subtracts the cost difference between $right(i)$ and $base(i)$. This essentially skips row $i$ without moving any pointers in its row. To ensure that we visit our states in non-decreasing order of cost, we need to first sort the rows based on their ${x_i+1}^\text{th}$ elements. Watch out for numerous edge cases when implementing this.

Time Complexity: $\mathcal{O}(N \log N + M \log M + K \log K)$.

Number of solves: 0/20.

Closing remarks

This blog post took me forever to write. At least I learned a few things along the way. For example, I found out that one of my passing solutions to P5 ran in $\mathcal{O}(N^2 \log N)$ rather than the intended $\mathcal{O}(N \log N)$.

Despite all the time it took to create this post, I plan on continuing with these blogs as I have plenty of time this summer. See you soon! 🙂

FOOTNOTES
  1. Data not available for 4 people. ↩︎

Please don't redistribute or adapt this work.