Nearly Optimal Bounds for Stochastic Online Sorting
Yang Hu
TL;DR
This work nearly resolves the stochastic online sorting problem by showing that the expected cost can be driven down to nearly logarithmic in $n$. The author introduces two core techniques, adaptive allocation and segment synchronization with dampening buffers, and combines them in a recursive final algorithm. The paper proves an upper bound of $E[\text{cost}] = \log n\cdot 2^{O(\log^* n)}$ and a lower bound of $\Omega(\log n)$, which match up to the $2^{O(\log^* n)}$ factor, together with a high-probability polylogarithmic bound for a non-recursive variant. This establishes near-optimal performance for stochastic online sorting and offers techniques potentially applicable to hashing and other online assignment problems.
Abstract
In the online sorting problem, we have an array $A$ of $n$ cells, and receive a stream of $n$ items $x_1,\dots,x_n\in [0,1]$. When an item arrives, we need to immediately and irrevocably place it into an empty cell. The goal is to minimize the sum of absolute differences between items in adjacent cells, which is called the \emph{cost} of the algorithm. It has been shown by Aamand, Abrahamsen, Beretta, and Kleist (SODA 2023) that when the stream $x_1,\dots,x_n$ is generated adversarially, the optimal cost bound for any deterministic algorithm is $\Theta(\sqrt{n})$. In this paper, we study the stochastic version of online sorting, where the input items $x_1,\dots,x_n$ are sampled uniformly at random. Despite the intuition that the stochastic version should yield much better cost bounds, the previous best algorithm for stochastic online sorting by Abrahamsen, Bercea, Beretta, Klausen, and Kozma (ESA 2024) only achieves $\tilde{O}(n^{1/4})$ cost, which seems far from optimal. We show that stochastic online sorting indeed allows for much more efficient algorithms, by presenting an algorithm that achieves expected cost $\log n\cdot 2^{O(\log^* n)}$. We also prove a cost lower bound of $\Omega(\log n)$, thus showing that our algorithm is nearly optimal.
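
To make the problem statement concrete, the following Python sketch simulates the stochastic setting and evaluates the cost $\sum_{i=1}^{n-1} |A[i+1]-A[i]|$ of a filled array. The function names `cost` and `greedy_place` are hypothetical, and the placement rule (put $x$ in the empty cell nearest to index $\lfloor xn \rfloor$) is only a naive baseline assumed for illustration. It is not the algorithm of this paper, and no cost guarantee is claimed for it.

```python
import random


def cost(cells):
    """Sum of absolute differences between items in adjacent cells."""
    return sum(abs(b - a) for a, b in zip(cells, cells[1:]))


def greedy_place(stream):
    """Naive baseline (an assumption, not the paper's method):
    place each item x in the empty cell closest to index floor(x * n)."""
    n = len(stream)
    cells = [None] * n
    for x in stream:
        target = min(int(x * n), n - 1)  # clamp so x == 1.0 stays in range
        # Search outward from the target for the nearest empty cell.
        for d in range(n):
            for j in (target - d, target + d):
                if 0 <= j < n and cells[j] is None:
                    cells[j] = x  # irrevocable placement
                    break
            else:
                continue  # no empty cell at distance d, widen the search
            break
    return cells


rng = random.Random(0)  # fixed seed so the run is reproducible
stream = [rng.random() for _ in range(1000)]  # items uniform in [0, 1)
cells = greedy_place(stream)
online_cost = cost(cells)
offline_cost = cost(sorted(stream))  # offline optimum: max - min
```

Comparing `online_cost` with `offline_cost` shows the gap between an online placement and the sorted arrangement, which is the quantity the bounds in the abstract control.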
