Nearly Optimal List Labeling

Michael A. Bender; Alex Conway; Martín Farach-Colton; Hanna Komlós; Michal Koucký; William Kuszmaul; Michael Saks

TL;DR

The paper tackles the dynamic list-labeling problem: maintaining a sorted set of up to $n$ elements in an array of size $m=(1+\Theta(1))n$ under online insertions and deletions while moving as few elements as possible. It introduces the See-Saw Algorithm, a randomized, history-dependent data structure that partitions the array into a recursive subproblem tree, uses random rebuild windows, and adaptively allocates slots to subproblems based on past insertion patterns. The central result is a nearly optimal amortized expected cost of $O\left(\log n\,\operatorname{polyloglog} n\right)$ per operation, which matches the known $\Omega(\log n)$ lower bound up to polyloglog factors. The analysis combines random window sizes with adaptive array skews via the See-Saw Lemma. The paper also develops a series of reductions and probabilistic analyses that bound rebuild costs, resets, and the likelihood of expensive leaves, and it notes potential implications for cache-oblivious structures and related dynamic data-structure problems.

Abstract

The list-labeling problem captures the basic task of storing a dynamically changing set of up to $n$ elements in sorted order in an array of size $m = (1 + \Theta(1))n$. The goal is to support insertions and deletions while moving around elements within the array as little as possible. Until recently, the best known upper bound stood at $O(\log^2 n)$ amortized cost. This bound, which was first established in 1981, was finally improved two years ago, when a randomized $O(\log^{3/2} n)$ expected-cost algorithm was discovered. The best randomized lower bound for this problem remains $\Omega(\log n)$, and closing this gap is considered to be a major open problem in data structures. In this paper, we present the See-Saw Algorithm, a randomized list-labeling solution that achieves a nearly optimal bound of $O(\log n \operatorname{polyloglog} n)$ amortized expected cost. This bound is achieved despite at least three lower bounds showing that this type of result is impossible for large classes of solutions.
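To make the problem statement concrete, here is a minimal Python sketch of a naive baseline. It is not the See-Saw Algorithm, and it performs none of the rebalancing that the algorithms discussed in the abstract rely on. The class name `NaiveListLabeling`, the `moves` counter, and the cost convention (one unit per element written into a slot, including the newly inserted element) are assumptions made for illustration, and keys are assumed to be distinct.

```python
from typing import List, Optional


class NaiveListLabeling:
    """Hypothetical baseline: sorted keys in an array with gaps.

    On each insertion, elements are shifted toward the nearest free slot.
    The cost model (one unit per element written) is an assumption.
    """

    def __init__(self, m: int) -> None:
        self.slots: List[Optional[int]] = [None] * m
        self.moves = 0  # total number of element writes so far

    def contents(self) -> List[int]:
        """Return the stored keys in array order (gaps skipped)."""
        return [v for v in self.slots if v is not None]

    def insert(self, x: int) -> None:
        s, m = self.slots, len(self.slots)
        # Index of the first stored key greater than x (m if there is none).
        succ = next((i for i, v in enumerate(s) if v is not None and v > x), m)
        # Nearest free slot strictly left of succ, and at or right of succ.
        left = next((j for j in range(succ - 1, -1, -1) if s[j] is None), None)
        right = next((j for j in range(succ, m) if s[j] is None), None)
        if left is None and right is None:
            raise OverflowError("array is full")
        use_left = right is None or (
            left is not None and succ - 1 - left <= right - succ
        )
        if use_left:
            # Shift the run s[left+1 .. succ-1] one slot left, then place x.
            s[left:succ - 1] = s[left + 1:succ]
            s[succ - 1] = x
            self.moves += (succ - 1 - left) + 1
        else:
            # Shift the run s[succ .. right-1] one slot right, then place x.
            s[succ + 1:right + 1] = s[succ:right]
            s[succ] = x
            self.moves += (right - succ) + 1


arr = NaiveListLabeling(8)
for key in [5, 1, 9, 3, 7]:
    arr.insert(key)
print(arr.contents(), arr.moves)
```

Because this baseline never redistributes free slots, the stored keys form one contiguous run, and an insertion may have to shift a large part of that run. Avoiding this kind of shifting is what the amortized bounds quoted in the abstract are about.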

Paper Structure (42 sections, 28 theorems, 60 equations)

Introduction
Past upper and lower bounds.
This paper: nearly optimal list labeling.
A remark on other parameter regimes.
Implications to other algorithmic problems.
Paper outline.
Preliminaries
Defining the list-labeling problem.
Conventions.
Main Results
The See-Saw Algorithm
Defining a subproblem tree.
How an insertion decides its root-to-leaf path.
Implementing leaves.
Initializing a subtree.
...and 27 more sections

Key Result

Theorem 1

For $\delta \in (0, 1)$, and $m = (1 + \delta)n$, there is a solution to the list-labeling problem on an array of size $m$, and with up to $n$ elements present at a time, that supports amortized expected cost $O(\delta^{-1} (\log n) (\log \log n)^3)$ per insertion and deletion.
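
As a short worked check (a sketch that ignores hidden constants), Theorem 1 can be related to the bounds quoted in the abstract. For constant slack $\delta = \Theta(1)$, the factor $\delta^{-1}$ is absorbed into the $O(\cdot)$:

$$
O\!\left(\delta^{-1} (\log n)(\log\log n)^3\right) = O\!\left(\log n \,(\log\log n)^3\right) = O\!\left(\log n \operatorname{polyloglog} n\right).
$$

Dividing by the $\Omega(\log n)$ lower bound leaves a gap of at most $O\!\left((\log\log n)^3\right)$. Compared with the earlier $O(\log^{3/2} n)$ bound, the ratio tends to zero as $n$ grows:

$$
\frac{(\log n)(\log\log n)^3}{\log^{3/2} n} = \frac{(\log\log n)^3}{\sqrt{\log n}} \longrightarrow 0 \quad \text{as } n \to \infty .
$$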

Theorems & Definitions (52)

Theorem 1
Corollary 2
Theorem 3
Corollary 3
Theorem 3
Proposition 4
proof
Corollary 5
proof
Lemma 6
...and 42 more
