Tight Bounds for Sorting Under Partial Information

Ivor van der Hoog, Daniel Rutschmann

TL;DR

Sorting under partial information asks to recover a hidden linear extension $L$ of a partial order $P$ on a ground set $X$, using as few linear-oracle queries as possible; the information content is captured by $e(P)$, the number of linear extensions of $P$, which gives the information-theoretic lower bound of $\log e(P)$ queries. The authors present an algorithm that, for any constant $c \ge 1$, preprocesses $P$ in $O(n^{1+1/c})$ time (subquadratic for $c > 1$) and recovers $L$ with $Θ(c \log e(P))$ linear-oracle queries in $O(c \log e(P))$ time, plus a matching lower bound showing this trade-off is tight across preprocessing, queries, and time. The method combines a greedy chain decomposition and Huffman-like chain merging with exponential search, supported by entropy-based arguments on the incomparability graph that establish both the upper and the lower bound. The results give a tight three-way bound for sorting under partial information, offering the first subquadratic preprocessing scheme with provable optimality in query complexity and runtime for all constant trade-offs.
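To make the "chain merging with exponential search" ingredient concrete, the following is a minimal Python sketch of merging two chains that are each already sorted under a hidden linear order, using a galloping (exponential) search to place elements of the shorter chain into the longer one. It is an illustration of the general technique only, not the authors' algorithm: the names `make_oracle`, `exponential_search` and `merge_chains` are hypothetical, and the oracle is simulated with a known ranking so that queries can be counted.

```python
def make_oracle(rank):
    """Simulated linear oracle (assumption): rank maps each element to its
    position in the hidden linear order. Returns the oracle and a counter."""
    counter = {"queries": 0}

    def less(a, b):
        counter["queries"] += 1
        return rank[a] < rank[b]

    return less, counter


def exponential_search(chain, start, x, less):
    """Return the first index i >= start such that chain[i] is not below x
    (or len(chain) if every remaining element is below x)."""
    lo = hi = start
    step = 1
    # Gallop: probe at exponentially growing offsets until we overshoot x.
    while hi < len(chain) and less(chain[hi], x):
        lo = hi + 1
        hi += step
        step *= 2
    hi = min(hi, len(chain))
    # Binary search inside the bracket [lo, hi).
    while lo < hi:
        mid = (lo + hi) // 2
        if less(chain[mid], x):
            lo = mid + 1
        else:
            hi = mid
    return lo


def merge_chains(a, b, less):
    """Merge two chains, each sorted under the hidden order, into one."""
    if len(a) < len(b):
        a, b = b, a  # insert the shorter chain into the longer one
    merged, pos = [], 0
    for x in b:
        nxt = exponential_search(a, pos, x, less)
        merged.extend(a[pos:nxt])
        merged.append(x)
        pos = nxt
    merged.extend(a[pos:])
    return merged


# Usage: the hidden order is the natural order on 0..9.
less, counter = make_oracle({v: v for v in range(10)})
result = merge_chains([0, 1, 2, 3, 4, 5, 6, 8], [7, 9], less)
print(result, counter["queries"])
```

The galloping step keeps the number of oracle queries per inserted element logarithmic in the distance it travels along the longer chain, which is the property that makes exponential search attractive when one chain is much shorter than the other.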

Abstract

Sorting has a natural generalization where the input consists of: (1) a ground set $X$ of size $n$, (2) a partial oracle $O_P$ specifying some fixed partial order $P$ on $X$ and (3) a linear oracle $O_L$ specifying a linear order $L$ that extends $P$. The goal is to recover the linear order $L$ on $X$ using the fewest number of linear oracle queries. In this problem, we measure algorithmic complexity through three metrics: oracle queries to $O_L$, oracle queries to $O_P$, and the time spent. Any algorithm requires worst-case $\log_2 e(P)$ linear oracle queries to recover the linear order on $X$. Kahn and Saks presented the first algorithm that uses $Θ(\log e(P))$ linear oracle queries (using $O(n^2)$ partial oracle queries and exponential time). The state-of-the-art for the general problem is by Cardinal, Fiorini, Joret, Jungers and Munro who at STOC'10 manage to separate the linear and partial oracle queries into a preprocessing and query phase. They can preprocess $P$ using $O(n^2)$ partial oracle queries and $O(n^{2.5})$ time. Then, given $O_L$, they uncover the linear order on $X$ in $Θ(\log e(P))$ linear oracle queries and $O(n + \log e(P))$ time -- which is worst-case optimal in the number of linear oracle queries but not in the time spent. For $c \geq 1$, our algorithm can preprocess $O_P$ using $O(n^{1 + \frac{1}{c}})$ queries and time. Given $O_L$, we uncover $L$ using $Θ(c \log e(P))$ queries and time. We show a matching lower bound, as there exist positive constants $(α, β)$ where for any constant $c \geq 1$, any algorithm that uses at most $α\cdot n^{1 + \frac{1}{c}}$ preprocessing must use worst-case at least $β\cdot c \log e(P)$ linear oracle queries. Thus, we solve the problem of sorting under partial information through an algorithm that is asymptotically tight across all three metrics.
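As a small illustration of the $\log_2 e(P)$ lower bound, the sketch below counts the linear extensions of a tiny partial order by brute force and reports the resulting minimum number of linear oracle queries. The function name `count_linear_extensions` and the example poset are assumptions made for illustration; brute-force enumeration is only feasible for very small $n$ and is not part of the paper's algorithm.

```python
from itertools import permutations
from math import ceil, log2


def count_linear_extensions(n, relations):
    """Count the orderings of 0..n-1 that respect every pair (a, b),
    where (a, b) means a must come before b in the partial order."""
    count = 0
    for perm in permutations(range(n)):
        position = {x: i for i, x in enumerate(perm)}
        if all(position[a] < position[b] for a, b in relations):
            count += 1
    return count


# Hypothetical poset on 4 elements: two disjoint chains 0 < 1 and 2 < 3.
relations = [(0, 1), (2, 3)]
e_P = count_linear_extensions(4, relations)
print(e_P, ceil(log2(e_P)))  # prints: 6 3
```

Here $e(P) = 6$ (the number of ways to interleave two chains of length two), so any algorithm needs at least $\lceil \log_2 6 \rceil = 3$ linear oracle queries in the worst case to identify $L$.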

Paper Structure (29 sections, 27 theorems, 14 equations, 1 figure, 1 table, 5 algorithms)

Key Result

Theorem 1

For every constant $c \ge 1$, there is a deterministic algorithm that first preprocesses $P$ in $O(n^{1+1/c})$ time, and then asks $O(c \log e(P))$ linear oracle queries in $O(c \log e(P))$ time to recover the linear order on $X$, represented as a leaf-linked tree on $X$.
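To see the trade-off in Theorem 1 at a glance, the bounds can be instantiated for a few values of $c$ by direct substitution. The factor $c$ is kept inside the $O(\cdot)$ purely to show how the query bound scales; for a fixed constant $c$ it would normally be absorbed.

$$
\begin{aligned}
c = 1 &: \quad O(n^{2}) \text{ preprocessing}, \quad O(\log e(P)) \text{ queries and time},\\
c = 2 &: \quad O(n^{3/2}) \text{ preprocessing}, \quad O(2 \log e(P)) \text{ queries and time},\\
c = 4 &: \quad O(n^{5/4}) \text{ preprocessing}, \quad O(4 \log e(P)) \text{ queries and time}.
\end{aligned}
$$

Larger $c$ pushes the preprocessing cost toward linear while increasing the query phase by the same factor $c$.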

Figures (1)

  • Figure 1: Our family of pairs of partial orders with linear extensions $\{ (P_i, L_i) \}$ can be constructed in two stages. (a) Step 1: partition the vertices $x_i$ for $i \leq n/2$ into $w$ equal-size chains. (b) Step 2: for all $\ell > n/2$, arbitrarily assign $x_\ell$ to lie between any two connected vertices of the previous construction. All $x_\ell$ that get assigned the same pair $(x_{j+kw}, x_{j + (k+1)w})$ are linearly ordered by $\ell$. The corresponding linear order $L_i$ is obtained by adding the red edges.

Theorems & Definitions (55)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Lemma 1
  • Definition 1
  • Lemma 2
  • Lemma 3: Theorem 1.1 in kahn_entropy_1992, improved to Lemma 4 in cardinal_sorting_2010
  • Theorem 5: Theorem 2.1 in cardinal_efficient_2010, rephrased as in Theorem 1 in cardinal_sorting_2010
  • Lemma 4: Theorem 2 in cardinal_sorting_2010
  • ...and 45 more