Tight Better-Than-Worst-Case Bounds for Element Distinctness and Set Intersection

Ivor van der Hoog; Eva Rotenberg; Daniel Rutschmann

TL;DR

The paper tackles element distinctness and set intersection in the comparison-based model, where the classical worst-case bound $\Omega(n \log n)$ becomes informative only when the input has few duplicates. It introduces a universal optimality framework by encoding input duplication as a graph $G(I)$ (a union of cliques) and proving instance-sensitive lower bounds, alongside adaptive algorithms that match these bounds up to constants. The main results show a tight $\Theta(\log\log n)$-competitive bound for element distinctness and a $\Theta(\log n)$-competitive bound for set intersection, with an accompanying preprocessing variant achieving $O(1)$-competitiveness for fixed input structures. This establishes a clear separation between the two problems under input structure constraints and provides a comprehensive framework for better-than-worst-case analysis in classic combinatorial problems.

Abstract

The element distinctness problem takes as input a list $I$ of $n$ values from a totally ordered universe and the goal is to decide whether $I$ contains any duplicates. It is a well-studied problem with a classical worst-case $\Omega(n \log n)$ comparison-based lower bound by Fredman. At first glance, this lower bound appears to rule out any algorithm more efficient than the naive approach of sorting $I$ and comparing adjacent elements. However, upon closer inspection, the $\Omega(n \log n)$ bound does not apply if the input has many duplicates. We therefore ask: Are there comparison-based lower bounds for element distinctness that are sensitive to the amount of duplicates in the input? To address this question, we derive instance-specific lower bounds. For any input instance $I$, we represent the combinatorial structure of the duplicates in $I$ by an undirected graph $G(I)$ that connects identical elements. Each such graph $G$ is a union of cliques, and we study algorithms by their worst-case running time over all inputs $I'$ with $G(I') \cong G$. We establish an adversarial lower bound showing that, for any deterministic algorithm $\mathcal{A}$, there exists a graph $G$ and an algorithm $\mathcal{A}'$ that, for all inputs $I$ with $G(I) \cong G$, is a factor $O(\log \log n)$ faster than $\mathcal{A}$. Consequently, no deterministic algorithm can be $o(\log \log n)$-competitive for all graphs $G$. We complement this with an $O(\log \log n)$-competitive deterministic algorithm, thereby obtaining tight bounds for element distinctness that go beyond classical worst-case analysis. We subsequently study the related problem of set intersection. We show that no deterministic set intersection algorithm can be $o(\log n)$-competitive, and provide an $O(\log n)$-competitive deterministic algorithm. This shows a separation between element distinctness and the set intersection problem.
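To make the objects in the abstract concrete, the following minimal Python sketch shows the naive sort-and-compare baseline and one way to compute the cliques of $G(I)$. It is an illustration only, not the paper's adaptive algorithm. The function names `has_duplicates`, `duplicate_cliques`, and `clique_sizes` are hypothetical, and the sketch assumes that the input values support Python's built-in ordering.

```python
def has_duplicates(values):
    """Naive baseline: sort the input, then compare adjacent elements."""
    ordered = sorted(values)
    return any(a == b for a, b in zip(ordered, ordered[1:]))


def duplicate_cliques(values):
    """Return the cliques of G(I) as lists of input indices.

    Indices are sorted by their value, so equal values become adjacent
    and each maximal run of equal values forms one clique.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    cliques = []
    for i in order:
        if cliques and values[cliques[-1][0]] == values[i]:
            cliques[-1].append(i)  # same value as the current clique
        else:
            cliques.append([i])    # first index with a new value
    return cliques


def clique_sizes(values):
    """Multiset of clique sizes, in non-increasing order.

    Two unions of cliques are isomorphic exactly when these lists agree,
    so this serves as a canonical form of G(I) up to isomorphism.
    """
    return sorted((len(c) for c in duplicate_cliques(values)), reverse=True)
```

For example, the inputs `[3, 1, 3, 2, 1]` and `[7, 7, 8, 9, 9]` both consist of two cliques of size 2 and one of size 1, so their graphs are isomorphic and they fall into the same class of inputs over which an algorithm's worst-case running time is measured.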
