On the instance optimality of detecting collisions and subgraphs

Omri Ben-Eliezer; Tomer Grossman; Moni Naor

TL;DR

This work analyzes whether prior unlabeled knowledge of the input's structure helps in detecting small substructures in graphs and functions, and introduces the notion of $g(n)$-instance optimality. It establishes a trichotomy: a few properties are truly instance optimal, many are polynomially far from instance optimality, and collisions and claws sit in between, being $\Omega(\log n)$-far in general while admitting a nearly instance-optimal algorithm in the merging-free regime. The key technical tool is multi-scale walking, culminating in the $A_{\text{all-scales}}$ strategy, which stays within a logarithmic factor of the certificate-aware baseline in regimes where merges are unlikely. The results show how structural knowledge interacts with query complexity, with implications for subgraph detection, collision search, and related cryptographic and graph-theoretic problems. The paper also discusses model robustness across several data access models and outlines open directions for tightening the bounds for the collision and claw problems.

Abstract

Suppose you are given a function $f\colon [n] \to [n]$ via (black-box) query access to the function. You are looking to find something local, like a collision (a pair $x \neq y$ s.t. $f(x)=f(y)$). The question is whether knowing the "shape" of the function helps you or not (by shape we mean that some permutation of the function is known). Formally, we investigate the unlabeled instance optimality of substructure detection problems in graphs and functions. A problem is $g(n)$-instance optimal if it admits an algorithm $A$ satisfying that for any possible input, the (randomized) query complexity of $A$ is at most $g(n)$ times larger than the query complexity of any algorithm $A'$ which solves the same problem while holding an unlabeled copy of the input (i.e., any $A'$ that "knows the structure of the input"). Our results point to a trichotomy of unlabeled instance optimality among substructure detection problems in graphs and functions:

1. A few very simple properties have an $O(1)$-instance optimal algorithm.
2. Most properties of graphs and functions, with examples such as containing a fixed point or a $3$-collision in functions, or a triangle in graphs, are $n^{\Omega(1)}$-far from instance optimality.
3. The problems of collision detection in functions and finding a claw in a graph serve as a middle ground between the two regimes. We show that these two properties are $\Omega(\log n)$-far from instance optimality, and conjecture that this bound is tight. We provide evidence towards this conjecture, by proving that finding a claw in a graph is $O(\log n)$-instance optimal among all input graphs for which the query complexity of an algorithm holding an unlabeled certificate is $O\left(\sqrt{\frac{n}{\log n}}\right)$.
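To make the query model concrete, here is a minimal Python sketch of black-box access to a function together with a naive collision search. It is not the paper's algorithm and does not use any unlabeled certificate; it only illustrates what is being counted when the abstract speaks of query complexity. The names `QueryOracle` and `find_collision_by_sampling` are hypothetical, and the domain is taken to be $\{0, \dots, n-1\}$ rather than $[n]$ for convenience.

```python
import random
from typing import Dict, Optional, Sequence, Tuple


class QueryOracle:
    """Black-box access to f: {0..n-1} -> {0..n-1} that counts queries.

    Hypothetical helper for illustration; the table is hidden from the
    algorithm, which may only call the oracle.
    """

    def __init__(self, table: Sequence[int]) -> None:
        self._table = list(table)
        self.n = len(self._table)
        self.queries = 0

    def __call__(self, x: int) -> int:
        self.queries += 1
        return self._table[x]


def find_collision_by_sampling(
    f: QueryOracle,
    rng: random.Random,
    max_queries: Optional[int] = None,
) -> Optional[Tuple[int, int]]:
    """Query distinct points in random order until two share an image.

    Returns a pair (x, y) with x != y and f(x) == f(y), or None if no
    collision shows up within the query budget. This is a structure-oblivious
    baseline, assumed here only for illustration.
    """
    budget = f.n if max_queries is None else min(max_queries, f.n)
    order = list(range(f.n))
    rng.shuffle(order)  # sample domain points without replacement

    seen: Dict[int, int] = {}  # image value -> a preimage already queried
    for x in order[:budget]:
        y = f(x)
        if y in seen:
            return seen[y], x  # distinct preimages with the same image
        seen[y] = x
    return None
```

With the default budget the search queries every point, so it returns `None` only when the function is a permutation. Reading `f.queries` afterwards gives the number of queries spent on that particular input, which is the quantity that instance optimality compares against an algorithm that knows the input's shape.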

Paper Structure

This paper contains 33 sections, 21 theorems, 32 equations, and 2 figures.

Introduction
Substructure detection problems.
Models.
Main Results and Discussion
Almost instance optimality of claws and collisions?
The challenge of merging walks.
Model robustness.
Technical Overview: Collisions and Fixed Points
Fixed point detection: Polynomially far from instance optimality
Finding Collisions: $\Omega(\log n)$ far from instance optimality
Finding claws: $O(\log n)$-close to instance optimality in merging-free regime
Merging without claws requires $\Omega\left(\sqrt{\frac{n}{\log n}}\right)$ queries.
Asymptotic stochastic dominance of $A_{\text{all-scales}}$
Related Work
Unlabeled instance optimality.
...and 18 more sections

Key Result

Theorem 1.1

Let $H$ be a connected, constant-sized oriented graph with maximum outdegree $1$, and consider the function property $\mathcal{P}_H$ of containing $H$ as a substructure. Then $\mathcal{P}_H$ is

Figures (2)

Figure 1: There are $n^{1/4}/\log n$ cycles, where each cycle is of length $n^{3/4}$. Each path entering a cycle is of size $n^{1/4}$. The distance between every two paths on the $i$-th cycle is $p_i$.
Figure 2: Construction for Lemma \ref{lemma:paths}

Theorems & Definitions (69)

Definition 1: instance optimality; informal
Definition 2: distance from instance optimality; informal
Theorem 1.1: Instance optimality of substructure detection in functions
Theorem 1.2: Instance optimality of subgraph detection in graphs
Conjecture 1
Conjecture 2
Theorem 1.3: informal; see Theorem \ref{thm:near_instance_optimality}
Definition 3: Unlabeled Certificate Complexity
Definition 4: instance optimality
Definition 5: $\omega$-far from instance optimality
...and 59 more
