On the instance optimality of detecting collisions and subgraphs

Omri Ben-Eliezer, Tomer Grossman, Moni Naor

TL;DR

This work asks whether prior unlabeled knowledge of the input's structure helps in detecting small substructures in graphs and functions, and introduces the notion of $g(n)$-instance optimality to measure this. It establishes a trichotomy: a few properties are instance optimal up to a constant factor, many are polynomially far from instance optimality, and collisions and claws sit in between, being $\Omega(\log n)$-far in general while admitting a near-instance-optimal algorithm in the merging-free regime. The key technical tool is multi-scale walking, culminating in the $A_{\text{all-scales}}$ strategy, which is within a logarithmic factor of the certificate-aware baseline in regimes where merges are unlikely. The results clarify how structural knowledge interacts with query complexity, with implications for subgraph detection, collision search, and related cryptographic and graph-theoretic problems. The paper also discusses the robustness of the model across several data access models and outlines open directions for tightening the bounds on the collision and claw problems.

Abstract

Suppose you are given a function $f\colon [n] \to [n]$ via (black-box) query access to the function. You are looking to find something local, like a collision (a pair $x \neq y$ s.t. $f(x)=f(y)$). The question is whether knowing the "shape" of the function helps you or not (by shape we mean that some permutation of the function is known). Formally, we investigate the unlabeled instance optimality of substructure detection problems in graphs and functions. A problem is $g(n)$-instance optimal if it admits an algorithm $A$ satisfying that for any possible input, the (randomized) query complexity of $A$ is at most $g(n)$ times larger than the query complexity of any algorithm $A'$ which solves the same problem while holding an unlabeled copy of the input (i.e., any $A'$ that "knows the structure of the input").

Our results point to a trichotomy of unlabeled instance optimality among substructure detection problems in graphs and functions:

1. A few very simple properties have an $O(1)$-instance optimal algorithm.
2. Most properties of graphs and functions, with examples such as containing a fixed point or a $3$-collision in functions, or a triangle in graphs, are $n^{\Omega(1)}$-far from instance optimality.
3. The problems of collision detection in functions and finding a claw in a graph serve as a middle ground between the two regimes. We show that these two properties are $\Omega(\log n)$-far from instance optimality, and conjecture that this bound is tight.

We provide evidence towards this conjecture, by proving that finding a claw in a graph is $O(\log(n))$-instance optimal among all input graphs for which the query complexity of an algorithm holding an unlabeled certificate is $O\left(\sqrt{\frac{n}{\log n}}\right)$.
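To make the query model concrete, the following is a minimal Python sketch of black-box access to a function and a naive collision search that counts its queries. It is not the paper's algorithm (in particular, it is not the multi-scale walking strategy); it is only a sampling baseline for illustration. The names `QueryOracle` and `find_collision_by_sampling` are hypothetical, and the domain is taken to be $\{0, \dots, n-1\}$ rather than $[n]$ for convenience.

```python
import random


class QueryOracle:
    """Hypothetical black-box access to f: {0..n-1} -> {0..n-1}, counting queries."""

    def __init__(self, table):
        self._table = list(table)  # hidden from the algorithm; read only via query()
        self.queries = 0

    @property
    def n(self):
        return len(self._table)

    def query(self, x):
        self.queries += 1
        return self._table[x]


def find_collision_by_sampling(oracle, rng, max_queries=None):
    """Query points in a random order without repetition.

    Returns a pair (x, y) with x != y and f(x) == f(y), or None if no
    collision was seen within the query budget.
    """
    first_preimage = {}  # value -> first queried point mapping to that value
    order = list(range(oracle.n))
    rng.shuffle(order)
    budget = oracle.n if max_queries is None else max_queries
    for x in order[:budget]:
        value = oracle.query(x)
        if value in first_preimage:
            return first_preimage[value], x
        first_preimage[value] = x
    return None


if __name__ == "__main__":
    oracle = QueryOracle([1, 1, 2, 3])  # f(0) = f(1) = 1 is the only collision
    print(find_collision_by_sampling(oracle, random.Random(0)), oracle.queries)
```

This baseline ignores any knowledge of the function's shape. The instance-optimality question in the abstract compares the query count of such a structure-oblivious algorithm with that of an algorithm that additionally holds an unlabeled copy of the input.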

Paper Structure (33 sections, 21 theorems, 32 equations, 2 figures)

Key Result

Theorem 1.1

Let $H$ be a connected, constant-sized oriented graph with maximum outdegree $1$, and consider the function property $\mathcal{P}_H$ of containing $H$ as a substructure. Then $\mathcal{P}_H$ is

Figures (2)

  • Figure 1: There are $n^{1/4}/\log n$ cycles, where each cycle is of length $n^{3/4}$. Each path entering a cycle is of size $n^{1/4}$. The distance between every two paths on the $i$-th cycle is $p_i$.
  • Figure 2: Construction for \ref{lemma:paths}

Theorems & Definitions (69)

  • Definition 1: instance optimality; informal
  • Definition 2: distance from instance optimality; informal
  • Theorem 1.1: Instance optimality of substructure detection in functions
  • Theorem 1.2: Instance optimality of subgraph detection in graphs
  • Conjecture 1
  • Conjecture 2
  • Theorem 1.3: informal; see Theorem \ref{thm:near_instance_optimality}
  • Definition 3: Unlabeled Certificate Complexity
  • Definition 4: instance optimality
  • Definition 5: $\omega$-far from instance optimality
  • ...and 59 more