On the instance optimality of detecting collisions and subgraphs
Omri Ben-Eliezer, Tomer Grossman, Moni Naor
TL;DR
This work analyzes whether prior unlabeled knowledge about input structure helps in detecting small substructures in graphs and functions, introducing the notion of $g(n)$-instance optimality. It establishes a trichotomy: some properties are truly instance optimal, many are polynomially far, and collisions and claws sit between, with $\Omega(\log n)$-far in general and a near-IO algorithm in the merging-free regime. The key technical tool is multi scale walking, culminating in the $A_{all\text{scales}}$ strategy that achieves logarithmic factors over the certificate-aware baseline in regimes where merges are unlikely. The results illuminate how structural knowledge interacts with query complexity, with implications for subgraph detection, collision search, and related cryptographic and graph theoretic problems. The paper also discusses model robustness across various data access models and outlines open directions for tightening the bounds on the collision and claw problems.
Abstract
Suppose you are given a function $f\colon [n] \to [n]$ via (black-box) query access to the function. You are looking to find something local, like a collision (a pair $x \neq y$ s.t. $f(x)=f(y)$). The question is whether knowing the "shape" of the function helps you or not (by shape we mean that some permutation of the function is known). Formally, we investigate the unlabeled instance optimality of substructure detection problems in graphs and functions. A problem is $g(n)$-instance optimal if it admits an algorithm $A$ satisfying that for any possible input, the (randomized) query complexity of $A$ is at most $g(n)$ times larger than the query complexity of any algorithm $A'$ which solves the same problem while holding an unlabeled copy of the input (i.e., any $A'$ that "knows the structure of the input"). Our results point to a trichotomy of unlabeled instance optimality among substructure detection problems in graphs and functions: 1. A few very simple properties have an $O(1)$-instance optimal algorithm. 2. Most properties of graphs and functions, with examples such as containing a fixed point or a $3$-collision in functions, or a triangle in graphs, are $n^{Ω(1)}$-far from instance optimality. 3. The problems of collision detection in functions and finding a claw in a graph serve as a middle ground between the two regimes. We show that these two properties are $Ω(\log n)$-far from instance optimality, and conjecture that this bound is tight. We provide evidence towards this conjecture, by proving that finding a claw in a graph is $O(\log(n))$-instance optimal among all input graphs for which the query complexity of an algorithm holding an unlabeled certificate is $O\left(\sqrt{\frac{n}{\log n}}\right)$.
