Robust Algorithms for Finding Cliques in Random Intersection Graphs via Sum-of-Squares
Andreas Göbel, Janosch Ruff, Leon Schiller
TL;DR
This work studies dense random intersection graphs in which many overlapping cliques are planted and the edges outside the planted cliques are noisy. The authors develop a proofs-to-algorithms framework based on the sum-of-squares (SoS) hierarchy to achieve exact and approximate recovery of the ground-truth cliques, and they prove robust identifiability via a single-label clique theorem. Exact recovery is possible in polynomial time against monotone adversaries when the planted clique size satisfies $k \gg \sqrt{n \log n}$, and near-optimal approximate recovery is possible under up to $\varepsilon k^2$ edge corruptions. They also derive constant-degree SoS certificates that refute large cliques and certify the absence of extraneous large cliques. The results reveal computational-statistical and detection-recovery gaps in certain dense regimes and establish robust, certifiable guarantees for both recovery and refutation, which positions SoS as a tool for overlapping community detection in high-dimensional latent-structure models. The techniques combine balancedness certificates, neighborhood-reduction arguments, and pseudo-concentration to handle adversaries, offering a path toward certifiable recovery in graphs with complex overlapping structure.
Abstract
We study efficient algorithms for recovering cliques in dense random intersection graphs (RIGs). In this model, $d = n^{\Omega(1)}$ cliques of size approximately $k$ are randomly planted by choosing the vertices to participate in each clique independently with probability $\delta$. While there has been extensive work on recovering one or multiple disjointly planted cliques in random graphs, the natural extension of this question to recovering overlapping cliques has been, surprisingly, largely unexplored. Moreover, because every vertex can be part of polynomially many cliques, this task is significantly harder than in the case of disjointly planted cliques (as recently studied by Kothari, Vempala, Wein and Xu [COLT'23]), a difficulty that manifests in the failure of simple combinatorial and even spectral algorithms. In this work we obtain the first efficient algorithms for recovering the community structure of RIGs from the perspective of both exact and approximate recovery. Our algorithms are further robust to noise, monotone adversaries, and a certain optimal number of edge corruptions, and they work whenever $k \gg \sqrt{n \log(n)}$. Our techniques follow the proofs-to-algorithms framework utilizing the sum-of-squares hierarchy.
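To make the planting step in the abstract concrete, the following is a minimal sketch of sampling a noiseless RIG. It is an illustration only and not the authors' code: the function name `sample_rig`, its parameters, and the edge representation are assumptions chosen for this example. It also assumes that two vertices are adjacent exactly when they share at least one planted clique, and it omits the noise and adversarial edge changes that the paper additionally considers. Under this sampling rule the expected size of each clique is $\delta n$, which plays the role of $k$.

```python
import random
from itertools import combinations


def sample_rig(n, d, delta, seed=None):
    """Sample a noiseless random intersection graph (illustrative sketch).

    n     -- number of vertices, labelled 0..n-1
    d     -- number of planted cliques
    delta -- probability that a given vertex joins a given clique
    Returns (cliques, edges): a list of vertex sets and a set of
    edges stored as pairs (u, v) with u < v.
    """
    rng = random.Random(seed)

    # Each vertex joins each clique independently with probability delta.
    cliques = [
        {v for v in range(n) if rng.random() < delta}
        for _ in range(d)
    ]

    # Assumption: u and v are adjacent iff they share at least one clique.
    edges = set()
    for members in cliques:
        for u, v in combinations(sorted(members), 2):
            edges.add((u, v))

    return cliques, edges


if __name__ == "__main__":
    # Hypothetical parameters; here delta * n = 40 is the expected clique size.
    cliques, edges = sample_rig(n=200, d=10, delta=0.2, seed=0)
    print("clique sizes:", [len(c) for c in cliques])
    print("number of edges:", len(edges))
```

Because a vertex may fall into several of the $d$ cliques, the planted sets generally overlap, which is the feature that separates this setting from disjointly planted cliques. A recovery algorithm would receive only `edges` (possibly corrupted) and would aim to output `cliques`.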
