Finding planted cliques using gradient descent

Reza Gheissari; Aukosh Jagannath; Yiming Xu

TL;DR

This work shows that a black-box optimization approach based on a Lagrange-multiplier Hamiltonian $H_{G,\gamma}(U)$ can recover planted cliques at the $\sqrt{n}$ scale, the threshold at which polynomial-time algorithms are known to succeed, when it optimizes over all subgraphs and is initialized from the full vertex set. The authors analyze both gradient descent and a low-temperature Gibbs sampler, proving that the planted clique is the global minimizer for $\gamma>1$ and that the full-graph initialization enables efficient recovery in $O(n)$ steps, while the natural empty initialization fails due to entropic barriers. They characterize a rugged energy landscape with many small local minima and demonstrate sharp initialization-dependent behavior, complemented by a robustness extension to semi-random contaminations. The results bridge a gap between problem-specific algorithms and black-box optimization, offering insight into initialization sensitivity, landscape complexity, and applicability to robust variants in planted-clique-like settings.

Abstract

The planted clique problem is a paradigmatic model of statistical-to-computational gaps: the planted clique is information-theoretically detectable if its size $k\ge 2\log_2 n$ but polynomial-time algorithms only exist for the recovery task when $k = \Omega(\sqrt{n})$. By now, there are many algorithms that succeed as soon as $k = \Omega(\sqrt{n})$. Glaringly, however, no black-box optimization method, e.g., gradient descent or the Metropolis process, has been shown to work. In fact, Chen, Mossel, and Zadik recently showed that any Metropolis process whose state space is the set of cliques fails to find any sub-linear sized planted clique in polynomial time if initialized naturally from the empty set. We show that using the method of Lagrange multipliers, namely optimizing the Hamiltonian given by the sum of the objective function and the clique constraint over the space of all subgraphs, succeeds. In particular, we prove that Markov chains which minimize this Hamiltonian (gradient descent and a low-temperature relaxation of it) succeed at recovering planted cliques of size $k = \Omega(\sqrt{n})$ if initialized from the full graph. Importantly, initialized from the empty set, the relaxation still does not help the gradient descent find sub-linear planted cliques. We also demonstrate robustness of these Markov chain approaches under a natural contamination model.
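
To make the Lagrange-multiplier idea concrete, here is a minimal Python sketch of descent over all vertex subsets. It rests on assumptions that this summary does not confirm: the Hamiltonian is taken to be $H(U) = -|U| + \gamma \cdot \#\{\text{non-adjacent pairs inside } U\}$, and "gradient descent" is taken to mean steepest descent over single-vertex flips. The names `hamiltonian` and `steepest_descent` are illustrative and do not come from the paper.

```python
import numpy as np

def hamiltonian(adj, members, gamma):
    """Assumed form: H(U) = -|U| + gamma * (# non-adjacent pairs inside U)."""
    idx = np.flatnonzero(members)
    size = len(idx)
    edges = int(adj[np.ix_(idx, idx)].sum()) // 2
    missing = size * (size - 1) // 2 - edges
    return -size + gamma * missing

def steepest_descent(adj, members, gamma, max_steps=None):
    """Flip the single vertex that lowers H the most until no flip helps.

    adj     : symmetric 0/1 integer matrix with zero diagonal
    members : boolean mask of the starting subset U
    (Assumed update rule; the paper's exact dynamics may differ.)
    """
    members = members.copy()
    n = len(members)
    max_steps = 10 * n if max_steps is None else max_steps  # safety cap
    for _ in range(max_steps):
        size = int(members.sum())
        deg_in = adj[:, members].sum(axis=1)  # neighbours inside U
        # Non-neighbours inside U (excluding the vertex itself).
        missing = np.where(members, size - 1 - deg_in, size - deg_in)
        # Energy change of flipping each vertex: removal vs. addition.
        delta = np.where(members, 1.0 - gamma * missing, -1.0 + gamma * missing)
        v = int(np.argmin(delta))
        if delta[v] >= 0:  # local minimum reached
            break
        members[v] = not members[v]
    return members
```

Under this assumed form with $\gamma > 1$, adding a vertex lowers the energy only if it is adjacent to every current member, and removing one lowers it only if the vertex misses at least one edge inside the set. Calling `steepest_descent(adj, np.ones(n, dtype=bool), gamma)` corresponds to the full-graph initialization discussed in the abstract.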

Paper Structure (14 sections, 19 theorems, 81 equations, 2 figures)

Introduction
Main results
The importance of the initialization
Robustness to adversary
The energy landscape
Planted clique is the global energy minimizer
Recovery from the full-graph initialization
Failure from the empty-set initialization
Deferred proofs
Degree concentration
Deferred proofs for success from full initialization
Deferred proofs for failure from empty initialization
Proof of landscape complexity
Extension to the robust case

Key Result

Theorem 1.1

Suppose $\gamma>3$. For every ${\varepsilon}>0$, there exists $C({\varepsilon},\gamma)>0$ such that for all $k\geq C\sqrt{n}$, with probability at least $1-{\varepsilon}$, the gradient descent $S_t$ initialized from $S_0 = [n]$ achieves $S_t = {\mathcal{PC}}$ for all $t \ge n+2k$. The same holds for the low-temperature chain $S_t^\beta$ for all $n+2k \le t \le n^{k/C}$ if $\beta \ge C \log n$.
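
The low-temperature chain $S_t^\beta$ in the theorem can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the paper's definition: it uses heat-bath (Glauber) dynamics on uniformly chosen vertices, and it assumes the Hamiltonian $H(U) = -|U| + \gamma \cdot \#\{\text{non-adjacent pairs inside } U\}$. The names `flip_cost` and `low_temperature_chain` are hypothetical.

```python
import math
import numpy as np

def flip_cost(adj, members, v, gamma):
    """Change in the assumed Hamiltonian when vertex v is flipped."""
    size = int(members.sum())
    deg_in = int(adj[v, members].sum())  # neighbours of v inside U
    if members[v]:
        return 1.0 - gamma * (size - 1 - deg_in)  # cost of removing v
    return -1.0 + gamma * (size - deg_in)         # cost of adding v

def low_temperature_chain(adj, members, gamma, beta, steps, rng):
    """Heat-bath dynamics at inverse temperature beta (assumed update rule)."""
    members = members.copy()
    for _ in range(steps):
        v = int(rng.integers(len(members)))
        delta = flip_cost(adj, members, v, gamma)
        # 1 / (1 + exp(beta * delta)), written with tanh to avoid overflow.
        accept = 0.5 * (1.0 - math.tanh(0.5 * beta * delta))
        if rng.random() < accept:
            members[v] = not members[v]
    return members
```

As `beta` grows, energy-raising flips become exponentially unlikely, so the chain behaves like a randomized descent. This matches the role of the condition $\beta \ge C \log n$ in the theorem.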

Figures (2)

Figure 1: Simulated trajectories of the relative ratio of ${\mathcal{PC}}$ and non-${\mathcal{PC}}$ vertices in $S_t$ while applying the gradient descent with full-graph initialization (left) and empty-set initialization (right) to find the planted clique in ${\mathsf{G}}(5000, \frac{1}{2}, 70)$ under different values of $\gamma$. The $x$-axis is the overlap with ${\mathcal{PC}}$ and the $y$-axis is the overlap with ${\mathcal{PC}}^c$. Both trajectories in the left plot start from $(1, 1)$ and terminate in ${\mathcal{PC}}$, whereas both trajectories in the right plot start from $(0, 0)$ and stop at local minima that do not have any overlap with ${\mathcal{PC}}$.
Figure 2: Phase diagram in terms of $U\cap {\mathcal{PC}}$ and $U\cap {\mathcal{PC}}^c$, depicting the regions where $H(U)$ has complexity and admits local minima. The global minimum (circled red) is exactly ${\mathcal{PC}}$.
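
The two overlap coordinates plotted in Figure 1 can be computed with a short Python sketch. It samples an instance of ${\mathsf{G}}(n, p, k)$ in the standard way (an Erdős–Rényi graph with a clique planted on $k$ uniformly chosen vertices) and measures the fraction of ${\mathcal{PC}}$ and of ${\mathcal{PC}}^c$ contained in a subset. The function names are illustrative, and the normalization by $|{\mathcal{PC}}|$ and $|{\mathcal{PC}}^c|$ is an assumption inferred from the trajectories starting at $(1, 1)$.

```python
import numpy as np

def sample_planted_clique(n, p, k, rng):
    """Sample G(n, p) and plant a clique on k uniformly chosen vertices."""
    upper = np.triu(rng.random((n, n)) < p, k=1)  # independent edges, i < j
    adj = (upper | upper.T).astype(np.int8)       # symmetric, zero diagonal
    clique = rng.choice(n, size=k, replace=False)
    adj[np.ix_(clique, clique)] = 1               # connect all clique pairs
    adj[clique, clique] = 0                       # restore the zero diagonal
    in_clique = np.zeros(n, dtype=bool)
    in_clique[clique] = True
    return adj, in_clique

def overlaps(members, in_clique):
    """Return (fraction of PC inside U, fraction of PC^c inside U)."""
    x = (members & in_clique).sum() / in_clique.sum()
    y = (members & ~in_clique).sum() / (~in_clique).sum()
    return float(x), float(y)
```

With these definitions the full vertex set maps to $(1, 1)$ and the empty set to $(0, 0)$, the two starting points described in the Figure 1 caption.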

Theorems & Definitions (35)

Theorem 1.1
Remark 1.2
Theorem 1.3
Theorem 2.1: Energy landscape: global minimum
Theorem 2.2: Energy landscape: local minima
Lemma 2.3
Lemma 2.4
Proof of Theorem 2.1
Remark 2.5
Lemma 3.1
...and 25 more
