Complexity of Classical Acceleration for $\ell_1$-Regularized PageRank

Kimon Fountoulakis; David Martínez-Rubio

TL;DR

This work analyzes FISTA on a slightly over-regularized objective, shows that, under a checkable confinement condition, all spurious activations remain inside a boundary set $\mathcal{B}$, and provides graph-structural conditions that imply such confinement.

Abstract

We study the degree-weighted work required to compute $\ell_1$-regularized PageRank using the standard one-gradient-per-iteration accelerated proximal-gradient method (FISTA). For non-accelerated local methods, the best known worst-case work scales as $\widetilde{O}((\alpha\rho)^{-1})$, where $\alpha$ is the teleportation parameter and $\rho$ is the $\ell_1$-regularization parameter. A natural question is whether FISTA can improve the dependence on $\alpha$ from $1/\alpha$ to $1/\sqrt{\alpha}$ while preserving the $1/\rho$ locality scaling. The challenge is that acceleration can break locality by transiently activating nodes that are zero at optimality, thereby increasing the cost of gradient evaluations. We analyze FISTA on a slightly over-regularized objective and show that, under a checkable confinement condition, all spurious activations remain inside a boundary set $\mathcal{B}$. This yields a bound consisting of an accelerated $(\rho\sqrt{\alpha})^{-1}\log(\alpha/\varepsilon)$ term plus a boundary overhead $\sqrt{\operatorname{vol}(\mathcal{B})}/(\rho\alpha^{3/2})$. We provide graph-structural conditions that imply such confinement. Experiments on synthetic and real graphs show the resulting speedup and slowdown regimes under the degree-weighted work model.
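To make the setting in the abstract concrete, here is a minimal sketch of FISTA with a degree-weighted work counter. Several things are assumptions, since this summary does not state them: the objective is taken to be the usual variational form of $\ell_1$-regularized PageRank, the step size is 1, the momentum is the constant one commonly used for strongly convex problems, the graph is a small dense NumPy adjacency matrix, and "work" is approximated by the sum of degrees over the nonzero coordinates of the extrapolated point. The name `fista_l1_pagerank` and its parameters are hypothetical, and the sketch omits the over-regularization step that the paper analyzes.

```python
import numpy as np


def fista_l1_pagerank(A, seed, alpha=0.1, rho=1e-3, iters=200):
    """Sketch of FISTA on an ASSUMED l1-regularized PageRank objective.

    Assumed objective (not taken from this summary):
        0.5 * x^T Q x - alpha * x^T D^{-1/2} s + rho * alpha * ||D^{1/2} x||_1
    with Q = alpha * I + (1 - alpha) / 2 * (I - D^{-1/2} A D^{-1/2})
    and s the indicator vector of the seed node.
    """
    n = A.shape[0]
    d = A.sum(axis=1)                     # node degrees (must be positive)
    sqrt_d = np.sqrt(d)
    N = A / np.outer(sqrt_d, sqrt_d)      # D^{-1/2} A D^{-1/2}
    Q = alpha * np.eye(n) + 0.5 * (1 - alpha) * (np.eye(n) - N)

    b = np.zeros(n)
    b[seed] = alpha / sqrt_d[seed]        # linear term alpha * D^{-1/2} s

    # Eigenvalues of Q lie in [alpha, 1], so a unit step size is valid.
    thresh = rho * alpha * sqrt_d         # per-coordinate soft-threshold level
    # Assumed momentum: constant value for strong convexity alpha, smoothness 1.
    beta = (1 - np.sqrt(alpha)) / (1 + np.sqrt(alpha))

    x = np.zeros(n)
    y = np.zeros(n)
    work = 0.0
    for _ in range(iters):
        # Simplified work proxy: degrees of nodes active in the gradient point.
        work += d[y != 0].sum()
        grad = Q @ y - b
        z = y - grad
        x_new = np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)  # prox step
        y = x_new + beta * (x_new - x)    # extrapolation (momentum) step
        x = x_new
    return x, work


if __name__ == "__main__":
    # Toy usage on a 10-node ring graph with node 0 as the seed.
    n = 10
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    x, work = fista_l1_pagerank(A, seed=0)
    print("support size:", int(np.count_nonzero(x)), "work proxy:", work)
```

The extrapolated point `y` can be nonzero on nodes that are zero at the optimum, which is the locality issue the abstract describes: those transient activations inflate the work counter even when the final support is small.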

Paper Structure (33 sections, 11 theorems, 77 equations, 8 figures)

Introduction
Related work
Preliminaries and notation
The FISTA Algorithm
FISTA's work analysis in RPPR under $\ell_1$ over-regularization
Over-regularization
Complementarity slack and spurious activations
Work bound and sufficient conditions
Experiments
Synthetic boundary-volume sweep experiment
Sweeps in rho, alpha and epsilon at fixed boundary size
Real-data benchmarks on SNAP graphs
Conclusion, limitations and future work
Proofs
High-degree nodes do not activate
...and 18 more sections

Key Result

Lemma 4.1

Fix $y\in\mathbb{R}^n$. For every $i\in A(y)$, $|u(y)_i-u(x^\star)_i| > \eta \gamma_i\sqrt{d_i}$.

Figures (8)

Figure 1: Adjacency density. For each boundary size $|\mathcal{B}|$ we visualize the adjacency matrix via a binned edge-density heatmap (bin size $20$), where each pixel shows the fraction of possible edges between a pair of bins (log-scaled; colormap magma with white below $10^{-4}$). Dashed lines mark the core | boundary | exterior block boundaries. The plots show the clique (upper-left block), the boundary circulant band, the nearly dense exterior block, and the sparse cross-region interfaces.
Figure 2: Work vs. $\operatorname{vol}(\mathcal{B})$. Work by ISTA and FISTA against $\operatorname{vol}(\mathcal{B})$.
Figure 3: Sweeps at fixed $|\mathcal{B}|=600$. One panel shows the $\rho$-sweep with a dense core (clique) on a fresh randomized graph per $\rho$; a second shows the $\rho$-sweep with a sparse core (connected, $20\%$ of clique edges) on a fresh randomized graph per $\rho$. A third sweeps $\alpha$ at a fixed residual tolerance $\varepsilon=10^{-6}$ on a single instance constructed to satisfy $\xi>0$ and the no-percolation condition at the smallest swept value, with parameters selected by an inexpensive auto-tuning step (described in the paper's appendix). A fourth sweeps the tolerance $\varepsilon$ at fixed $\alpha=0.20$ on the baseline unweighted instance.
Figure 4: Real graphs: work vs. $\alpha$. Work to reach tolerance $10^{-8}$ as a function of $\alpha$, with $\rho=10^{-4}$ fixed. Curves show mean over $300$ random seeds; shaded bands are interquartile ranges.
Figure 5: Real graphs: work vs. KKT tolerance. Work to reach $\varepsilon$, with $\alpha=0.20$ and $\rho=10^{-4}$ fixed. Curves show mean over $300$ random seeds; shaded bands are interquartile ranges.
...and 3 more figures

Theorems & Definitions (14)

Definition 3.1
Lemma 4.1
Lemma 4.2
Theorem 4.3
Theorem 4.4
Remark 4.5
Remark 4.6
Lemma A.1: Initial gap
Corollary A.3: FISTA iterates
Lemma A.4: Monotonicity of the $\ell_1$-regularized PageRank path
...and 4 more
