Estimating a graph's spectrum via random Kirchhoff forests

Simon Barthelmé, Fabienne Castell, Alexandre Gaudillière, Clothilde Melot, Matteo Quattropani, Nicolas Tremblay

TL;DR

The paper tackles scalable estimation of a graph's spectrum without a full eigendecomposition. It uses Kirchhoff forests to estimate the non-linear spectral moments $h(q,k)=\mu((q/(q+\lambda))^k)$ over a grid of values of $q$ and for small $k$, then reconstructs the spectral density with a maximum-entropy approach applied to the transformed measure $\nu_q$, and aggregates the results into a spectral cdf. The key contributions are a forest-based estimator with complexity $\mathcal{O}((\alpha+n_\lambda) s l n)$, which can be sublinear in $|\mathcal{E}|$ when only modest precision is required, a reconstruction pipeline for the spectral cdf, and empirical results showing practical speedups over baselines in the moderate-accuracy regime. The approach enables approximate spectrum estimation on very large graphs where exact eigendecomposition is infeasible, with potential impact on graph signal processing and spectral analysis.
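
To make the moments $h(q,k)$ concrete, here is a minimal sketch that evaluates them exactly on a toy graph. It assumes that $\mu$ is the empirical spectral measure of the combinatorial Laplacian (uniform weight $1/n$ on each eigenvalue), and it uses a cycle graph only because its Laplacian eigenvalues are known in closed form. The function names are illustrative and do not come from the paper.

```python
import math

def cycle_laplacian_eigenvalues(n):
    """Closed-form Laplacian eigenvalues of the n-node cycle graph (toy example)."""
    return [2.0 - 2.0 * math.cos(2.0 * math.pi * j / n) for j in range(n)]

def spectral_moment(eigenvalues, q, k):
    """h(q, k): average of (q / (q + lambda))**k over the eigenvalues.

    Assumes mu puts mass 1/n on each eigenvalue (empirical spectral measure).
    """
    return sum((q / (q + lam)) ** k for lam in eigenvalues) / len(eigenvalues)

eigs = cycle_laplacian_eigenvalues(200)

# Exact moments on a small grid of q values and small k.
moments = {
    (q, k): spectral_moment(eigs, q, k)
    for q in (0.1, 1.0, 10.0)
    for k in (1, 2, 3)
}

for (q, k), value in sorted(moments.items()):
    print(f"h(q={q}, k={k}) = {value:.4f}")
```

Since $q/(q+\lambda)$ lies in $(0, 1]$ for $q > 0$ and $\lambda \ge 0$, each moment decreases with $k$ and increases with $q$. On large graphs these exact values are unavailable, which is the situation the forest-based estimator targets.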

Abstract

Exact eigendecomposition of large matrices is very expensive, and it is practically impossible to compute exact eigenvalues. Instead, one may set a more modest goal of approaching the empirical distribution of the eigenvalues, recovering the overall shape of the eigenspectrum. Current approaches to spectral estimation typically work with moments of the spectral distribution. These moments are first estimated using Monte Carlo trace estimators, then the estimates are combined to approximate the spectral density. In this article we show how Kirchhoff forests, which are random forests on graphs, can be used to estimate certain non-linear moments of very large graph Laplacians. We show how to combine these moments into an estimate of the spectral density. If the estimate's desired precision isn't too high, our approach paves the way to the estimation of a graph's spectrum in time sublinear in the number of links.
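
The Monte Carlo trace estimators used by current approaches can be illustrated with a Hutchinson-type estimator of the polynomial moments $\mathrm{tr}(\boldsymbol{L}^j)/n$. This is a generic sketch, not the specific baseline evaluated in the paper: the cycle-graph Laplacian, the Rademacher probe vectors, and all function names are assumptions chosen for illustration.

```python
import random

def cycle_laplacian_matvec(x):
    """Apply the cycle-graph Laplacian to x without forming the matrix."""
    n = len(x)
    return [2.0 * x[i] - x[i - 1] - x[(i + 1) % n] for i in range(n)]

def hutchinson_moment(matvec, n, power, num_probes, rng):
    """Estimate tr(L**power) / n with Rademacher probe vectors.

    Each probe z contributes z^T L^power z, whose expectation is tr(L^power).
    """
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        y = z
        for _ in range(power):
            y = matvec(y)  # only matrix-vector products are needed
        total += sum(zi * yi for zi, yi in zip(z, y))
    return total / (num_probes * n)

rng = random.Random(0)
n = 500
m1 = hutchinson_moment(cycle_laplacian_matvec, n, 1, 200, rng)
m2 = hutchinson_moment(cycle_laplacian_matvec, n, 2, 200, rng)
print(f"estimated first moment:  {m1:.3f} (exact value is 2)")
print(f"estimated second moment: {m2:.3f} (exact value is 6)")
```

Each probe costs one matrix-vector product per power of $\boldsymbol{L}$, so the cost of such estimators scales with the number of links.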

Paper Structure

This paper contains 5 sections, 2 theorems, 23 equations, 3 figures.

Key Result

Lemma 2.1

Let $\boldsymbol{X}^{(1)}, \dots, \boldsymbol{X}^{(k)}$ be i.i.d. random matrices with expectation $\mathbb{E}(\boldsymbol{X}) = \boldsymbol{M}$. Then

Figures (3)

  • Figure 1: (Left) Illustration of a rooted spanning forest on the $5 \times 5$ grid graph, containing 4 trees, the smallest of which is a trivial tree of size 1. Each tree contains a distinguished node called the root. Kirchhoff Forests (KFs) produce random rooted spanning forests with distribution given by eq. \ref{eq:kirchhoff-forest}. (Right) Illustration of a cdf estimation of a graph's spectrum, using our KF-based algorithm (for $l=1, 2, 3$ moments), on a Barabási-Albert graph, with $n=10^3$ and $\bar{d}=20$.
  • Figure 2: Estimation error versus computation time for 3 types of methods: forests (ours, in solid blue), poly (dotted orange), and slq (dashed green). The dotted horizontal blue line is the result of our reconstruction algorithm when it is fed the exact moments (rather than the KF-estimated ones). Each column corresponds to a different type of graph: ER stands for Erdős-Rényi, BA for Barabási-Albert, "sparse" means an average degree of 20, and "dense" means an average degree of $n/10$. Each row corresponds to a different number of nodes in the graph. For each graph and each value of $n$, the time axis is normalized by the computation time of the corresponding matrix-vector multiplication $\boldsymbol{L} \boldsymbol{x}$. Results are averaged over $50$ realizations of all three methods (which are all stochastic), and $10$ realizations of each graph (for the 4 random graphs of the list).
  • Figure 3: Computation time required to reach a $2\%$ error versus the number of nodes, for all three methods. Left: sparse BA graph. Right: dense BA graph. The time axis is normalized by the time of computation of the matrix-vector multiplication $\boldsymbol{L} \boldsymbol{x}$.
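
The rooted spanning forests of Figure 1 can be sampled with a variant of Wilson's algorithm in which the random walk is killed at rate $q$. The sketch below makes assumptions that this page does not confirm: the graph is unweighted, a forest has probability proportional to $q$ raised to its number of roots, and the expected number of roots equals $\sum_i q/(q+\lambda_i)$, so that the root fraction estimates $h(q,1)$. The cycle graph and all names are illustrative only.

```python
import math
import random

def sample_forest_roots(adj, q, rng):
    """Sample a rooted spanning forest (Wilson's algorithm with killing)
    and return the list of its roots. adj[u] lists the neighbours of u."""
    n = len(adj)
    in_forest = [False] * n
    parent = [None] * n
    roots = []
    for start in range(n):
        u = start
        # Random walk from `start` until it is killed or hits the forest.
        while not in_forest[u]:
            degree = len(adj[u])
            if rng.random() < q / (q + degree):
                in_forest[u] = True   # walk killed here: u becomes a root
                parent[u] = None
                roots.append(u)
            else:
                parent[u] = rng.choice(adj[u])  # overwriting erases loops
                u = parent[u]
        # Add the loop-erased path from `start` to the forest.
        u = start
        while not in_forest[u]:
            in_forest[u] = True
            u = parent[u]
    return roots

n, q, num_samples = 60, 1.0, 400
adj = [[(i - 1) % n, (i + 1) % n] for i in range(n)]  # cycle graph
rng = random.Random(1)

root_counts = [len(sample_forest_roots(adj, q, rng)) for _ in range(num_samples)]
estimate = sum(root_counts) / (num_samples * n)

# Exact h(q, 1) from the closed-form cycle-graph eigenvalues, for comparison.
eigs = [2.0 - 2.0 * math.cos(2.0 * math.pi * j / n) for j in range(n)]
exact = sum(q / (q + lam) for lam in eigs) / n

print(f"root fraction (estimate of h(q,1)): {estimate:.3f}")
print(f"exact h(q,1):                       {exact:.3f}")
```

Under the stated assumptions, larger values of $q$ kill the walks sooner, so sampling a forest gets cheaper as $q$ grows. This sketch covers only the $k=1$ case and leaves out the estimator for higher $k$.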

Theorems & Definitions (2)

  • Lemma 2.1
  • Theorem 2.2