Spectrum Estimation through Kirchhoff Random Forests

Simon Barthelmé, Fabienne Castell, Alexandre Gaudillière, Clothilde Mélot, Matteo Quattropani, Nicolas Tremblay

TL;DR

This work introduces a Monte Carlo framework for estimating the spectral distribution of large graph Laplacians using Kirchhoff random forests. By coupling replicas of Kirchhoff forests and analyzing observables tied to the Stieltjes transform, it derives unbiased moment estimators whose sampling cost is linear in the number of nodes n, up to a logarithmic factor, and reconstructs the spectral CDF via a maximal-entropy approach, complemented by a moment-problem formulation. The authors provide proofs, numerical experiments across diverse graphs, and discussions of limitations and potential enhancements. The approach extends to general symmetric real matrices via a double-cover construction, offering a near-linear route to spectrum-related quantities in large-scale linear-algebra problems.

Abstract

Given a non-oriented edge-weighted graph, we show how to estimate the associated Laplacian eigenvalues through Monte Carlo evaluation of spectral quantities computed along Kirchhoff random rooted spanning forest trajectories. The sampling cost of this estimation is only linear in the number of nodes, up to a logarithmic factor. By associating a double cover of such a graph with any symmetric real matrix, we can then perform spectral estimation in the same way for the latter.
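To make "Monte Carlo evaluation of spectral quantities along forests" concrete, here is a minimal sketch. It is not the paper's coupled estimator. It assumes the classical property of Kirchhoff forests that, for a forest sampled with weight proportional to $q^{\#\text{roots}}$ times the product of its edge weights, the expected number of roots equals $\sum_i q/(q+\lambda_i)$, where the $\lambda_i$ are the Laplacian eigenvalues. The forest is sampled with Wilson's algorithm run on a random walk killed at rate $q$. The names `sample_forest`, `mean_root_count` and `triangle` are hypothetical and chosen for illustration only.

```python
import random


def sample_forest(adj, q, rng):
    """Sample a random rooted spanning forest with Wilson's algorithm.

    adj maps each node to a list of (neighbor, weight) pairs (symmetric graph).
    q > 0 is the killing rate. Returns a dict node -> parent, None for roots.
    """
    parent = {}
    in_forest = set()
    for start in adj:
        # Random walk from `start` until it is killed or hits the forest.
        # Overwriting parent[x] on each visit erases loops implicitly.
        x = start
        while x not in in_forest:
            neighbors = [y for y, _ in adj[x]]
            weights = [w for _, w in adj[x]]
            degree = sum(weights)
            if rng.random() < q / (q + degree):
                parent[x] = None      # walk killed here: x becomes a root
                in_forest.add(x)
            else:
                y = rng.choices(neighbors, weights=weights)[0]
                parent[x] = y
                x = y
        # Retrace the loop-erased path and freeze it into the forest.
        x = start
        while x not in in_forest:
            in_forest.add(x)
            x = parent[x]
    return parent


def mean_root_count(adj, q, n_samples, rng):
    """Monte Carlo estimate of E[#roots], i.e. of sum_i q / (q + lambda_i)."""
    total = 0
    for _ in range(n_samples):
        forest = sample_forest(adj, q, rng)
        total += sum(1 for p in forest.values() if p is None)
    return total / n_samples


# Unit-weight triangle: Laplacian eigenvalues are 0, 3, 3,
# so at q = 1 the exact value is 1 + 2 * (1 / 4) = 1.5.
triangle = {
    0: [(1, 1.0), (2, 1.0)],
    1: [(0, 1.0), (2, 1.0)],
    2: [(0, 1.0), (1, 1.0)],
}

if __name__ == "__main__":
    print(mean_root_count(triangle, 1.0, 20000, random.Random(0)))
```

Evaluating this root count at several values of $q$ gives samples of a Stieltjes-type transform of the spectral measure. The moment estimators and replica coupling described in the paper go further than this sketch.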

Paper Structure

This paper contains 45 sections, 5 theorems, 170 equations, 16 figures, 2 algorithms.

Key Result

Theorem 1

Given a preprocessing step of cost at most $O(m)$, we can obtain an unbiased estimator of $m_k(q)$ for $k \leq l$ and any $r$ values of $q \in [q_{\min},q_{\max}]$ at cost $O\left( n l \left( \frac{\beta}{q_{\min}} + \log \frac{\alpha}{q_{\min}} + r \right) \right)$ where $\beta$ …

Figures (16)

  • Figure 1: Cycle erasure in the stacks viewed from the top, in a triangle. The squares in the stacks correspond to stops. $S$ is the number of arrows or stops that have to be read from the stacks to obtain a rooted forest.
  • Figure 2: Constructing the forest by reading the stacks according to Wilson's order. The light blue nodes are active, while the dark blue ones are frozen. The current node is marked by a black circle. $S$ is now the number of steps, i.e. reading a stop or an arrow, before the rooted forest is entirely constructed.
  • Figure 3: Constructing the forest by reading the stacks in random order. Unlike in Wilson's order, the current node is chosen randomly among the active roots.
  • Figure 4: Computing the extra reads in the coupled forest algorithm following Wilson's order. The dark blue trees are frozen. The light blue ones are active. The root of the current tree is marked by a black circle, and the next arrow or stop in the corresponding stack is indicated in green, a cross corresponding to a stop. New sampled arrows are in brown, as are the arrows that we have to read again in order to decide between a cycle or a grafting to an active tree. Once in brown, the decision between "cycle or grafting" does not need more extra reads: if the current root points to a brown arrow, there is a cycle; if it points to a light blue one, the current tree coalesces. The number of reread arrows $R$ is the number of times one light blue arrow becomes brown ($R= 5$ in our example).
  • Figure 5: Cumulative distribution function (dashed line) in natural (left) and log-log (right) scales of the spectral measure for the bunny graph with 2053 nodes and mean degree 52.33 together with lower (upward triangles) and upper (downward triangles) bounds computed from Monte Carlo estimation of $m_1$, …, $m_4$ after sampling 400 replicated forest trajectories up to time $1 / q_0$ with $q_0 = \bar{\lambda} / 100$ and with 1 (yellow), 2 (green), 3 (cyan) or 4 (blue) valid moment estimates.
  • ...and 11 more figures

Theorems & Definitions (5)

  • Theorem 1: Informal
  • Theorem 2
  • Theorem 3
  • Lemma C.1
  • Theorem