Transfer operators on graphs: Spectral clustering and beyond

Stefan Klus, Maia Trower

TL;DR

The paper shows that spectral clustering of undirected graphs can be interpreted in terms of eigenfunctions of the Koopman operator and proposes novel clustering algorithms for directed graphs based on generalized transfer operators.

Abstract

Graphs and networks play an important role in modeling and analyzing complex interconnected systems such as transportation networks, integrated circuits, power grids, citation graphs, and biological and artificial neural networks. Graph clustering algorithms can be used to detect groups of strongly connected vertices and to derive coarse-grained models. We define transfer operators such as the Koopman operator and the Perron-Frobenius operator on graphs, study their spectral properties, introduce Galerkin projections of these operators, and illustrate how reduced representations can be estimated from data. In particular, we show that spectral clustering of undirected graphs can be interpreted in terms of eigenfunctions of the Koopman operator and propose novel clustering algorithms for directed graphs based on generalized transfer operators. We demonstrate the efficacy of the resulting algorithms on several benchmark problems and provide different interpretations of clusters.
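
The interpretation of spectral clustering stated in the abstract can be illustrated with a minimal sketch. It assumes that, for an undirected graph with symmetric weight matrix $A$ and degree matrix $D$, the Koopman operator of the random walk is represented by the row-stochastic matrix $P = D^{-1} A$. The paper's own definition is not reproduced in this summary, so this representation, the toy graph, and all function names are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def koopman_matrix(A):
    """Row-stochastic random-walk matrix P = D^{-1} A (assumed Koopman matrix).

    Acting on a function f on the vertices: (P f)(v_i) = sum_j p_ij f(v_j).
    """
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    if np.any(d <= 0):
        raise ValueError("every vertex needs positive degree")
    return A / d[:, None]


def dominant_eigenfunctions(A, k):
    """Return the k largest eigenvalues of P and matching eigenvectors.

    Only valid for symmetric A: P is then similar to the symmetric matrix
    S = D^{-1/2} A D^{-1/2}, so a symmetric eigensolver can be used.
    """
    A = np.asarray(A, dtype=float)
    s = 1.0 / np.sqrt(A.sum(axis=1))
    S = s[:, None] * A * s[None, :]
    w, U = np.linalg.eigh(S)
    idx = np.argsort(w)[::-1][:k]
    # If S u = lambda u, then P (D^{-1/2} u) = lambda (D^{-1/2} u).
    return w[idx], s[:, None] * U[:, idx]


def two_block_graph(n=4, weak=0.01):
    """Hypothetical toy graph: two cliques joined by one weak edge."""
    A = np.zeros((2 * n, 2 * n))
    A[:n, :n] = 1.0
    A[n:, n:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[n - 1, n] = A[n, n - 1] = weak
    return A


A = two_block_graph()
P = koopman_matrix(A)
vals, vecs = dominant_eigenfunctions(A, 2)

# The second eigenfunction is nearly constant on each clique with opposite
# signs, so thresholding at zero separates the two clusters.
labels = (vecs[:, 1] > 0).astype(int)
print(vals, labels)
```

For more than two clusters, the sign threshold would typically be replaced by $k$-means applied to the leading eigenvectors, as in standard spectral clustering.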

Paper Structure (22 sections, 7 theorems, 45 equations, 8 figures, 1 table)

This paper contains 22 sections, 7 theorems, 45 equations, 8 figures, 1 table.

Key Result

Lemma 2.12

It holds that the entry $f_{ij}$ of the matrix $F$ is nonzero if and only if there exists an index $k$ such that $(\mathpzc{v}_{i}, \mathpzc{v}_{k}) \in \mathpzc{E}$ and $(\mathpzc{v}_{j}, \mathpzc{v}_{k}) \in$ ...

Figures (8)

  • Figure 1: (a) Two different sets of random walkers starting in $\{ \mathpzc{v}_{1}, \dots, \mathpzc{v}_{4} \}$ and $\{ \mathpzc{v}_{7}, \dots, \mathpzc{v}_{10} \}$, marked in green and red, respectively. The weight of the solid edges is 1 and the weight of the dashed edges is 0.01. (b) Positions after one step forward. Only one green random walker leaves the set. (c) Positions after one step forward and one step backward. The green set is coherent (only two random walkers escaped), whereas the red set is clearly less invariant under the forward--backward dynamics.
  • Figure 2: (a) Forward--backward edge between $\mathpzc{v}_{i}$ and $\mathpzc{v}_{j}$. (b) Backward--forward edge between $\mathpzc{v}_{i}$ and $\mathpzc{v}_{j}$.
  • Figure 3: (a) Directed graph with three clusters. Self-loops are omitted for the sake of clarity. (b) Graph structure of the matrix $F$ for uniform $\mu$. (c) Graph structure of the matrix $B$ for uniform $\nu$. The dashed edges again have a low weight. Clustering the dominant eigenvectors of the matrices $F$ or $B$ results in three coherent sets, see also Example \ref{ex:Coherent set illustration}.
  • Figure 4: (a) Adjacency matrix of a directed graph comprising four clusters. The blocks are colored according to the cluster numbers. (b) Visualization of the clustered graph using the eigenfunctions $\varphi_2$ and $\varphi_3$ as coordinates. (c) Four dominant eigenfunctions $\varphi_\ell$ (top) and $\psi_\ell$ (bottom), where separate curves show the first, the second, the third, and the fourth eigenfunction. The functions are roughly constant within the clusters. (d) Clusters extracted from the functions $\varphi_\ell$ (top) and $\psi_\ell$ (bottom) using $k$-means. The blue indicator function picks row 3 and column 3 of the block matrix, the red function row 1 and column 2, the yellow function row 2 and column 4, and the purple function row 4 and column 1. That is, each function pair is associated with the block of the adjacency matrix marked in the corresponding color. What these blocks have in common is that they contain many nonzero entries (i.e., incoming or outgoing edges). Note that since we compute two functions, i.e., $\varphi_\ell$ and $\psi_\ell$, it is now possible to detect dense off-diagonal blocks.
  • Figure 5: (a) Mean of the eigenvalues estimated from random walk data (averaged over 1000 runs) as a function of $m$, where the colors distinguish the first (not shown since it is always 1), the second, the third, and the fourth eigenvalue. The shaded area in the corresponding color represents the standard deviation, the dotted line the infinite-data limit, and the dashed line the eigenvalues of the Galerkin projection of the operator. Recall that the data-driven approximation is a composition of two Galerkin approximations and thus slightly underestimates the correct eigenvalues. The eigenvalues of the Galerkin approximation are also marginally smaller than the eigenvalues of the full operator computed in Example \ref{ex:DSBM}. (b) Dominant eigenfunctions $\varphi_\ell$ (top) and $\psi_\ell$ (bottom) of the reduced matrix (solid lines) and the full matrix (dotted lines). The chosen indicator function basis approximates the correct eigenfunctions well since they are almost constant within the clusters.
  • ...and 3 more figures
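
The forward--backward and backward--forward edges mentioned in the caption of Figure 2 can be illustrated with a small sketch. It rests on an assumed reading of the captions: two vertices are joined by a forward--backward edge when one step forward from the first and one step backward from there can reach the second, i.e., when they share a common successor, and by a backward--forward edge when they share a common predecessor. The matrices $F$ and $B$ are not defined in this summary, so the code only computes the sparsity patterns of $A A^\top$ and $A^\top A$ for a nonnegative weight matrix $A$. The example graph and the function names are hypothetical.

```python
import numpy as np


def forward_backward_support(A):
    """Boolean pattern of A A^T for a nonnegative weight matrix A.

    (A A^T)_ij = sum_k a_ik a_jk is positive exactly when some vertex k
    is a successor of both i and j.
    """
    A = np.asarray(A, dtype=float)
    return (A @ A.T) > 0


def backward_forward_support(A):
    """Boolean pattern of A^T A for a nonnegative weight matrix A.

    (A^T A)_ij = sum_k a_ki a_kj is positive exactly when some vertex k
    is a predecessor of both i and j.
    """
    A = np.asarray(A, dtype=float)
    return (A.T @ A) > 0


# Hypothetical directed graph (0-based indices):
# 0 -> 2, 1 -> 2, 2 -> 3, 3 -> 3
A = np.array(
    [
        [0, 0, 1, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1],
    ],
    dtype=float,
)

FB = forward_backward_support(A)
BF = backward_forward_support(A)
print(FB.astype(int))
print(BF.astype(int))
```

In this example, vertices 0 and 1 are linked in the forward--backward pattern because both point to vertex 2, although there is no direct edge between them in either direction.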

Theorems & Definitions (35)

  • Definition 2.1: Directed graph
  • Definition 2.2: Directed stochastic block model
  • Definition 2.3: Coherent pair
  • Example 2.4
  • Definition 2.5: Perron--Frobenius & Koopman operators
  • Definition 2.6: Probability density
  • Definition 2.7: Reversibility
  • Definition 2.8: Reweighted Perron--Frobenius operator
  • Definition 2.9: Forward--backward & backward--forward operators
  • Definition 2.10: Covariance operators
  • ...and 25 more