Expander Hierarchies for Normalized Cuts on Graphs

Kathrin Hanauer, Monika Henzinger, Robin Münk, Harald Räcke, Maximilian Vötsch

TL;DR

This work addresses the practical computation of expander decompositions and expander hierarchies for normalized-cut graph clustering. It introduces a random-walk-based expander decomposition and the XCut pipeline, which builds a tree flow sparsifier (the expander hierarchy) and solves the normalized-cut problem, refining the solution as it descends the hierarchy. The approach yields better solution quality than state-of-the-art solvers on diverse graph families (e.g., citation, e-mail, and social networks, and web graphs) while remaining competitive in running time, and a sparsifier computed once can be reused for multiple values of k. The software is open source, and the technique may extend to other cut problems.

Abstract

Expander decompositions of graphs have significantly advanced the understanding of many classical graph problems and led to numerous fundamental theoretical results. However, their adoption in practice has been hindered due to their inherent intricacies and large hidden factors in their asymptotic running times. Here, we introduce the first practically efficient algorithm for computing expander decompositions and their hierarchies and demonstrate its effectiveness and utility by incorporating it as the core component in a novel solver for the normalized cut graph clustering objective. Our extensive experiments on a variety of large graphs show that our expander-based algorithm outperforms state-of-the-art solvers for normalized cut with respect to solution quality by a large margin on a variety of graph classes such as citation, e-mail, and social networks or web graphs while remaining competitive in running time.
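
The abstract refers to the normalized cut objective without restating it. As a minimal sketch, assuming the standard definition (the sum over clusters of the weight of edges leaving a cluster divided by the cluster's volume), the objective can be evaluated as follows. The function name `normalized_cut`, the edge-list format, and the toy graph are illustrative assumptions and not part of the authors' software.

```python
from collections import defaultdict


def normalized_cut(edges, labels):
    """Evaluate sum_i cut(C_i, V \\ C_i) / vol(C_i) for a given clustering.

    edges  -- iterable of (u, v, weight) for an undirected graph (assumed format)
    labels -- dict mapping each vertex to its cluster id (assumed format)
    """
    cut = defaultdict(float)  # weight of edges leaving each cluster
    vol = defaultdict(float)  # sum of weighted degrees in each cluster
    for u, v, w in edges:
        cu, cv = labels[u], labels[v]
        vol[cu] += w
        vol[cv] += w
        if cu != cv:
            cut[cu] += w
            cut[cv] += w
    # Clusters with zero volume are skipped to avoid division by zero.
    return sum(cut[c] / vol[c] for c in vol if vol[c] > 0)


# Hypothetical toy instance: two triangles joined by a single edge.
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
labels = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
theta = normalized_cut(edges, labels)
print(theta)
```

In this toy instance each cluster has volume 7 and one outgoing edge, so the value is 2/7. Lower values indicate better clusterings.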

Paper Structure

This paper contains 25 sections, 11 theorems, 7 equations, 11 figures, 5 tables, 1 algorithm.

Key Result

Theorem 1

Given a graph $G$ with $m$ edges and a parameter $\phi$, there is a random-walk-based algorithm that with high probability finds a $\phi$-expander decomposition of $G$ and cuts at most $\mathcal{O} (\sqrt{\phi}\, m \log^{5/2}{m} )$ edges. The running time is $\mathcal{O} ( \frac{m+n\log{n}}{\phi}\lo
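
The theorem uses terms that this summary does not define. Under the standard convention (an assumption here, since the page omits the paper's own definitions), the conductance of a cut and the notion of a $\phi$-expander read:

```latex
\Phi_G(S) \;=\; \frac{w(S,\, V \setminus S)}{\min\{\operatorname{vol}(S),\ \operatorname{vol}(V \setminus S)\}},
\qquad
G \text{ is a } \phi\text{-expander if } \Phi_G(S) \ge \phi
\text{ for all } \emptyset \neq S \subsetneq V .
```

A $\phi$-expander decomposition is then a partition $V = V_1 \cup \dots \cup V_k$ such that every induced subgraph $G[V_i]$ is a $\phi$-expander; the edge bound in Theorem 1 counts the edges running between different parts $V_i$.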

Figures (11)

  • Figure 1: Relative improvement over Graclus (y-axis) vs. the ratio of the maximum degree and the median degree (x-axis). The value on the x-axis is larger if the graphs exhibit a distribution with large outliers. Note the negative trend, except for the outlier towards the right corresponding to instance SN7.
  • Figure 2: Running time vs. normalized cut for different choices of $\rho$ on two different graphs for $k=16$. Colors denote different levels of $\rho$, while shapes indicate the graph.
  • Figure 3: Geometric mean of the cut value $\theta$ across all graphs for each $k$ for $\textsf{XCut}_\textsf{mean}$, Graclus, METIS, and KaHiP.
  • Figure 4: Relative improvement over Graclus (y-axis) vs. the average degree divided by the median degree (x-axis). Larger x-values signify that the graph exhibits a skewed, power-law-like degree distribution. Note the negative trend, except for the outlier towards the right corresponding to instance SN7.
  • Figure 5: Percentage deviation of the returned normalized cut value relative to Graclus for $k=32$. This means that a value of -75% indicates that the normalized cut value is 75% lower (i.e., better). The thin black bars indicate the standard error across our runs. The top graph shows the disconnected IMDB graph (BP1), citation network instances (CN), email networks (EM), infrastructure graphs (IF), social networks (SN), and web graphs (WB), while the bottom shows the remaining instances. See the full version of the paper for details.
  • ...and 6 more figures
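
The captions of Figures 3 and 5 rely on two simple aggregates: the percentage deviation of a cut value relative to a baseline, and the geometric mean of cut values across graphs. A minimal sketch of both, with hypothetical helper names and made-up input values (not results from the paper), could look like this:

```python
import math


def percent_deviation(value, baseline):
    """Percentage deviation of `value` relative to `baseline`.

    Negative results mean `value` is lower (better, for normalized cut).
    """
    return 100.0 * (value - baseline) / baseline


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Made-up numbers for illustration only.
dev = percent_deviation(0.5, 2.0)  # a cut value 75% below the baseline
gm = geometric_mean([1.0, 4.0])
print(dev, gm)
```

The geometric mean assumes strictly positive inputs; a cut value of zero would need separate handling.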

Theorems & Definitions (16)

  • Theorem 1: Expander Decomposition
  • Theorem 2: Cut Procedure
  • Definition 1: Near $\phi$-expander
  • Lemma 1
  • Claim 1
  • Lemma 2
  • Lemma 3: Projection Lemma
  • Definition 2
  • Claim 2
  • Lemma 4
  • ...and 6 more