Provably Extending PageRank-based Local Clustering Algorithm to Weighted Directed Graphs with Self-Loops and to Hypergraphs

Zihao Li, Dongqi Fu, Hengyu Liu, Jingrui He

TL;DR

This paper extends the non-approximating Andersen-Chung-Lang ("ACL") algorithm beyond discrete graphs and generalizes its quadratic optimality to a wider range of graphs, including weighted, directed, and self-looped graphs as well as hypergraphs. It proposes two algorithms: GeneralACL for graphs and HyperACL for hypergraphs.

Abstract

Local clustering aims to find a compact cluster near the given starting instances. This work focuses on graph local clustering, which has broad applications beyond graphs because of the internal connectivities within various modalities. While most existing studies on local graph clustering adopt the discrete graph setting (i.e., unweighted graphs without self-loops), real-world graphs can be more complex. In this paper, we extend the non-approximating Andersen-Chung-Lang ("ACL") algorithm beyond discrete graphs and generalize its quadratic optimality to a wider range of graphs, including weighted, directed, and self-looped graphs as well as hypergraphs. Specifically, leveraging PageRank, we propose two algorithms: GeneralACL for graphs and HyperACL for hypergraphs. We theoretically prove that, under two mild conditions, both algorithms can identify a quadratically optimal local cluster in terms of conductance with probability at least 1/2. For hypergraphs, we address a fundamental gap in the literature by defining conductance from the perspective of hypergraph random walks. Additionally, we provide experiments to validate our theoretical findings.
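To make the notions of PageRank-based local clustering and conductance concrete, the following is a minimal sketch of the classic sweep-cut idea in the simplest setting: an undirected graph with symmetric edge weights. It is not GeneralACL or HyperACL, and it uses plain power iteration rather than ACL's approximate push procedure. The function names (`personalized_pagerank`, `conductance`, `sweep_cut`), the teleport value `alpha=0.15`, and the toy graph are all assumptions made for illustration.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=200):
    """Power iteration for p = alpha * s + (1 - alpha) * p W, with W = D^{-1} A."""
    nodes = list(adj)
    deg = {u: sum(adj[u].values()) for u in nodes}  # weighted degree
    p = {u: 1.0 if u == seed else 0.0 for u in nodes}
    for _ in range(iters):
        nxt = {u: (alpha if u == seed else 0.0) for u in nodes}
        for u in nodes:
            for v, w in adj[u].items():
                nxt[v] += (1 - alpha) * p[u] * w / deg[u]
        p = nxt
    return p, deg


def conductance(adj, deg, S):
    """Cut weight leaving S divided by the smaller of vol(S) and vol(complement)."""
    cut = sum(w for u in S for v, w in adj[u].items() if v not in S)
    vol_S = sum(deg[u] for u in S)
    vol_rest = sum(deg.values()) - vol_S
    return cut / min(vol_S, vol_rest)


def sweep_cut(adj, seed, alpha=0.15):
    """Scan prefixes of the degree-normalized PageRank ordering; keep the best one."""
    p, deg = personalized_pagerank(adj, seed, alpha)
    order = sorted(adj, key=lambda u: p[u] / deg[u], reverse=True)
    best_set, best_phi = None, float("inf")
    for k in range(1, len(order)):  # proper, non-empty prefixes only
        S = set(order[:k])
        phi = conductance(adj, deg, S)
        if phi < best_phi:
            best_set, best_phi = S, phi
    return best_set, best_phi


# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
adj = {
    0: {1: 1, 2: 1},
    1: {0: 1, 2: 1},
    2: {0: 1, 1: 1, 3: 1},
    3: {2: 1, 4: 1, 5: 1},
    4: {3: 1, 5: 1},
    5: {3: 1, 4: 1},
}
best_set, best_phi = sweep_cut(adj, seed=0)
```

Starting from vertex 0, the sweep recovers the seed's triangle, whose only outgoing edge is the bridge between the two triangles. Extending this picture to directed, self-looped, or hypergraph inputs requires the definitions developed in the paper and is not attempted here.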

Paper Structure

This paper contains 32 sections, 20 theorems, 120 equations, 2 figures, 3 tables, 2 algorithms.

Key Result

theorem 1

(For local clustering on graphs) Given any graph of the form $\mathcal{G} = (\mathcal{V}, \mathcal{A})$ with non-negative edge weights $\mathcal{A}_{u, v} \geq 0$ for any $u \in \mathcal{V}, v \in \mathcal{V}$, and positive vertex weights $\phi_u(\cdot)$ for any $u \in \mathcal{V}$. For a vertex set

Figures (2)

  • Figure 1: Conductance($\downarrow$) and F1($\uparrow$) Comparison on Local Clustering Task.
  • Figure 2: Time($\downarrow$) Comparison on Local Clustering Task.

Theorems & Definitions (59)

  • theorem 1
  • theorem 2
  • definition 1
  • definition 2
  • definition 3
  • definition 4
  • definition 5
  • definition 6
  • definition 7
  • theorem 5
  • ...and 49 more