Hypergraphs as Weighted Directed Self-Looped Graphs: Spectral Properties, Clustering, Cheeger Inequality

Zihao Li, Dongqi Fu, Hengyu Liu, Jingrui He

TL;DR

The paper proves that the normalized hypergraph Laplacian is associated with the NCut value, which motivates the HyperClus-G algorithm for spectral clustering on EDVW hypergraphs. It further proves that HyperClus-G always finds an approximately linearly optimal partitioning in terms of both NCut and conductance.

Abstract

Hypergraphs naturally arise when studying group relations and have been widely used in the field of machine learning. To the best of our knowledge, the recently proposed edge-dependent vertex weights (EDVW) modeling is one of the most generalized modeling methods of hypergraphs, i.e., most existing hypergraph conceptual modeling methods can be generalized as EDVW hypergraphs without information loss. However, the relevant algorithmic developments on EDVW hypergraphs remain nascent: compared to the spectral theories for graphs, its formulations are incomplete, the spectral clustering algorithms are not well-developed, and the hypergraph Cheeger Inequality is not well-defined. To this end, deriving a unified random walk-based formulation, we propose our definitions of hypergraph Rayleigh Quotient, NCut, boundary/cut, volume, and conductance, which are consistent with the corresponding definitions on graphs. Then, we prove that the normalized hypergraph Laplacian is associated with the NCut value, which inspires our proposed HyperClus-G algorithm for spectral clustering on EDVW hypergraphs. Finally, we prove that HyperClus-G can always find an approximately linearly optimal partitioning in terms of both NCut and conductance. Additionally, we provide extensive experiments to validate our theoretical findings from an empirical perspective. Code of HyperClus-G is available at https://github.com/iDEA-iSAIL-Lab-UIUC/HyperClus-G.
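
To make the random-walk view concrete, the following is a minimal sketch, not the HyperClus-G implementation. It assumes the commonly used EDVW walk (from a vertex, pick an incident hyperedge with probability proportional to its weight, then pick a vertex inside that hyperedge with probability proportional to its edge-dependent weight), and it uses the generic Markov-chain notion of conductance, which may differ from the paper's definitions in normalization. The function names and the toy hypergraph are illustrative assumptions.

```python
import numpy as np

def edvw_transition_matrix(R, w):
    """Transition matrix of the assumed EDVW random walk.

    R: |E| x |V| array, R[e, v] = gamma_e(v) > 0 if v is in e, else 0.
    w: length-|E| array of positive hyperedge weights.
    """
    R = np.asarray(R, dtype=float)
    w = np.asarray(w, dtype=float)
    member = (R > 0).astype(float)      # |E| x |V| membership indicator
    W = member.T * w                    # |V| x |E|, W[v, e] = w(e) if v in e
    d_v = W.sum(axis=1)                 # vertex degrees
    delta_e = R.sum(axis=1)             # total vertex weight per hyperedge
    return (W / d_v[:, None]) @ (R / delta_e[:, None])

def stationary_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(vals - 1.0))
    pi = np.real(vecs[:, idx])
    return pi / pi.sum()

def conductance(P, pi, S):
    """Generic Markov-chain conductance of vertex set S (an assumption, see lead-in)."""
    in_S = np.zeros(P.shape[0], dtype=bool)
    in_S[list(S)] = True
    # Stationary probability flow leaving S in one step.
    flow_out = (pi[in_S][:, None] * P[np.ix_(in_S, ~in_S)]).sum()
    vol_S = pi[in_S].sum()
    return flow_out / min(vol_S, 1.0 - vol_S)

# Toy hypergraph (hypothetical): e0 = {0, 1}, e1 = {2, 3}, e2 = {1, 2}.
R = [[2, 1, 0, 0],
     [0, 0, 1, 2],
     [0, 1, 1, 0]]
w = [1.0, 1.0, 0.5]

P = edvw_transition_matrix(R, w)
pi = stationary_distribution(P)
phi = conductance(P, pi, [0, 1])
```

Because the walk can return to its current vertex within the same hyperedge, the resulting chain has self-loops and is in general not symmetric, which matches the weighted directed self-looped graph view in the title.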

Paper Structure

This paper contains 37 sections, 12 theorems, 61 equations, 2 figures, 13 tables, 1 algorithm.

Key Result

Theorem 1

(Algebraic connections among hypergraph NCut, Rayleigh Quotient, and Laplacian) Given any hypergraph $\mathcal{H}$ with vertex set $\mathcal{V}$ and hyperedge set $\mathcal{E}$ in the EDVW formatting, i.e., $\mathcal{H} = (\mathcal{V}, \mathcal{E}, \omega, \gamma)$ with positive edge weights $\omega$ ... Then, ...

Figures (2)

  • Figure 1: Undirected graphs $\subset$ EIVW hypergraphs $\subset$ EDVW hypergraphs. Pair-wise edges are naturally hyperedges; each EIVW hypergraph can be reformulated to an EDVW hypergraph by setting each vertex's weight to be the same across hyperedges, yet allowing different vertices to have different weights.
  • Figure 2: Logical flow of our work. "A->B" means A is required for developing B.
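
The containment in Figure 1 can be illustrated with a short sketch. It assumes an EDVW hypergraph is stored as an |E| x |V| matrix of edge-dependent vertex weights, and that a pairwise edge becomes a two-vertex hyperedge with unit vertex weights; both the storage layout and the function names are illustrative assumptions rather than the paper's notation.

```python
import numpy as np

def eivw_to_edvw(hyperedges, vertex_weight, n):
    """Build an EDVW weight matrix from an EIVW hypergraph.

    Each vertex keeps the same weight in every hyperedge that contains it,
    while different vertices may carry different weights.
    """
    R = np.zeros((len(hyperedges), n))
    for e, members in enumerate(hyperedges):
        for v in members:
            R[e, v] = vertex_weight[v]
    return R

def graph_to_edvw(edges, n):
    """Treat each pairwise edge as a size-2 hyperedge with unit vertex weights (assumed)."""
    return eivw_to_edvw(edges, np.ones(n), n)

# Hypothetical EIVW hypergraph on 4 vertices with 2 hyperedges.
R_eivw = eivw_to_edvw([[0, 1, 2], [1, 2, 3]], [1.0, 2.0, 0.5, 3.0], 4)

# Hypothetical path graph 0 - 1 - 2.
R_graph = graph_to_edvw([(0, 1), (1, 2)], 3)
```

A general EDVW hypergraph drops the restriction that each column of the matrix holds a single repeated nonzero value.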

Theorems & Definitions (36)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Definition 7
  • Theorem 8
  • Definition 9
  • Definition 10
  • ...and 26 more