
Provable Filter for Real-world Graph Clustering

Xuanting Xie, Erlin Pan, Zhao Kang, Wenyu Chen, Bingheng Li

TL;DR

This work finds that most homophilic and heterophilic edges can be correctly identified on the basis of neighbor information, and uses this finding to construct two graphs that are highly homophilic and highly heterophilic, respectively.

Abstract

Graph clustering, an important unsupervised problem, has been shown to be more resistant to advances in Graph Neural Networks (GNNs). In addition, almost all clustering methods focus on homophilic graphs and ignore heterophily. This significantly limits their applicability in practice, since real-world graphs exhibit structural disparity and cannot simply be classified as either homophilic or heterophilic. Thus, a principled way to handle practical graphs is urgently needed. To fill this gap, we provide a novel solution with theoretical support. Interestingly, we find that most homophilic and heterophilic edges can be correctly identified on the basis of neighbor information. Motivated by this finding, we construct two graphs that are highly homophilic and highly heterophilic, respectively. They are used to build low-pass and high-pass filters to capture holistic information. Important features are further enhanced by a squeeze-and-excitation block. We validate our approach through extensive experiments on both homophilic and heterophilic graphs. Empirical results demonstrate the superiority of our method compared to state-of-the-art clustering methods.
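
To make the low-pass/high-pass distinction concrete, the following is a minimal sketch of two generic spectral graph filters built from the symmetric normalized Laplacian $L = I - D^{-1/2} A D^{-1/2}$. The specific filter forms $(I - L/2)^k$ and $(L/2)^k$, and the function names `normalized_laplacian`, `low_pass`, and `high_pass`, are assumptions chosen for illustration; they are not claimed to match the filters $h_1(L)$ through $h_4(L)$ analyzed in the paper.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    # Isolated nodes (degree 0) get a zero scaling factor instead of 1/0.
    safe_deg = np.where(deg > 0, deg, 1.0)
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(safe_deg), 0.0)
    norm_adj = inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    return np.eye(adj.shape[0]) - norm_adj


def low_pass(adj, x, k=2):
    """Illustrative low-pass filter (I - L/2)^k X: keeps smooth signals."""
    lap = normalized_laplacian(adj)
    h = np.eye(lap.shape[0]) - 0.5 * lap
    return np.linalg.matrix_power(h, k) @ x


def high_pass(adj, x, k=2):
    """Illustrative high-pass filter (L/2)^k X: keeps signals that
    differ sharply between neighboring nodes."""
    lap = normalized_laplacian(adj)
    return np.linalg.matrix_power(0.5 * lap, k) @ x


if __name__ == "__main__":
    # 4-cycle: every node has degree 2.
    A = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
    smooth = np.ones((4, 1))                       # identical on all nodes
    alternating = np.array([[1.0], [-1.0], [1.0], [-1.0]])  # flips on every edge
    print(low_pass(A, smooth).ravel(), high_pass(A, smooth).ravel())
    print(low_pass(A, alternating).ravel(), high_pass(A, alternating).ravel())
```

On the 4-cycle, the constant signal passes through the low-pass filter unchanged and is removed by the high-pass filter, while the alternating signal behaves the opposite way. This is the intuition behind pairing a low-pass filter with a highly homophilic graph (neighbors share a cluster, so smooth signals are informative) and a high-pass filter with a highly heterophilic graph (neighbors belong to different clusters, so differences are informative).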

Paper Structure (23 sections, 1 theorem, 33 equations, 5 figures, 5 tables)

Key Result

Theorem 3.1

Assume that low-pass and high-pass filters are applied on the graphs with $r>\frac{1}{C}$ and $r<\frac{1}{C}$, respectively. Then the clusters would be more discriminative with $h_1(L)$ compared to $h_2(L)$, while $h_4(L)$ improves the discriminativeness of the clusters more than $h_3(L)$.

Figures (5)

  • Figure 1: An interesting observation: most homophilic and heterophilic edges can be correctly identified by neighbor information.
  • Figure 2: The overall architecture of our proposed method.
  • Figure 3: The effect of $k$ and $\mu$ on Cora and Washington with restructured graphs and raw graphs.
  • Figure 4: The effect of $\gamma_1$ and $\gamma_2$ on Cora (left) and Washington (right).
  • Figure 5: The results with masked features.

Theorems & Definitions (2)

  • Theorem 3.1
  • Proof