Restructuring Graph for Higher Homophily via Adaptive Spectral Clustering

Shouheng Li; Dongwoo Kim; Qing Wang

TL;DR

This work introduces a graph restructuring method, driven by adaptive spectral clustering, that boosts classical GNNs on less-homophilic graphs. It learns weights for pseudo-eigenvectors so that spectral embeddings align with node labels, uses spectrum slicers to avoid full eigendecomposition, and incorporates node features to derive discriminative embeddings. A density-aware homophily metric, $h_{\text{den}}$, is proposed that accounts for edge density and is robust to label imbalance; it guides edge rewiring to maximize homophily while controlling sparsity. Empirical results across six real-world and six synthetic datasets show substantial improvements over baseline GNNs, with average gains of around 25%, and give insight into the relationship between homophily and performance. The approach offers a practical, extensible way to use classical GNNs on heterophilic graphs, with potential extensions to robustness against over-smoothing and adversarial perturbations.

Abstract

While a growing body of literature studies new Graph Neural Networks (GNNs) that work on both homophilic and heterophilic graphs, little has been done to adapt classical GNNs to less-homophilic graphs. Although their ability to handle less-homophilic graphs is limited, classical GNNs still stand out for several desirable properties such as efficiency, simplicity, and explainability. In this work, we propose a novel graph restructuring method that can be integrated into any type of GNN, including classical GNNs, to leverage the benefits of existing GNNs while alleviating their limitations. Our contribution is threefold: a) learning the weights of pseudo-eigenvectors for an adaptive spectral clustering that aligns well with known node labels, b) proposing a new density-aware homophily metric that is robust to label imbalance, and c) reconstructing the adjacency matrix based on the result of adaptive spectral clustering to maximize the homophily score. The experimental results show that our graph restructuring method can significantly boost the performance of six classical GNNs by an average of 25% on less-homophilic graphs. The boosted performance is comparable to state-of-the-art methods.
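The restructuring idea in contribution c) can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: the homophily score is the classical edge-homophily ratio (fraction of intra-class edges), not the paper's density-aware metric; the node embedding is a hand-made stand-in for the output of adaptive spectral clustering; and `edge_homophily`, `rewire_top_k`, and the fixed `num_edges` are hypothetical names and choices made for illustration.

```python
import numpy as np

def edge_homophily(adj, labels):
    """Classical edge homophily: fraction of edges joining same-label nodes.
    (Assumption: this is NOT the paper's density-aware metric.)"""
    rows, cols = np.nonzero(np.triu(adj, k=1))  # each undirected edge once
    if rows.size == 0:
        return 0.0
    return float(np.mean(labels[rows] == labels[cols]))

def rewire_top_k(embedding, num_edges):
    """Hypothetical rewiring rule: connect the num_edges closest node pairs."""
    n = embedding.shape[0]
    diff = embedding[:, None, :] - embedding[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    rows, cols = np.triu_indices(n, k=1)
    order = np.argsort(dist[rows, cols], kind="stable")[:num_edges]
    new_adj = np.zeros((n, n), dtype=int)
    new_adj[rows[order], cols[order]] = 1
    return new_adj + new_adj.T  # symmetric adjacency

# Toy heterophilic graph: a 6-cycle whose labels alternate.
labels = np.array([0, 1, 0, 1, 0, 1])
adj = np.zeros((6, 6), dtype=int)
for i in range(6):
    j = (i + 1) % 6
    adj[i, j] = adj[j, i] = 1

# Hand-made embedding in which same-label nodes sit close together
# (a stand-in for a label-aligned spectral embedding).
embedding = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0],
                      [1.0, 1.1], [0.0, 0.1], [1.1, 1.0]])

new_adj = rewire_top_k(embedding, num_edges=6)
h_before = edge_homophily(adj, labels)
h_after = edge_homophily(new_adj, labels)
print(h_before, h_after)
```

On this toy graph every original edge is inter-class, while the six closest pairs in the embedding are all intra-class, so the rewired graph scores higher. Here the edge budget is fixed by hand; it is a free parameter of the sketch.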

Paper Structure (34 sections, 6 theorems, 28 equations, 8 figures, 5 tables, 1 algorithm)

Introduction
Discrepancy between SC and labels
Background
Spectral filtering.
Spectral Clustering (SC) as low-pass filtering.
Adaptive Spectral Clustering
Learning eigenvector coefficients
Eigendecomposition-free SC
Spectrum slicers.
SC with node features
Restructure graphs to maximize homophily
Complexity analysis
A New Homophily Measure
Empirical results
Datasets and models.
...and 19 more sections

Key Result

Proposition 2.1

Let $\epsilon,\beta > 0$ be given. If $\eta$ is larger than: then with probability at least $1-N^{-\beta}$, we have: $\forall (i,j)\in [1, N]^2$,

Figures (8)

Figure 1: Node clusters using different eigenvector choices on Wisconsin and Europe Airport. Colours represent node labels. Coordinates in \ref{subfig:wisconsin_low_sc} and \ref{subfig:europe_low_sc} are computed using three-dimensional t-SNE. In \ref{subfig:wisconsin_sc_manual} and \ref{subfig:europe_sc_manual}, many nodes overlap, so the plots appear to contain fewer nodes than they actually do.
Figure 2: Adaptive spectral clustering using spectrum slicers. In \ref{fig:workflow}, $w_1, ..., w_n$ are adaptive scalar parameters. In \ref{fig:slicers}, the spectrum range $[0,2]$ is sliced into equal-width pseudo-eigenvalues by a set of slicers with $s=20$.
Figure 3: Examples of graphs with different label-topology relationships and comparison of different homophily measures. The node colour represents the node label. Red edges connect nodes of different labels, while green edges connect nodes of the same label. Figures \ref{fig:homo1} to \ref{fig:homo3} show homophilic graphs of different densities. $h_{\text{den}}$ gives a higher score when a graph is dense, while the other metrics give the same scores. Figure \ref{fig:hete2} and Figure \ref{fig:hete3} are two graphs that consist only of inter-class edges but have different densities. Figure \ref{fig:imbalance1} is a label-imbalanced graph. Figure \ref{fig:regular1} and Figure \ref{fig:regular2} are two regular graphs: Figure \ref{fig:regular1} has an intra-class/inter-class edge ratio of $0.5$, and Figure \ref{fig:regular2} is an example of an Erdős-Rényi graph sampled with uniform edge probability.
Figure 4: Performance on synthetic datasets.
Figure 5: Homophily and accuracy of GCN on validation and test sets as a function of the number of edges. The optimal number of edges is chosen based on $h_\text{den}$ on the validation set.
...and 3 more figures
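
The slicing described in the Figure 2 caption can be sketched as follows, under stated assumptions. The sketch treats each slicer as an ideal rectangular band-pass filter on the normalized Laplacian, whose eigenvalues lie in $[0,2]$, and builds the filters from a dense eigendecomposition purely for readability; the paper's method is eigendecomposition-free, so this shows only what the slices represent, not how they are computed there. The function names and the hand-set weights are hypothetical; in the paper the scalars $w_1, ..., w_n$ are adaptive.

```python
import numpy as np

def normalized_laplacian(adj):
    """L = I - D^{-1/2} A D^{-1/2}; its eigenvalues lie in [0, 2]."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    return np.eye(adj.shape[0]) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def slice_spectrum(adj, num_slices=20):
    """Return one ideal band-pass projector per equal-width slice of [0, 2].
    Uses a dense eigendecomposition for illustration only."""
    eigvals, eigvecs = np.linalg.eigh(normalized_laplacian(adj))
    edges = np.linspace(0.0, 2.0, num_slices + 1)
    # Slice index of each eigenvalue; clipping absorbs round-off at 0 and 2.
    idx = np.clip(np.searchsorted(edges, eigvals, side="right") - 1,
                  0, num_slices - 1)
    filters = []
    for k in range(num_slices):
        u = eigvecs[:, idx == k]   # eigenvectors falling in slice k
        filters.append(u @ u.T)    # projector onto that band (zeros if empty)
    return filters

# Toy graph: two triangles joined by a single edge.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

filters = slice_spectrum(adj, num_slices=20)

# Hand-set weights that keep only the three lowest slices (a low-pass choice).
weights = np.zeros(20)
weights[:3] = 1.0

rng = np.random.default_rng(0)
signals = rng.standard_normal((6, 4))  # random input signals
embedding = sum(w * (p @ signals) for w, p in zip(weights, filters))
print(embedding.shape)
```

Because the slices partition the spectrum, the projectors sum to the identity, so setting every weight to 1 returns the input signals unchanged; other weight patterns emphasize or suppress individual frequency bands.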

Theorems & Definitions (8)

Proposition 2.1 (Tremblay et al., 2016)
Lemma 1
Lemma 2
Theorem 7.1: Johnson-Lindenstrauss Theorem
Lemma 2
proof
Lemma 2
proof
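
The Johnson-Lindenstrauss theorem listed above says that a random projection to a lower dimension approximately preserves pairwise distances with high probability. The following is a generic numerical illustration of that statement, not a reproduction of the paper's proposition or proofs; the point count, the dimensions, and the scaled Gaussian projection are assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
num_points, source_dim, target_dim = 50, 1000, 300

points = rng.standard_normal((num_points, source_dim))
# Gaussian random projection, scaled so squared norms are preserved in expectation.
projection = rng.standard_normal((source_dim, target_dim)) / np.sqrt(target_dim)
projected = points @ projection

def pairwise_distances(x):
    """Euclidean distances between all distinct pairs of rows of x."""
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    upper = np.triu_indices(x.shape[0], k=1)
    return np.sqrt(np.maximum(d2[upper], 0.0))

# Ratio of projected to original distance for every pair; values near 1
# indicate that the projection preserved that distance well.
ratios = pairwise_distances(projected) / pairwise_distances(points)
print(ratios.min(), ratios.max())
```

Increasing `target_dim` tightens the spread of the ratios around 1, which is the trade-off between embedding size and distortion that the theorem quantifies.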
