H$^3$GNNs: Harmonizing Heterophily and Homophily in GNNs via Joint Structural Node Encoding and Self-Supervised Learning

Rui Xue, Tianfu Wu

TL;DR

H$^3$GNNs address the dual challenge of heterophily and homophily in graph neural networks by marrying a joint structural node encoding with a self-supervised, teacher-student framework. The model blends linear and nonlinear feature projections with $K$-hop structural embeddings through Weighted GCNs and cross-attention, while a dynamic masking strategy guided by node difficulty drives robust representation learning. Theoretical guarantees show faster convergence than encoder-decoder SSL, and empirical results across seven benchmarks demonstrate state-of-the-art performance on heterophilic graphs and competitive results on homophilic graphs, all with efficient compute and memory footprints. These insights offer a scalable, generalizable approach to graph SSL that excels across mixed structural properties.

Abstract

Graph Neural Networks (GNNs) struggle to balance heterophily and homophily in representation learning, a challenge further amplified in self-supervised settings. We propose H$^3$GNNs, an end-to-end self-supervised learning framework that harmonizes both structural properties through two key innovations: (i) Joint Structural Node Encoding. We embed nodes into a unified space combining linear and non-linear feature projections with $K$-hop structural representations via a Weighted Graph Convolution Network (WGCN). A cross-attention mechanism enhances awareness and adaptability to heterophily and homophily. (ii) Self-Supervised Learning Using Teacher-Student Predictive Architectures with Node-Difficulty Driven Dynamic Masking Strategies. We use a teacher-student model: the student sees the masked input graph and predicts, in the joint encoding space, the node features inferred by the teacher, which sees the full input graph. To increase the difficulty of the learning task, we introduce two novel node-predictive-difficulty-based masking strategies. Experiments on seven benchmarks (four heterophily datasets and three homophily datasets) confirm the effectiveness and efficiency of H$^3$GNNs across diverse graph types. H$^3$GNNs achieve overall state-of-the-art performance on the four heterophily datasets while remaining on par with previous state-of-the-art methods on the three homophily datasets.
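
The teacher-student idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. A parameter-free neighbor-mean aggregator (`khop_mean`, a hypothetical helper) stands in for the learned joint encoder, the teacher and student share that stand-in, and "mask the nodes with the largest student-teacher error" is only one assumed reading of difficulty-driven masking. The abstract does not specify the two actual strategies.

```python
# Illustrative sketch only. Assumptions (not from the paper): a mean
# aggregator replaces the learned WGCN/cross-attention encoder, and the
# masking rule simply picks the nodes with the largest prediction error.

def khop_mean(features, neighbors, hops):
    """Average each node with its neighbors `hops` times (stand-in encoder)."""
    h = [row[:] for row in features]
    for _ in range(hops):
        nxt = []
        for v, row in enumerate(h):
            group = [row] + [h[u] for u in neighbors[v]]
            nxt.append([sum(col) / len(group) for col in zip(*group)])
        h = nxt
    return h

def apply_mask(features, masked):
    """Zero out the feature rows of masked nodes; copy the rest."""
    return [[0.0] * len(row) if v in masked else row[:]
            for v, row in enumerate(features)]

def node_difficulty(student_out, teacher_out):
    """Per-node squared error between student prediction and teacher target."""
    return [sum((s - t) ** 2 for s, t in zip(s_row, t_row))
            for s_row, t_row in zip(student_out, teacher_out)]

def difficulty_mask(difficulty, ratio):
    """Indices of the hardest nodes, i.e. those with the largest error."""
    k = max(1, int(len(difficulty) * ratio))
    order = sorted(range(len(difficulty)),
                   key=lambda v: difficulty[v], reverse=True)
    return set(order[:k])

# Toy path graph 0 - 1 - 2 - 3 with 2-dimensional node features.
neighbors = [[1], [0, 2], [1, 3], [2]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

current_mask = {0}                                   # initial mask (arbitrary)
teacher_out = khop_mean(features, neighbors, hops=1)  # teacher sees full graph
student_out = khop_mean(apply_mask(features, current_mask), neighbors, hops=1)

difficulty = node_difficulty(student_out, teacher_out)
loss = sum(difficulty) / len(difficulty)             # mean error over all nodes
next_mask = difficulty_mask(difficulty, ratio=0.5)   # hardest half of the nodes
print(sorted(next_mask), round(loss, 4))
```

On this toy graph the masked node 0 and its neighbor 1 show the largest student-teacher error, so they are selected for the next round. In a real setup the encoder would carry trainable parameters and the loss would be minimized by gradient descent on the student.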

Paper Structure

This paper contains 40 sections, 4 theorems, 13 equations, 6 figures, 8 tables.

Key Result

Theorem 1

Consider the optimization of encoder-decoder-based graph SSL in Eqn. eq:mae and our proposed H$^3$GNNs in Eqn. eq:loss under the same encoder architecture and the following assumptions/conditions: Then, the following three results hold:

Figures (6)

  • Figure 1: Illustration of our proposed H$^3$GNNs, a simple yet effective framework for SSL on graphs with mixed structural properties (heterophily and homophily). The bottom-right panel shows t-SNE visualizations on the Texas dataset of the raw node features and of the features learned by H$^3$GNNs, illustrating the effectiveness of the proposed method. Best viewed in color. See text for details.
  • Figure 2: Performance comparison across all datasets
  • Figure 3: t-SNE visualizations on the Wisconsin dataset.
  • Figure 4: t-SNE visualizations on the Texas dataset.
  • Figure 5: t-SNE visualizations on the Cornell dataset.
  • ...and 1 more figure

Theorems & Definitions (4)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4