When Heterophily Meets Heterogeneous Graphs: Latent Graphs Guided Unsupervised Representation Learning

Zhixiang Shen, Zhao Kang

TL;DR

Semantic heterophily in unsupervised heterogeneous graph learning is prevalent and under-addressed. The paper introduces LatGRL, a framework that builds homophilic and heterophilic latent graphs by coupling global structure and feature similarity, and applies adaptive dual-frequency semantic fusion to capture node-level heterophily. LatGRL optimizes mutual information between fused representations and latent graphs (via InfoNCE) and includes a scalable variant, LatGRL-S, for large graphs. It is validated on four public heterogeneous datasets and ogbn-mag, with strong node classification and clustering performance. The approach provides quantitative metrics for semantic homophily (MHR, NHR) and demonstrates that latent graphs offer supervision beyond traditional contrastive views, enabling robust unsupervised learning under semantic heterophily.
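
To make the InfoNCE objective mentioned above concrete, here is a minimal NumPy sketch of a graph-guided InfoNCE loss. It is not the authors' implementation: treating the edges of a latent graph as the positive pairs, the function name `info_nce`, and the temperature default `tau=0.5` are all assumptions made for illustration.

```python
import numpy as np

def info_nce(z, pos_mask, tau=0.5):
    """InfoNCE loss; pos_mask[i, j] = True marks node j as a positive for anchor i.

    Assumption (not from the paper): positives are the anchor's neighbors in a
    latent graph; every other node acts as a negative.
    """
    z = np.asarray(z, dtype=float)
    n = z.shape[0]
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-normalize embeddings
    logits = z @ z.T / tau                              # temperature-scaled cosine similarity
    np.fill_diagonal(logits, -np.inf)                   # drop self-pairs from the softmax

    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    pos = np.asarray(pos_mask, dtype=bool) & ~np.eye(n, dtype=bool)
    n_pos = pos.sum(axis=1)
    keep = n_pos > 0                                    # skip anchors without positives
    pos_log_prob = np.where(pos, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[keep] / n_pos[keep]).mean())
```

Minimizing this loss pulls each node's representation toward its latent-graph neighbors and away from all other nodes, which is one standard way to maximize a lower bound on mutual information.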

Abstract

Unsupervised heterogeneous graph representation learning (UHGRL) has gained increasing attention due to its significance in handling practical graphs without labels. However, heterophily has been largely ignored, despite its ubiquitous presence in real-world heterogeneous graphs. In this paper, we define semantic heterophily and propose an innovative framework called Latent Graphs Guided Unsupervised Representation Learning (LatGRL) to handle this problem. First, we develop a similarity mining method that couples global structures and attributes, enabling the construction of fine-grained homophilic and heterophilic latent graphs to guide the representation learning. Moreover, we propose an adaptive dual-frequency semantic fusion mechanism to address the problem of node-level semantic heterophily. To cope with the massive scale of real-world data, we further design a scalable implementation. Extensive experiments on benchmark datasets validate the effectiveness and efficiency of our proposed framework. The source code and datasets have been made available at https://github.com/zxlearningdeep/LatGRL.
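
The abstract does not spell out the similarity mining step, so the following is only a rough sketch of the general idea of coupling structure and attributes to build a pair of latent graphs. Everything specific here is an assumption: structural similarity is approximated by cosine similarity between adjacency rows (with self-loops), coupling is an element-wise product, and the homophilic and heterophilic graphs keep the `k` most and least similar nodes per row. The helper names `cosine_sim`, `knn_mask`, and `latent_graphs` are hypothetical.

```python
import numpy as np

def cosine_sim(m):
    """Pairwise cosine similarity between the rows of m."""
    norm = np.linalg.norm(m, axis=1, keepdims=True)
    m = m / np.clip(norm, 1e-12, None)      # guard against all-zero rows
    return m @ m.T

def knn_mask(score, k):
    """Boolean matrix marking the k highest-scoring columns of every row."""
    idx = np.argsort(-score, axis=1)[:, :k]
    mask = np.zeros(score.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return mask

def latent_graphs(adj, x, k=2):
    """Return (homophilic, heterophilic) latent graphs as boolean adjacency matrices."""
    n = adj.shape[0]
    # Assumed coupling: structural similarity times attribute similarity.
    s = cosine_sim(adj + np.eye(n)) * cosine_sim(x)
    off_diag = ~np.eye(n, dtype=bool)
    homo = knn_mask(np.where(off_diag, s, -np.inf), k)     # k most similar nodes
    hetero = knn_mask(np.where(off_diag, -s, -np.inf), k)  # k least similar nodes
    return homo, hetero
```

Dense pairwise similarity costs O(n^2) memory, which is presumably why a separate scalable implementation is needed for large graphs; this sketch makes no attempt at that.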

Paper Structure (24 sections, 31 equations, 7 figures, 10 tables, 1 algorithm)

Figures (7)

  • Figure 1: An example of semantic heterophily. The anchor node has distinct node-level semantic homophily ratios (NHR) across different meta-paths.
  • Figure 2: Node-level semantic homophily ratio distributions. Each real-world heterogeneous graph has diverse node neighborhood patterns.
  • Figure 3: Illustration for our proposed framework LatGRL. It uses the coupled similarity measurement to construct a duo of latent graphs, guiding the representation learning. Additionally, it employs dual-pass graph filtering for node-wise adaptive fusion to tackle the challenges posed by semantic heterophily in heterogeneous graphs.
  • Figure 4: Visualization of the learned node representation on ACM. The corresponding Silhouette scores are also given.
  • Figure 5: The experimental results on ogbn-mag.
  • ...and 2 more figures
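
The node-level semantic homophily ratio (NHR) referenced in Figures 1 and 2 can be illustrated with a short sketch. It assumes the usual node-homophily definition applied to one meta-path-induced graph at a time: the fraction of a node's meta-path neighbors that share its label. The function name and the toy graphs are hypothetical, and because the ratio needs labels it serves as a diagnostic, not as part of unsupervised training.

```python
import numpy as np

def node_homophily_ratio(adj, labels):
    """Fraction of each node's neighbors (under one meta-path graph) with the same label.

    Nodes without neighbors get NaN, since the ratio is undefined for them.
    """
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    same = (labels[:, None] == labels[None, :]).astype(float)
    deg = adj.sum(axis=1)                  # neighbor count per node
    same_deg = (adj * same).sum(axis=1)    # same-label neighbor count per node
    out = np.full(len(labels), np.nan)
    return np.divide(same_deg, deg, out=out, where=deg > 0)

# Toy example: the same four nodes under two hypothetical meta-path graphs.
labels = np.array([0, 0, 1, 1])
mp_a = np.array([[0, 1, 0, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])
mp_b = np.array([[0, 1, 1, 0],
                 [1, 0, 0, 1],
                 [1, 0, 0, 0],
                 [0, 1, 0, 0]])
nhr_a = node_homophily_ratio(mp_a, labels)   # every node is fully homophilic
nhr_b = node_homophily_ratio(mp_b, labels)   # the same nodes are mixed or heterophilic
```

The same anchor node receives different ratios under `mp_a` and `mp_b`, which is the situation Figure 1 describes.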