DiffGraph: Heterogeneous Graph Diffusion Model

Zongwei Li, Lianghao Xia, Hua Hua, Shijie Zhang, Shuangyang Wang, Chao Huang

TL;DR

DiffGraph tackles noise and complex cross-relational semantics in heterogeneous graphs by integrating a cross-view denoising framework with a latent diffusion mechanism operating on encoded representations. The model combines a GCN-based heterogeneous encoder with a learnable denoiser that propagates task-relevant signals from an auxiliary view to a target view, enabling robust link prediction and node classification. Empirical results on diverse public benchmarks and an industrial dataset demonstrate superior accuracy and robustness, with ablations validating the necessity of the diffusion module and cross-view design. The approach offers practical impact for real-world heterogeneous graph tasks and suggests future work on dynamic graphs and broader ethical considerations in data usage.

Abstract

Recent advances in Graph Neural Networks (GNNs) have revolutionized graph-structured data modeling, yet traditional GNNs struggle with complex heterogeneous structures prevalent in real-world scenarios. Despite progress in handling heterogeneous interactions, two fundamental challenges persist: noisy data significantly compromising embedding quality and learning performance, and existing methods' inability to capture intricate semantic transitions among heterogeneous relations, which impacts downstream predictions. To address these fundamental issues, we present the Heterogeneous Graph Diffusion Model (DiffGraph), a pioneering framework that introduces an innovative cross-view denoising strategy. This advanced approach transforms auxiliary heterogeneous data into target semantic spaces, enabling precise distillation of task-relevant information. At its core, DiffGraph features a sophisticated latent heterogeneous graph diffusion mechanism, implementing a novel forward and backward diffusion process for superior noise management. This methodology achieves simultaneous heterogeneous graph denoising and cross-type transition, while significantly simplifying graph generation through its latent-space diffusion capabilities. Through rigorous experimental validation on both public and industrial datasets, we demonstrate that DiffGraph consistently surpasses existing methods in link prediction and node classification tasks, establishing new benchmarks for robustness and efficiency in heterogeneous graph processing. The model implementation is publicly available at: https://github.com/HKUDS/DiffGraph.
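To make the latent-space forward and backward diffusion idea more concrete, the following is a minimal sketch of the standard Gaussian (DDPM-style) noising step applied to a toy latent embedding. It is not taken from the DiffGraph repository: the linear noise schedule, the function names, and the "oracle" denoiser that reuses the true noise are all assumptions made for illustration, and the cross-view conditioning on the auxiliary graph is omitted entirely.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule (a common DDPM default, not from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def forward_diffuse(z0, t, alpha_bars, rng):
    """Sample z_t = sqrt(alpha_bar_t) * z0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I)."""
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, noise)]
    return zt, noise


def recover_z0(zt, predicted_noise, t, alpha_bars):
    """Invert the forward step given a noise estimate (the job of a learned denoiser)."""
    a = alpha_bars[t]
    return [(z - math.sqrt(1.0 - a) * e) / math.sqrt(a)
            for z, e in zip(zt, predicted_noise)]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(50))
z0 = [0.5, -1.0, 2.0, 0.0]  # toy latent embedding of a single node
zt, eps = forward_diffuse(z0, 49, alpha_bars, rng)
# Oracle denoiser: pass the true noise, so the reconstruction is exact
# up to floating-point error. A trained model would predict eps instead.
z0_hat = recover_z0(zt, eps, 49, alpha_bars)
```

In a real model the noise passed to `recover_z0` would come from a trained network, which per the abstract would also draw on the auxiliary heterogeneous view; the sketch only shows why working on compact embeddings keeps both directions of the process to simple vector arithmetic.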

Paper Structure (26 sections, 14 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: Overall architecture of the proposed DiffGraph framework.
  • Figure 2: Ablation study for modules in DiffGraph.
  • Figure 3: Hyperparameter study in terms of Recall@20.
  • Figure 4: Test performance vs. training epochs.
  • Figure 5: Performance w.r.t. different data sparsity.
  • ...and 1 more figure