HMSG: Heterogeneous Graph Neural Network based on Metapath Subgraph Learning

Xinjun Cai, Jiaxing Shang, Fei Hao, Dajiang Liu, Linjiang Zheng

TL;DR

This work tackles representation learning on heterogeneous graphs by introducing HMSG, which decomposes a heterogeneous graph into multiple metapath-based subgraphs to capture distinct structural, semantic, and attribute information. It employs type-specific attribute transformations, per-subgraph node aggregations, and subgraph-level attention to fuse information from diverse metapath views into robust node embeddings. Empirical results across node classification, clustering, and link prediction demonstrate state-of-the-art performance over both homogeneous and heterogeneous baselines, underscoring the value of separate but jointly learned subgraphs. The approach offers practical implications for complex, real-world networks and provides a versatile framework for extending heterogeneous graph learning to dynamic settings and alternative subgraph strategies.

Abstract

Many real-world datasets can be represented as heterogeneous graphs with different types of nodes and connections. Heterogeneous graph neural network models aim to embed nodes or subgraphs into a low-dimensional vector space for downstream tasks such as node classification and link prediction. Although several models have been proposed recently, they either aggregate information only from neighbors of the same type, or treat homogeneous and heterogeneous neighbors indiscriminately. Based on these observations, we propose a new heterogeneous graph neural network model named HMSG to comprehensively capture structural, semantic and attribute information from both homogeneous and heterogeneous neighbors. Specifically, we first decompose the heterogeneous graph into multiple metapath-based homogeneous and heterogeneous subgraphs, each of which carries specific semantic and structural information. Message aggregation methods are then applied to each subgraph independently, so that information can be learned in a more targeted and efficient manner. Through a type-specific attribute transformation, node attributes can also be transferred among different types of nodes. Finally, we fuse the information from the subgraphs to obtain the complete representation. Extensive experiments on several datasets for node classification, node clustering and link prediction show that HMSG outperforms state-of-the-art baselines on all evaluation metrics.
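
The three stages named in the abstract (type-specific attribute transformation, per-subgraph aggregation, and fusion across subgraphs) can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption made for illustration: the toy node features, the projection matrices in `W`, the mean aggregator, and the scoring vector `q` are hypothetical stand-ins, not the aggregators or attention parameterization the paper actually uses.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw attributes: papers are 2-d, authors are 1-d.
features = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "a1": [2.0]}
node_type = {"p1": "paper", "p2": "paper", "a1": "author"}

# Step 1: type-specific transformation into a shared 2-d space
# (one assumed projection matrix per node type).
W = {"paper": [[1.0, 0.0], [0.0, 1.0]], "author": [[0.5], [0.5]]}
h = {v: matvec(W[node_type[v]], x) for v, x in features.items()}

# Step 2: aggregate the neighbors of target node p1 within each
# metapath subgraph independently (mean aggregation is a stand-in).
subgraph_neighbors = {"PAP": ["p1", "p2"], "PA": ["a1"]}
z = {name: mean([h[u] for u in nbrs])
     for name, nbrs in subgraph_neighbors.items()}

# Step 3: fuse the per-subgraph embeddings with softmax attention
# weights; the scoring vector q is assumed, not learned here.
q = [1.0, 1.0]
names = sorted(z)
scores = [sum(qi * zi for qi, zi in zip(q, z[n])) for n in names]
beta = softmax(scores)
embedding = [sum(b * z[n][i] for b, n in zip(beta, names))
             for i in range(len(q))]
```

In a trained model the projections and attention parameters would be learned jointly from the downstream loss; the sketch only shows how the pieces compose for a single target node.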

Paper Structure

This paper contains 19 sections, 15 equations, 2 figures, 6 tables, 1 algorithm.

Figures (2)

  • Figure 1: (a) An example heterogeneous graph with three types of nodes (i.e., authors, papers, venues). (b) Four examples of metapath: Paper-Author-Paper (PAP), Paper-Venue-Paper (PVP), Paper-Author (PA) and Paper-Venue (PV). (c) The metapath-based homogeneous graph and heterogeneous graph, respectively.
  • Figure 2: The overall architecture of HMSG. (a) Metapath-based subgraph generation; (b) Node aggregation within graphs; (c) Subgraph aggregation; (d) Loss function to be optimized.
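
The metapath examples in Figure 1 (PAP, PVP, PA, PV) can be made concrete with a small sketch of subgraph generation. The toy graph and the helper names `homogeneous_metapath_edges` and `heterogeneous_metapath_edges` are hypothetical, introduced only for illustration; the sketch assumes that a homogeneous metapath such as PAP links two papers sharing an intermediate node, while a heterogeneous metapath such as PA keeps the direct paper-to-author links.

```python
from collections import defaultdict

# Toy heterogeneous graph (hypothetical data).
paper_authors = {"p1": ["a1", "a2"], "p2": ["a2"], "p3": ["a3"]}
paper_venue = {"p1": "v1", "p2": "v2", "p3": "v1"}

def homogeneous_metapath_edges(paper_to_mid):
    """Link papers that share an intermediate node (e.g. PAP, PVP)."""
    mid_to_papers = defaultdict(set)
    for paper, mids in paper_to_mid.items():
        for mid in mids:
            mid_to_papers[mid].add(paper)
    edges = set()
    for papers in mid_to_papers.values():
        for p in papers:
            for q in papers:
                if p < q:  # keep each undirected pair once
                    edges.add((p, q))
    return edges

def heterogeneous_metapath_edges(paper_to_mid):
    """Keep direct paper-to-other-type links (e.g. PA, PV)."""
    return {(paper, mid)
            for paper, mids in paper_to_mid.items()
            for mid in mids}

venue_lists = {p: [v] for p, v in paper_venue.items()}

pap = homogeneous_metapath_edges(paper_authors)   # Paper-Author-Paper
pvp = homogeneous_metapath_edges(venue_lists)     # Paper-Venue-Paper
pa = heterogeneous_metapath_edges(paper_authors)  # Paper-Author
pv = heterogeneous_metapath_edges(venue_lists)    # Paper-Venue
```

Each resulting edge set corresponds to one subgraph, so an aggregation step can then be applied to each one separately.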

Theorems & Definitions (4)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4