HMSG: Heterogeneous Graph Neural Network based on Metapath Subgraph Learning
Xinjun Cai, Jiaxing Shang, Fei Hao, Dajiang Liu, Linjiang Zheng
TL;DR
This work tackles representation learning on heterogeneous graphs by introducing HMSG, which decomposes a heterogeneous graph into multiple metapath-based subgraphs to capture distinct structural, semantic, and attribute information. It employs type-specific attribute transformations, per-subgraph node aggregations, and subgraph-level attention to fuse information from diverse metapath views into robust node embeddings. Empirical results across node classification, clustering, and link prediction demonstrate state-of-the-art performance over both homogeneous and heterogeneous baselines, underscoring the value of separate but jointly learned subgraphs. The approach offers practical implications for complex, real-world networks and provides a versatile framework for extending heterogeneous graph learning to dynamic settings and alternative subgraph strategies.
Abstract
Many real-world datasets can be represented as heterogeneous graphs with different types of nodes and connections. Heterogeneous graph neural network models aim to embed nodes or subgraphs into a low-dimensional vector space for downstream tasks such as node classification and link prediction. Although several models have been proposed recently, they either aggregate information only from neighbors of the same type or treat homogeneous and heterogeneous neighbors indiscriminately. Based on these observations, we propose a new heterogeneous graph neural network model, HMSG, to comprehensively capture structural, semantic, and attribute information from both homogeneous and heterogeneous neighbors. Specifically, we first decompose the heterogeneous graph into multiple metapath-based homogeneous and heterogeneous subgraphs, each of which carries specific semantic and structural information. Message aggregation methods are then applied to each subgraph independently, so that information can be learned in a more targeted and efficient manner. Through a type-specific attribute transformation, node attributes can also be transferred among different types of nodes. Finally, we fuse the information from the subgraphs to obtain the complete representation. Extensive experiments on several datasets for node classification, node clustering, and link prediction show that HMSG outperforms state-of-the-art baselines on all evaluation metrics.
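
The pipeline summarized above (metapath-based decomposition, type-specific transformation, per-subgraph aggregation, subgraph-level fusion) can be illustrated with a minimal sketch on a toy author-paper graph. This is not the authors' implementation: the abstract does not specify the aggregators or the attention form, so mean aggregation and a tanh-scored softmax are assumptions made here, and every name, weight, and feature value (`metapath_adjacency`, `attention_fuse`, `w_author`, `q`, and so on) is hypothetical.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def metapath_adjacency(a_ap):
    """Homogeneous Author-Paper-Author subgraph: link authors who share a paper."""
    a_pa = [list(col) for col in zip(*a_ap)]
    counts = matmul(a_ap, a_pa)
    n = len(counts)
    return [[1 if counts[i][j] > 0 and i != j else 0 for j in range(n)]
            for i in range(n)]

def mean_aggregate(adj, target_h, neighbor_h):
    """Assumed aggregator: average a node's own vector with its neighbors' vectors."""
    out = []
    for i, row in enumerate(adj):
        vectors = [target_h[i]] + [neighbor_h[j] for j, e in enumerate(row) if e]
        out.append([sum(col) / len(vectors) for col in zip(*vectors)])
    return out

def attention_fuse(views, q):
    """Assumed subgraph-level attention: score each view, softmax, weighted sum."""
    scores = [
        sum(sum(qk * math.tanh(x) for qk, x in zip(q, row)) for row in h) / len(h)
        for h in views
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    betas = [e / sum(exps) for e in exps]
    n, d = len(views[0]), len(views[0][0])
    fused = [[sum(b * h[i][k] for b, h in zip(betas, views)) for k in range(d)]
             for i in range(n)]
    return fused, betas

# Toy graph: 3 authors, 2 papers; a_ap[i][j] = 1 if author i wrote paper j.
a_ap = [[1, 0], [1, 1], [0, 1]]
author_x = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]   # 2-dim author attributes
paper_x = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]      # 3-dim paper attributes

# Type-specific linear maps into a shared 2-dim space (hypothetical weights).
w_author = [[1.0, 0.0], [0.0, 1.0]]
w_paper = [[0.5, 0.0], [0.0, 0.5], [0.1, 0.1]]
author_h = matmul(author_x, w_author)
paper_h = matmul(paper_x, w_paper)

# One homogeneous view (APA) and one heterogeneous view (AP), aggregated separately.
apa = metapath_adjacency(a_ap)
view_apa = mean_aggregate(apa, author_h, author_h)
view_ap = mean_aggregate(a_ap, author_h, paper_h)

# Fuse the two views into one embedding per author.
q = [0.3, -0.2]  # hypothetical attention vector
fused, betas = attention_fuse([view_apa, view_ap], q)
```

In this sketch the heterogeneous view lets paper attributes reach author nodes only because both node types are first projected into the same space, which is the role the abstract assigns to the type-specific attribute transformation.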
