Refining Latent Representations: A Generative SSL Approach for Heterogeneous Graph Learning
Yulan Hu, Zhirui Yang, Sheng Ouyang, Yong Liu
TL;DR
The paper tackles heterogeneous graph learning (HGL) by applying generative self-supervised learning to refine latent representations. It introduces HGVAE, a variational graph autoencoder that combines latent-space contrastive learning with a progressive negative sample generation (PNSG) mechanism and a robust ESCE reconstruction objective. Across multiple real-world datasets, HGVAE achieves state-of-the-art performance on node classification and competitive results on clustering, and ablations confirm that both latent-space contrastive learning and PNSG are crucial to these results. This work demonstrates the viability and benefits of refining latent representations through generative SSL in HGL, offering a principled pathway for future research on heterogeneous graphs.
Abstract
Self-Supervised Learning (SSL) has shown significant potential and attracted growing interest in graph learning. Its potential in Heterogeneous Graph Learning (HGL), however, remains relatively underexplored, particularly for generative SSL methods. Generative SSL uses an encoder to map the input graph into a latent representation and a decoder to recover the input graph from that representation. Previous SSL methods for HGL generally design complex strategies to capture graph heterogeneity, and these rely heavily on contrastive view construction, which is often non-trivial. Yet refining the latent representation in generative SSL can effectively improve graph learning results. In this study, we propose HGVAE, a generative SSL method specifically designed for HGL. Instead of designing complex strategies to capture heterogeneity, HGVAE centers on refining the latent representation. Specifically, HGVAE develops a contrastive task based on the latent representation. To ensure the hardness of negative samples, we develop a progressive negative sample generation (PNSG) mechanism that leverages the ability of Variational Inference (VI) to generate high-quality negative samples. As a pioneer in applying generative SSL to HGL, HGVAE refines the latent representation, thereby compelling the model to learn high-quality representations. Compared with various state-of-the-art (SOTA) baselines, HGVAE achieves impressive results, validating its superiority.
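
To make the idea of a contrastive task on the latent representation more concrete, the following is a minimal NumPy sketch, not the HGVAE implementation. It assumes a Gaussian posterior per node, draws latent samples with the standard reparameterization trick, and scores them with a generic InfoNCE loss. The function names, the temperature value, and the way negatives are drawn (samples from the posteriors of other nodes) are illustrative assumptions. In particular, the progressive schedule of PNSG is not reproduced here.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE loss with cosine similarity.

    anchor, positive: arrays of shape (n, d)
    negatives:        array of shape (n, k, d)
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    a, p, neg = normalize(anchor), normalize(positive), normalize(negatives)
    pos_logit = np.sum(a * p, axis=-1, keepdims=True) / temperature   # (n, 1)
    neg_logit = np.einsum("nd,nkd->nk", a, neg) / temperature         # (n, k)
    logits = np.concatenate([pos_logit, neg_logit], axis=1)           # (n, 1 + k)
    logits = logits - logits.max(axis=1, keepdims=True)               # numerical stability
    log_prob = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob.mean())


# Toy posterior parameters standing in for an encoder's output (hypothetical values).
rng = np.random.default_rng(0)
n, d, k = 8, 16, 4
mu = rng.standard_normal((n, d))
log_var = np.full((n, d), -2.0)

z_anchor = reparameterize(mu, log_var, rng)      # one sample per node
z_positive = reparameterize(mu, log_var, rng)    # second sample from the same posterior

# Assumption: negatives for node i are samples from the posteriors of k other nodes.
other = np.array([
    rng.choice(np.delete(np.arange(n), i), size=k, replace=False)
    for i in range(n)
])                                               # (n, k) node indices
z_negative = reparameterize(mu[other], log_var[other], rng)  # (n, k, d)

loss = info_nce(z_anchor, z_positive, z_negative)
```

In this sketch the positive and the anchor share a posterior, so the loss decreases as the encoder separates the posteriors of different nodes. Harder negatives, which the abstract attributes to PNSG, would correspond to samples that lie closer to the anchor in latent space.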
