Hypergraph Self-supervised Learning with Sampling-efficient Signals
Fan Li, Xiaoyang Wang, Dawei Cheng, Wenjie Zhang, Ying Zhang, Xuemin Lin
TL;DR
This paper tackles the inefficiency and sampling bias of existing hypergraph self-supervised learning by introducing SE-HSSL, a sampling-efficient framework that uses three signals: node-level and group-level CCA objectives, which are sampling-free, and a hierarchical membership-level contrast that exploits overlap structure. The approach relies on a shared HGNN encoder and two augmented views generated via node feature masking and membership masking, optimizing a joint objective that blends invariance and decorrelation terms while avoiding large numbers of negative samples. Empirical results across seven real-world hypergraphs show SE-HSSL achieves state-of-the-art or competitive performance in node classification and clustering, with substantial training speedups (at least 2x, often more) compared to the current SOTA TriCL. The work advances high-order hypergraph representation learning by reducing sampling bias and computational cost, enabling scalable SSL for complex hypergraph data with strong downstream utility.
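The two augmentations named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the masking probabilities, the choice to mask whole feature columns, and the dense incidence matrix `H` (nodes by hyperedges) are all assumptions made for illustration.

```python
import numpy as np

def mask_node_features(X, p, rng):
    """Zero out each feature column independently with probability p (assumed scheme)."""
    keep = rng.random(X.shape[1]) >= p   # one keep/drop decision per feature dimension
    return X * keep                      # broadcasts the column mask over all nodes

def mask_memberships(H, p, rng):
    """Drop each node-hyperedge membership (entry of H) with probability p."""
    keep = rng.random(H.shape) >= p      # one keep/drop decision per incidence entry
    return H * keep                      # zero entries stay zero, so no membership is added

def two_views(X, H, p_feat=0.2, p_mem=0.2, seed=0):
    """Return two independently augmented (features, incidence) views."""
    rng = np.random.default_rng(seed)
    view1 = (mask_node_features(X, p_feat, rng), mask_memberships(H, p_mem, rng))
    view2 = (mask_node_features(X, p_feat, rng), mask_memberships(H, p_mem, rng))
    return view1, view2
```

Both views would then be passed through the same encoder, which is what "shared HGNN encoder" refers to in the summary above; the encoder itself is outside the scope of this sketch.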
Abstract
Self-supervised learning (SSL) provides a promising alternative for representation learning on hypergraphs without costly labels. However, existing hypergraph SSL models are mostly based on contrastive methods with the instance-level discrimination strategy, suffering from two significant limitations: (1) They select negative samples arbitrarily, which is unreliable in deciding similar and dissimilar pairs, causing training bias. (2) They often require a large number of negative samples, resulting in expensive computational costs. To address the above issues, we propose SE-HSSL, a hypergraph SSL framework with three sampling-efficient self-supervised signals. Specifically, we introduce two sampling-free objectives that leverage canonical correlation analysis as the node-level and group-level self-supervised signals. Additionally, we develop a novel hierarchical membership-level contrast objective motivated by the cascading overlap relationship in hypergraphs, which can further reduce membership sampling bias and improve the efficiency of sample utilization. Through comprehensive experiments on 7 real-world hypergraphs, we demonstrate the superiority of our approach over the state-of-the-art method in terms of both effectiveness and efficiency.
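To make the idea of a sampling-free, CCA-based signal concrete, the following is a minimal NumPy sketch of a generic CCA-style self-supervised loss on two embedding matrices. The exact objective used by SE-HSSL is not given in this abstract, so the form below (an invariance term plus a weighted decorrelation term), the function names, and the weight `lam` are assumptions for illustration only. No negative pairs are sampled anywhere in the computation.

```python
import numpy as np

def standardize(Z, eps=1e-8):
    """Center each embedding dimension, scale to unit variance, divide by sqrt(N)."""
    Z = Z - Z.mean(axis=0)
    Z = Z / (Z.std(axis=0) + eps)
    return Z / np.sqrt(Z.shape[0])       # makes Z.T @ Z a correlation matrix

def cca_loss(Z1, Z2, lam=1e-3):
    """Generic CCA-style loss (assumed form): invariance + lam * decorrelation."""
    Z1, Z2 = standardize(Z1), standardize(Z2)
    # Invariance: embeddings of the same item in the two views should agree.
    invariance = np.sum((Z1 - Z2) ** 2)
    # Decorrelation: within each view, different dimensions should be uncorrelated.
    eye = np.eye(Z1.shape[1])
    decorrelation = np.sum((Z1.T @ Z1 - eye) ** 2) + np.sum((Z2.T @ Z2 - eye) ** 2)
    return invariance + lam * decorrelation
```

Under this reading, the same function could serve as a node-level signal when the rows of `Z1` and `Z2` are node embeddings, and as a group-level signal when they are hyperedge embeddings.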
