Enhancing Hyperedge Prediction with Context-Aware Self-Supervised Learning
Yunyong Ko, Hanghang Tong, Sang-Wook Kim
TL;DR
This work tackles hyperedge prediction on hypergraphs by addressing two core challenges: how to aggregate the nodes within each hyperedge candidate (C1) and how to mitigate inherent data sparsity (C2). It introduces CASH, a framework that combines context-aware node aggregation with self-supervised contrastive learning, supported by hyperedge-aware augmentation and dual (node-level and group-level) contrasts, to produce richer hyperedge representations. The approach employs a two-stage hypergraph encoder (node-to-edge, then edge-to-node) that yields context-rich embeddings for hyperedge candidates, and it optimizes a unified loss that combines the supervised prediction objective with the dual contrastive objective. Experiments on six real-world hypergraphs show that CASH consistently outperforms state-of-the-art baselines, remains robust across augmentation parameters, and scales near-linearly, which makes it practical for modeling group relations at large scale. Code and datasets are provided to support reproducibility and further work on hypergraph learning.
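To make the two-stage encoding concrete, the following is a minimal sketch of one node-to-edge and edge-to-node pass over an incidence matrix. It assumes plain mean aggregation with no learnable weights; the function name `two_stage_encode` and the toy data are illustrative and are not taken from the CASH code base.

```python
import numpy as np

def two_stage_encode(H, X):
    """One round of hypergraph message passing (illustrative, mean aggregation).

    H: (n_nodes, n_edges) binary incidence matrix, H[v, e] = 1 if node v is in hyperedge e.
    X: (n_nodes, d) node feature matrix.
    Returns (Z, E): updated node embeddings and hyperedge embeddings.
    """
    edge_size = H.sum(axis=0, keepdims=True).T   # (n_edges, 1) nodes per hyperedge
    node_deg = H.sum(axis=1, keepdims=True)      # (n_nodes, 1) hyperedges per node

    # Stage 1, node-to-edge: each hyperedge averages the features of its member nodes.
    E = (H.T @ X) / np.maximum(edge_size, 1)

    # Stage 2, edge-to-node: each node averages the embeddings of its incident hyperedges.
    Z = (H @ E) / np.maximum(node_deg, 1)
    return Z, E

# Toy hypergraph: 3 nodes, 2 hyperedges ({0, 1} and {1, 2}).
H = np.array([[1, 0],
              [1, 1],
              [0, 1]], dtype=float)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

Z, E = two_stage_encode(H, X)
```

Node 1 belongs to both hyperedges, so its updated embedding mixes information from both groups. A real encoder would add learnable transformations and nonlinearities around each stage.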
Abstract
Hypergraphs can naturally model group-wise relations (e.g., a group of users who co-purchase an item) as hyperedges. Hyperedge prediction aims to predict future or unobserved hyperedges, a fundamental task in many real-world applications (e.g., group recommendation). Despite recent breakthroughs in hyperedge prediction methods, the following challenges have rarely been studied: (C1) How to aggregate the nodes in each hyperedge candidate for accurate hyperedge prediction? and (C2) How to mitigate the inherent data sparsity problem in hyperedge prediction? To tackle both challenges together, we propose a novel hyperedge prediction framework (CASH) that employs (1) context-aware node aggregation to precisely capture complex relations among the nodes in each hyperedge for (C1) and (2) self-supervised contrastive learning in the context of hyperedge prediction to enhance hypergraph representations for (C2). Furthermore, for (C2), we propose a hyperedge-aware augmentation method that fully exploits the latent semantics behind the original hypergraph, and we consider both node-level and group-level contrasts (i.e., dual contrasts) for better node and hyperedge representations. Extensive experiments on six real-world hypergraphs reveal that CASH consistently outperforms all competing methods in hyperedge prediction accuracy and that each of the proposed strategies is effective in improving the accuracy of CASH. For further details on CASH, we provide the code and datasets at: https://github.com/yy-ko/cash.
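As a rough illustration of the contrastive idea, the sketch below computes a generic InfoNCE-style loss between embeddings of two augmented views. This is a standard contrastive formulation used here as an assumption, not the exact CASH objective; the function name `info_nce`, the temperature value, and the synthetic views are all hypothetical. The same function could be applied to node embeddings (node-level contrast) or to hyperedge embeddings (group-level contrast).

```python
import numpy as np

def info_nce(a, b, temperature=0.5):
    """Generic InfoNCE loss: row i of `a` is the positive for row i of `b`;
    every other row of `b` serves as a negative."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)   # unit-normalize each row
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature                   # scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                 # average over positive pairs

# Synthetic stand-ins for embeddings from two augmented views of the same hypergraph.
rng = np.random.default_rng(0)
view1 = rng.normal(size=(4, 8))
view2 = view1 + 0.01 * rng.normal(size=(4, 8))   # slightly perturbed copy of view1

aligned_loss = info_nce(view1, view2)            # matching rows paired as positives
misaligned_loss = info_nce(view1, view2[::-1])   # positives deliberately mismatched
```

When rows are correctly paired across views, the loss is lower than when the pairing is scrambled, which is the signal the encoder is trained to exploit. A dual-contrast objective in this spirit would sum one such term over nodes and one over hyperedges and add the result to the supervised prediction loss.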
