
Enhancing Hyperedge Prediction with Context-Aware Self-Supervised Learning

Yunyong Ko, Hanghang Tong, Sang-Wook Kim

TL;DR

This work tackles hyperedge prediction on hypergraphs by addressing two core challenges: how to aggregate the nodes within each hyperedge candidate (C1) and how to mitigate inherent data sparsity (C2). It introduces CASH, a framework that combines context-aware node aggregation with self-supervised contrastive learning, supported by hyperedge-aware augmentation and dual node-level and group-level contrasts, to produce richer hyperedge representations. The approach employs a two-stage hypergraph encoder (node-to-hyperedge, then hyperedge-to-node) that yields context-rich embeddings for hyperedge candidates, and it optimizes a unified loss that blends supervised prediction with a dual contrastive objective. Experiments on six real-world hypergraphs show that CASH consistently outperforms state-of-the-art baselines, remains robust across augmentation parameters, and scales near-linearly, which makes it practical for large-scale modeling of group relations. The authors provide code and datasets to support reproducibility and further exploration in hypergraph learning.

Abstract

Hypergraphs can naturally model group-wise relations (e.g., a group of users who co-purchase an item) as hyperedges. Hyperedge prediction aims to predict future or unobserved hyperedges and is a fundamental task in many real-world applications (e.g., group recommendation). Despite recent breakthroughs in hyperedge prediction methods, the following challenges have rarely been studied: (C1) How to aggregate the nodes in each hyperedge candidate for accurate hyperedge prediction? and (C2) How to mitigate the inherent data sparsity problem in hyperedge prediction? To tackle both challenges together, in this paper, we propose a novel hyperedge prediction framework (CASH) that employs (1) context-aware node aggregation to precisely capture complex relations among nodes in each hyperedge for (C1) and (2) self-supervised contrastive learning in the context of hyperedge prediction to enhance hypergraph representations for (C2). Furthermore, for (C2), we propose a hyperedge-aware augmentation method to fully exploit the latent semantics behind the original hypergraph and consider both node-level and group-level contrasts (i.e., dual contrasts) for better node and hyperedge representations. Extensive experiments on six real-world hypergraphs reveal that CASH consistently outperforms all competing methods in hyperedge prediction accuracy and that each of the proposed strategies is effective in improving the model accuracy of CASH. For further details on CASH, we provide the code and datasets at: https://github.com/yy-ko/cash.
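
To make the two-stage encoding idea concrete, the following is a minimal sketch of one node-to-hyperedge pass followed by one hyperedge-to-node pass. It is not the CASH encoder: plain mean aggregation stands in for whatever learned aggregation the authors use, and the names `two_stage_pass`, `node_feats`, and `hyperedges` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def two_stage_pass(
    node_feats: Dict[str, Vector],
    hyperedges: Dict[str, List[str]],
) -> Tuple[Dict[str, Vector], Dict[str, Vector]]:
    """One round of message passing (assumption: mean aggregation, no learned weights)."""
    # Stage 1 (node-to-hyperedge): each hyperedge averages its member nodes.
    edge_emb = {
        e: mean([node_feats[v] for v in members])
        for e, members in hyperedges.items()
    }
    # Stage 2 (hyperedge-to-node): each node averages the hyperedges it belongs to.
    node_emb = {}
    for v, feat in node_feats.items():
        incident = [edge_emb[e] for e, members in hyperedges.items() if v in members]
        # Isolated nodes keep their input features unchanged.
        node_emb[v] = mean(incident) if incident else list(feat)
    return node_emb, edge_emb


# Toy hypergraph: two items, each co-purchased by a group of users.
node_feats = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [1.0, 1.0]}
hyperedges = {"item_a": ["u1", "u2"], "item_b": ["u2", "u3"]}
node_emb, edge_emb = two_stage_pass(node_feats, hyperedges)
print(edge_emb)
print(node_emb)
```

In this toy example, user `u2` belongs to both hyperedges, so its updated embedding mixes information from both groups, which is the kind of group context a hypergraph encoder is meant to propagate.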

Paper Structure

This paper contains 22 sections, 7 equations, 7 figures, 4 tables, 1 algorithm.

Figures (7)

  • Figure 1: Group-wise relations in e-commerce networks modeled as (a) a graph and (b) a hypergraph, where each hyperedge represents an item co-purchased by a group of users.
  • Figure 2: The overview of CASH: (1) Context-aware hyperedge prediction (upper) and (2) Self-supervised contrastive hypergraph learning (lower).
  • Figure 3: Comparison of (a) random membership masking with (b) our hyperedge-aware membership masking.
  • Figure 4: The impact of dual contrastive learning on the hyperedge prediction accuracy of CASH as the control hyperparameter $\beta$ varies. The auxiliary task is consistently beneficial to hyperedge prediction across a wide range of $\beta$ values.
  • Figure 5: The hyperparameter sensitivity of CASH to the membership and node feature masking rates $p_{m}$ and $p_{f}$. CASH achieves high accuracy with a wide range of $p_{m}$ and $p_{f}$ values (i.e., the blue wide area on the surface).
  • ...and 2 more figures
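
The contrast that Figure 3 draws between random and hyperedge-aware membership masking can be illustrated with a small sketch. The exact masking rule used by CASH is not given in this summary, so the version below rests on an assumption: the hyperedge-aware variant masks roughly a fraction $p_{m}$ of the members of each hyperedge separately and always keeps at least one member, whereas random masking drops memberships independently and can empty a small hyperedge. The function names are hypothetical.

```python
import math
import random
from typing import Dict, List

Hypergraph = Dict[str, List[str]]


def random_membership_masking(hyperedges: Hypergraph, p_m: float, rng: random.Random) -> Hypergraph:
    """Drop each (node, hyperedge) membership independently with probability p_m."""
    # A small hyperedge can lose all of its members under this scheme.
    return {
        e: [v for v in members if rng.random() >= p_m]
        for e, members in hyperedges.items()
    }


def hyperedge_aware_masking(hyperedges: Hypergraph, p_m: float, rng: random.Random) -> Hypergraph:
    """Mask about p_m of the members of each hyperedge (assumed rule, see lead-in)."""
    masked = {}
    for e, members in hyperedges.items():
        # Never drop the last remaining member of a hyperedge.
        n_drop = max(0, min(math.floor(p_m * len(members)), len(members) - 1))
        dropped = set(rng.sample(members, n_drop))
        masked[e] = [v for v in members if v not in dropped]
    return masked


toy = {"e1": ["a", "b", "c", "d"], "e2": ["a", "e"], "e3": ["f"]}
rng = random.Random(0)
aware = hyperedge_aware_masking(toy, p_m=0.5, rng=rng)
print({e: len(members) for e, members in aware.items()})
```

Under this assumed rule, every hyperedge survives augmentation with a size proportional to its original size, so the augmented view keeps the group structure that random masking may destroy.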