CHGNN: A Semi-Supervised Contrastive Hypergraph Learning Network

Yumeng Song, Yu Gu, Tianyi Li, Jianzhong Qi, Zhenghao Liu, Christian S. Jensen, Ge Yu

TL;DR

CHGNN tackles semi-supervised node classification on hypergraphs by uniting contrastive self-supervision with label information. It introduces an adaptive hypergraph view generator to create diverse, informative hypergraph views and a hyperedge homogeneity-aware HyperGNN to preserve higher-order dependencies via $homo(e)$. The training objective combines a similarity loss for the views, a supervised classification loss, a hyperedge homogeneity loss, and both basic and cross-validation contrastive losses, with an enhanced strategy that adaptively distances negative samples. Experiments on nine real-world hypergraph datasets show CHGNN consistently outperforms 13–19 competitive methods, demonstrating robustness under low-label scenarios and complex higher-order relationships. This approach highlights the value of leveraging hyperedge semantics and cross-type contrasts for scalable, accurate hypergraph learning.

Abstract

Hypergraphs can model higher-order relationships among data objects that are found in applications such as social networks and bioinformatics. However, recent studies on hypergraph learning that extend graph convolutional networks to hypergraphs cannot learn effectively from features of unlabeled data. To enable such learning, we propose a contrastive hypergraph neural network, CHGNN, that exploits self-supervised contrastive learning techniques to learn from labeled and unlabeled data. First, CHGNN includes an adaptive hypergraph view generator that adopts an auto-augmentation strategy and learns a perturbed probability distribution of minimal sufficient views. Second, CHGNN encompasses an improved hypergraph encoder that considers hyperedge homogeneity to fuse information effectively. Third, CHGNN is equipped with a joint loss function that combines a similarity loss for the view generator, a node classification loss, and a hyperedge homogeneity loss to inject supervision signals. It also includes basic and cross-validation contrastive losses, associated with an enhanced contrastive loss training process. Experimental results on nine real datasets offer insight into the effectiveness of CHGNN, showing that it consistently outperforms 13 competitors in terms of classification accuracy.

Paper Structure (32 sections, 1 theorem, 27 equations, 6 figures, 5 tables)

This paper contains 32 sections, 1 theorem, 27 equations, 6 figures, 5 tables.

Key Result

Lemma 1

Given a positive sample $p$, a training instance $q$, and a negative sample set $N=\{n_1, n_2\}$ with $s_z(q, n_{1}) > s_z(q, n_{2})$, we have $r_{n_1} > r_{n_2}$ if $\tau_{n_1}=\tau^{ub}/s_z(q,n_1)$ and $\tau_{n_2}=\tau^{ub}/s_z(q,n_2)$, where $\tau^{ub}$ is the upper bound of the temperature pa
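
The temperature rule in Lemma 1 can be checked numerically with a small sketch. The excerpt does not define $r_n$, so the code below assumes it is the softmax share of each negative, $r_n = \exp(s_n/\tau_n) / \sum_k \exp(s_k/\tau_k)$, and that similarities are positive; the function names and the example values are hypothetical and are not taken from the paper.

```python
import math

def adaptive_temperatures(sims, tau_ub):
    """Per-negative temperature tau_n = tau_ub / s(q, n), as in the lemma's condition.

    Assumes every similarity is strictly positive.
    """
    return [tau_ub / s for s in sims]

def relative_penalties(sims, taus):
    """Assumed definition of r_n: softmax over the scaled similarities s_n / tau_n."""
    logits = [s / t for s, t in zip(sims, taus)]
    shift = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - shift) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: n_1 is more similar to q than n_2.
sims = [0.8, 0.3]
tau_ub = 0.5

taus = adaptive_temperatures(sims, tau_ub)
r = relative_penalties(sims, taus)

print("temperatures:", taus)
print("relative penalties:", r)
```

Under these assumptions the more similar negative receives the smaller temperature, so its scaled similarity $s_n^2/\tau^{ub}$ and its share of the penalty are both larger, which matches the ordering $r_{n_1} > r_{n_2}$ stated in the lemma.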

Figures (6)

  • Figure 1: Comparison of HyperGNN and CHGNN learning, where (i) colored, solid circles denote nodes with class labels; (ii) colored, dashed circles denote nodes with model-inferred classes; (iii) uncolored circles denote nodes with model-inferred classes but that are not involved in the loss calculation; (iv) dashed regions denote hyperedges; (v) red and green arrows denote class label information propagation; and (vi) orange arrows denote contrastive information propagation.
  • Figure 2: CHGNN model overview (best viewed in color), where $\mathcal{C}(\cdot)$ is the classifier and $\mathcal{R}(\cdot)$ is the regressor. The input is a hypergraph $G$. The output is node embedding matrices $H_v^i$ and hyperedge embedding matrices $H_e^i$ of augmented hypergraphs $G_i\,(i=1,2)$. Step 1 augments the hypergraph using adaptive hypergraph view generators. Step 2 iteratively embeds views with H-HyperGNNs and trains the model using the two proposed semi-supervised and two contrastive losses.
  • Figure 3: Aggregation considering node homogeneity. (i) Each dashed circle represents a hyperedge, and each node in a circle represents an individual paper. (ii) Nodes in the same circle, or hyperedge, share an author. (iii) The hyperedge coloring represents distinct aggregation weights in CHGNN: darker colors indicate higher weights, i.e., $e_1$ and $e_3$ have higher aggregation weights than $e_2$ in (b). (iv) Nodes are colored to denote different domains, i.e., $v_2$, $v_3$, $v_4$, and $v_5$ belong to separate domains, unlike $v_1$, $v_2$, $v_6$, $v_7$, and $v_i$.
  • Figure 4: An example of enhanced contrastive loss training. (i) Each circle represents a sample. (ii) Circles with the same color as $q$ are in the same class as $q$. Circles with different colors are in different classes, with colors closer to $q$ indicating a higher similarity to $q$. (iii) Green arrows indicate attraction, while blue and red arrows indicate repulsion in CL, with red indicating stronger repulsion.
  • Figure 5: The impact of imbalanced labels.
  • ...and 1 more figure

Theorems & Definitions (17)

  • Example 1
  • Example 2
  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Example 3
  • Definition 7
  • ...and 7 more