Affinity Uncertainty-based Hard Negative Mining in Graph Contrastive Learning

Chaoxi Niu, Guansong Pang, Ling Chen

TL;DR

This work addresses the difficulty of identifying true hard negatives in graph contrastive learning, which arises from non-i.i.d. graph structure and oversmoothing. It introduces AUGCL, which learns an anchor-dependent hardness score for each negative by modeling affinity uncertainty across two groups of negatives, and integrates this score as a weighting term in contrastive losses. The resulting loss is shown to be theoretically equivalent to a triplet loss with an adaptive margin $m_{ij} = \frac{\tau}{2} \log(\alpha u_{ij})$. The framework combines an anchor-dependent binary partition of negatives with a Deep Gambler-based uncertainty estimator to produce a per-anchor uncertainty matrix $\mathbf{U}$ that informs the weights $w_{ij} = \alpha \phi_i(\widehat{z}_j; \Theta)$. Empirically, AUGCL consistently improves graph- and node-classification performance across ten graph datasets, enhances robustness to graph adversarial attacks, and shows favorable ablation results and stable hyperparameter behavior. The method is data-driven, avoids the prior assumptions that some earlier hardness-aware methods rely on, and plugs readily into existing GCL losses. Code is available at the provided repository.

Abstract

Hard negative mining has been shown to be effective in enhancing self-supervised contrastive learning (CL) on diverse data types, including graph CL (GCL). Existing hardness-aware CL methods typically treat the negative instances that are most similar to the anchor instance as hard negatives, which helps improve CL performance, especially on image data. On graph data, however, this approach often fails to identify the hard negatives and instead produces many false negatives. The main reason is that the learned graph representations are not sufficiently discriminative, owing to oversmoothed representations and/or the non-independent and identically distributed (non-i.i.d.) nature of graph data. To tackle this problem, this article proposes a novel approach that builds a discriminative model on collective affinity information (i.e., two sets of pairwise affinities between the negative instances and the anchor instance) to mine hard negatives in GCL. In particular, the proposed approach evaluates how confident/uncertain the discriminative model is about the affinity of each negative instance to an anchor instance, and uses this to determine the hardness weight of that negative relative to the anchor. This uncertainty information is then incorporated into existing GCL loss functions via a weighting term to enhance their performance. The enhanced GCL is theoretically grounded: the resulting GCL loss is equivalent to a triplet loss with an adaptive margin being exponentially proportional to the learned uncertainty of each negative instance. Extensive experiments on ten graph datasets show that our approach 1) consistently enhances different state-of-the-art (SOTA) GCL methods in both graph and node classification tasks and 2) significantly improves their robustness against adversarial attacks. Code is available at https://github.com/mala-lab/AUGCL.
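
To make the "weighting term" concrete, the following is a minimal sketch of how per-anchor hardness weights could scale the negative terms of an InfoNCE-style loss. It is not the authors' implementation: the function names `cosine` and `weighted_infonce`, the use of cosine similarity, the default temperature, and the choice to use only cross-view negatives are all assumptions made for illustration. The entry `weights[i][j]` stands in for $w_{ij} = \alpha u_{ij}$.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def weighted_infonce(view_a, view_b, weights, tau=0.5):
    """Hardness-weighted InfoNCE (illustrative sketch, assumed form).

    view_a[i] and view_b[i] form the positive pair for anchor i; every
    view_b[j] with j != i is treated as a negative whose term is scaled
    by weights[i][j]. Diagonal entries of weights are ignored.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        pos = math.exp(cosine(view_a[i], view_b[i]) / tau)
        neg = sum(
            weights[i][j] * math.exp(cosine(view_a[i], view_b[j]) / tau)
            for j in range(n) if j != i
        )
        total += -math.log(pos / (pos + neg))
    return total / n
```

With all weights set to 1 this reduces to a plain InfoNCE loss over cross-view negatives; raising the weight of a negative increases its contribution to the denominator, so the encoder is pushed harder to separate it from the anchor.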

Paper Structure

This paper contains 27 sections, 1 theorem, 7 equations, 4 figures, and 7 tables.

Key Result

Theorem 1

Let $u_{ij}=\phi_i(\widehat{z}_j;\Theta)$ be the affinity uncertainty-based hardness of a negative instance $\widehat{z}_j$ w.r.t. the anchor instance $\widetilde{z}_i$. When the projection function is an identity function and the positive instance is assumed to be more similar to the anchor than the negative instances, the resulting loss is equivalent to a triplet loss with an adaptive margin $m_{ij} = \frac{\tau}{2} \log(\alpha u_{ij})$ [...], where $\widetilde{z}_i^{'}$ is the normalized embedding.
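
As a small numerical illustration of the adaptive margin $m_{ij} = \frac{\tau}{2} \log(\alpha u_{ij})$ quoted in the TL;DR, the sketch below evaluates it for a few uncertainty values. The helper name `adaptive_margin` and the default values of `alpha` and `tau` are hypothetical and chosen only for the example.

```python
import math

def adaptive_margin(u, alpha=1.0, tau=0.5):
    """Margin m_ij = (tau / 2) * log(alpha * u_ij) for a single negative."""
    if alpha * u <= 0:
        raise ValueError("alpha * u must be positive for the log to exist")
    return 0.5 * tau * math.log(alpha * u)

# Print the margin for a few example uncertainty values (illustrative only).
for u in (0.1, 0.5, 1.0, 2.0):
    print(f"u={u:.1f}  margin={adaptive_margin(u):+.3f}")
```

Under this formula the margin increases monotonically with the uncertainty, is zero when $\alpha u_{ij} = 1$, and is negative when $\alpha u_{ij} < 1$, so negatives with low uncertainty are pushed away less strongly than those with high uncertainty.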

Figures (4)

  • Figure 1: (a): Two groups of data instances in blue and orange. (b): The affinity uncertainty-based hardness results learned by our approach using instance 11 or 26 as the anchor instance. Instances with a larger uncertainty are more likely to be hard negative samples w.r.t. the anchor instance. (c): The histograms of the similarity of the instances to the anchor instance 11. It is clear that treating the instances most similar to the anchor as the hard negatives can lead to many false negatives. (d): The uncertainty results learned by our approach for the instances w.r.t. the anchor instance 11, where true negatives, including hard negatives, have large uncertainty values (and thus large hardness weights) while false negatives receive very small uncertainty values.
  • Figure 2: Overview of our approach AUGCL. Left: AUGCL-based graph contrastive learning. The objective and the general procedure are the same as in existing GCL methods, but AUGCL leverages affinity uncertainty to learn anchor-dependent, hardness-based instance weights $\{w_{i1},w_{i2},\cdots, w_{iN}\}$ for all negative instances to improve existing GCL methods. Right: The proposed affinity uncertainty learning approach to obtain the weights. For an anchor $\widetilde{z}_i$, AUGCL first obtains collective affinity information (i.e., pairwise affinity across the instances) via a binary partition of its negative instances. It then uses this affinity information to learn an uncertainty estimator that evaluates how confident the estimator is about the affinity of each negative instance $\widehat{z}_j$ relative to the anchor instance $\widetilde{z}_i$. A larger affinity uncertainty value $u_{ij}$ indicates that $\widehat{z}_j$ is more likely to be a hard negative and thus receives a larger weight $w_{ij}$ ($w_{ij}=\alpha u_{ij}$, where $\alpha$ is a hyperparameter).
  • Figure 3: Sensitivity analysis of hyperparameters $\alpha$ and $o$.
  • Figure 4: Loss curve of the proposed method.
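
The uncertainty-estimation step described in the Figure 2 caption can be sketched with a gambler's-style loss, in which a classifier over the two partition groups gets one extra "abstain" output. The sketch below follows the general form of the Deep Gamblers objective and is not taken from the AUGCL code. It assumes that the hyperparameter $o$ from Figure 3 plays the role of the payoff, that the abstain probability is read off as $u_{ij}$, and that logits are ordered as `[group_0, group_1, abstain]`. All function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gambler_loss(logits, label, payoff=1.5):
    """Gambler's loss for one negative instance (assumed form).

    logits: three scores ordered as [group_0, group_1, abstain].
    label: 0 or 1, the group assigned by the anchor's binary partition.
    payoff: reward o, assumed here to lie between 1 and the number of groups.
    """
    probs = softmax(logits)
    # Betting on the assigned group pays `payoff`; abstaining keeps the stake.
    return -math.log(payoff * probs[label] + probs[-1])

def affinity_uncertainty(logits):
    """Read the abstain probability as the uncertainty score u_ij (assumption)."""
    return softmax(logits)[-1]
```

An instance that the estimator can confidently place in one group receives a small abstain probability and hence a small weight, while an instance it cannot place receives a large one, which matches the caption's reading of large $u_{ij}$ as a likely hard negative.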

Theorems & Definitions (2)

  • Definition 1: Affinity Uncertainty
  • Theorem 1