Learn Beneficial Noise as Graph Augmentation

Siqi Huang, Yanchen Xu, Hongyuan Zhang, Xuelong Li

TL;DR

This work tackles the instability of graph augmentations in graph contrastive learning (GCL) by introducing PiNGDA, a framework that learns beneficial perturbations through the information-theoretic notion of $\pi$-noise. A Gaussian auxiliary variable ties the GCL loss to information entropy, revealing that standard predefined augmentations amount to a point estimate of $\pi$-noise and motivating trainable generators for both topology and node attributes. The method derives a differentiable loss $\mathcal{L}_{\pi}$ and demonstrates improved performance and stability across node-level, graph-level, and heterogeneous graph tasks, with favorable efficiency and interpretable learned augmentations. Overall, PiNGDA provides a principled, adaptable augmentation strategy that improves robustness and generalization in graph representation learning.

Abstract

Although graph contrastive learning (GCL) has been widely investigated, generating effective and stable graph augmentations remains a challenge. Existing methods often apply heuristic augmentations such as random edge dropping, which may disrupt important graph structures and lead to unstable GCL performance. In this paper, we propose Positive-incentive Noise driven Graph Data Augmentation (PiNGDA), where positive-incentive noise (pi-noise) provides an information-theoretic analysis of the beneficial effect of noise. To bridge standard GCL and the pi-noise framework, we design a Gaussian auxiliary variable that converts the loss function into information entropy. We prove that standard GCL with pre-defined augmentations is equivalent to estimating the beneficial noise via point estimation. Following this analysis, PiNGDA learns the beneficial noise on both topology and attributes through a trainable noise generator for graph augmentations, instead of relying on this simple estimate. Since the generator learns how to produce beneficial perturbations on graph topology and node attributes, PiNGDA is more reliable than existing methods. Extensive experimental results validate the effectiveness and stability of PiNGDA.
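To make the contrast between an original view and a noise-perturbed view concrete, here is a minimal sketch under strong assumptions. It uses a generic InfoNCE loss with temperature `tau`, not the paper's $\mathcal{L}_{\pi}$. The encoder is the identity map, and the per-dimension noise scale `sigma` is a fixed, hypothetical stand-in for what a trainable attribute generator would output. The function names `info_nce` and `perturb` are illustrative and do not come from the paper.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)


def info_nce(z_orig, z_noisy, tau=0.5):
    """Mean InfoNCE loss: node i in one view is the positive for node i in the other."""
    n = len(z_orig)
    total = 0.0
    for i in range(n):
        logits = [cosine(z_orig[i], z_noisy[j]) / tau for j in range(n)]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / n


def perturb(features, sigma, rng):
    """Add zero-mean Gaussian noise with per-dimension scale sigma.

    Assumption: sigma is fixed here; in a learned setting it would be
    produced by a trainable generator and updated by gradient descent.
    """
    return [[x + s * rng.gauss(0.0, 1.0) for x, s in zip(row, sigma)]
            for row in features]


rng = random.Random(0)
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # toy node attributes
sigma = [0.1, 0.1, 0.1]                                          # hypothetical noise scales
noisy = perturb(features, sigma, rng)
loss = info_nce(features, noisy, tau=0.5)
```

In this toy setting, a heuristic augmentation corresponds to fixing `sigma` (or an edge-drop rate) by hand, whereas a learned approach would treat it as a parameter optimized jointly with the encoder.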

Paper Structure

This paper contains 32 sections, 25 equations, 5 figures, 9 tables, and 1 algorithm.

Figures (5)

  • Figure 1: The PiNGDA framework, which consists of a $\pi$-noise generator and a contrastive learning module. The core innovation is the joint noise generator, which has two complementary components: a topology generator that perturbs graph structure and an attribute generator that perturbs node features. A contrastive loss is applied between the representations of the original graph and the noisy graph. By jointly training the generator and the encoder, PiNGDA dynamically learns perturbation patterns tailored to downstream tasks.
  • Figure 2: Comparison of per-epoch training time and memory cost across graph representation learning methods. On Amazon-Photo, the model is trained with a batch size of 256 due to the memory limit.
  • Figure 3: Visualization of noise in graph data. Nodes are sampled from the graph and colored according to their labels. Edges are drawn to distinguish intra-class (solid gray) from inter-class (dashed red) relationships. The node layout is determined by connectivity, and edge width corresponds to the learned weight.
  • Figure 4: The effect of the temperature $\tau$.
  • Figure 5: The effect of the hidden dimension.