Data Augmentation of Contrastive Learning is Estimating Positive-incentive Noise
Hongyuan Zhang, Yanchen Xu, Sida Huang, Xuelong Li
TL;DR
This work reframes data augmentation in contrastive learning as learning beneficial noise, introducing Positive-incentive Noise (Pi-noise) and its Gaussian surrogate to connect augmentation to task mutual information. It proves that standard augmentations are effectively point estimates of Pi-noise and develops PiNDA, a learnable Pi-noise generator that produces augmentation views without assuming data type, while remaining compatible with existing contrastive models. Theoretical developments link task entropy to the contrastive loss via an auxiliary variable, enabling a practical optimization that reduces to maximizing a contrastive objective while learning the augmentation distribution. Experimental results on non-vision and vision datasets show PiNDA improves classification or retrieval metrics, with visualizations illustrating that the learned augmentations resemble meaningful style or background changes and converge rapidly. Overall, PiNDA offers a general, unsupervised approach to data augmentation that can extend contrastive learning to diverse domains and improve augmentation stability and effectiveness.
Abstract
Inspired by the idea of Positive-incentive Noise (Pi-Noise or $\pi$-noise), which aims at learning reliable noise that is beneficial to a task, we investigate the connection between contrastive learning and $\pi$-noise. By converting the contrastive loss into an auxiliary Gaussian distribution that quantitatively measures the difficulty of a specific contrastive model within an information-theoretic framework, we define the task entropy of contrastive learning, the core concept of $\pi$-noise. We further prove that the predefined data augmentations in the standard contrastive learning paradigm can be regarded as a kind of point estimation of $\pi$-noise. Motivated by this theoretical study, we propose a framework that trains a $\pi$-noise generator to learn the beneficial noise (instead of estimating it) and uses that noise as the data augmentation for contrast. The framework can be applied to diverse types of data and is fully compatible with existing contrastive models. Visualizations show, somewhat surprisingly, that the proposed method successfully learns effective augmentations.
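To make the idea of a learnable noise generator concrete, the following is a minimal forward-pass sketch and not the authors' implementation. It assumes an additive Gaussian noise model sampled with the reparameterization trick, a linear encoder, and an InfoNCE-style contrastive loss. All names (`sample_pi_noise`, `info_nce`, `W_mu`, `W_logsig`, `W_enc`) and all sizes are hypothetical, and no training loop is shown.

```python
import numpy as np

def l2_normalize(z, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + eps)

def sample_pi_noise(x, W_mu, W_logsig, rng):
    # Hypothetical generator: a linear map predicts a per-sample mean and
    # log standard deviation; noise is drawn via the reparameterization trick.
    mu = x @ W_mu
    log_sigma = x @ W_logsig
    z = rng.standard_normal(x.shape)
    return mu + np.exp(log_sigma) * z

def info_nce(h1, h2, temperature=0.5):
    # InfoNCE-style loss: row i of h1 should match row i of h2 and
    # mismatch every other row of h2.
    h1, h2 = l2_normalize(h1), l2_normalize(h2)
    logits = h1 @ h2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
n, d, k = 8, 16, 4                               # batch, input dim, embedding dim (assumed)
x = rng.standard_normal((n, d))                  # stand-in for a batch of raw samples
W_mu = 0.01 * rng.standard_normal((d, d))        # generator weights (assumed linear)
W_logsig = 0.01 * rng.standard_normal((d, d))
W_enc = rng.standard_normal((d, k))              # stand-in linear encoder

noise = sample_pi_noise(x, W_mu, W_logsig, rng)
view = x + noise                                 # augmented view: assumed additive noise
loss = info_nce(x @ W_enc, view @ W_enc)
```

In a trainable version, the generator and encoder parameters would be updated jointly by gradient descent on the contrastive objective in an autodiff framework; the reparameterized sampling is what would allow gradients to reach the generator.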
