Maximizing Incremental Information Entropy for Contrastive Learning

Jiansong Zhang; Zhuoqin Yang; Xu Wu; Xiaoling Luo; Peizhong Liu; Linlin Shen

Abstract

Contrastive learning has achieved remarkable success in self-supervised representation learning, often guided by information-theoretic objectives such as mutual information maximization. Motivated by the limitations of static augmentations and rigid invariance constraints, we propose IE-CL (Incremental-Entropy Contrastive Learning), a framework that explicitly optimizes the entropy gain between augmented views while preserving semantic consistency. Our theoretical framework reframes the challenge by identifying the encoder as an information bottleneck and proposes a joint optimization of two components: a learnable transformation that generates entropy and an encoder regularizer that preserves it. Experiments on CIFAR-10/100, STL-10, and ImageNet demonstrate that IE-CL consistently improves performance under small-batch settings. Moreover, our core modules can be seamlessly integrated into existing frameworks. This work bridges theoretical principles and practice, offering a new perspective on contrastive learning.

Paper Structure (39 sections, 2 theorems, 29 equations, 8 figures, 7 tables, 1 algorithm)

Introduction
Related Work
Self-supervised Paradigm
Contrastive Learning Theory
Method
Information Entropy in Contrastive Learning
Contrastive Learning Objectives
Mutual Information Theory
Incremental Entropy in Contrastive Learning
Maximizing Incremental Information Entropy
Sample Augmentation Incremental Block (SAIB)
KL regularisation to avoid degenerate $g_\phi$.
Overall objective.
Experiment & Result
Experimental Setup
...and 24 more sections

Key Result

Lemma 3.1

Let $Z = f_\theta(X)$ be the embedding of input $X$ and $Z^+$ the corresponding positive sample. Then, based on the Donsker--Varadhan representation, the mutual information satisfies
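For orientation, the two standard results this lemma builds on can be written as follows. This is textbook background under the usual assumptions (a critic function $T$ and a batch of $N$ samples, both symbols introduced here for illustration); the paper's exact statement may differ in notation or constants.

```latex
% Donsker--Varadhan lower bound, valid for any critic T(z, z^+)
I(Z; Z^+) \;\ge\; \mathbb{E}_{p(z, z^+)}\big[T(z, z^+)\big]
          - \log \mathbb{E}_{p(z)\,p(z^+)}\big[e^{T(z, z^+)}\big]

% InfoNCE bound with N samples per batch (one positive, N-1 negatives)
I(Z; Z^+) \;\ge\; \log N - \mathcal{L}_{\mathrm{InfoNCE}}
```

Under the second bound, lowering the InfoNCE loss raises a lower bound on $I(Z; Z^+)$, which is the sense in which loss minimization and mutual information maximization are usually linked.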

Figures (8)

Figure 1: Overview of the proposed IE-CL. We define incremental entropy as the absolute change in entropy induced by classical contrastive augmentations (see Definition 3.2). To optimize the contrastive learning process, we propose the Sample Augmentation Incremental Block (SAIB), a learnable module that keeps the local Jacobian determinant greater than 1. By incorporating sample-level incremental entropy into contrastive optimization, we establish a principled framework that improves the effectiveness of self-supervised representation learning.
Figure 2: Illustration of the data augmentation operators studied. The non-isometric transformation operator SAIB has learnable parameters, enabling non-prior augmentation for contrastive learning. Visualizing changes from 100 epochs (d) to 400 epochs (h) shows that KL divergence effectively constrains incremental entropy, preventing collapse.
Figure 3: Ablation on the relationship between SAIB and previous pretext-task augmentations. Images were resized to 224$\times$224 and the augmentation strength settings of Chen et al. (2020) were applied, followed by two-by-two tests with SAIB placed on each side of the contrastive framework.
Figure 4: SSL training loss curves on ImageNet-1K for the proposed incremental information entropy maximization (SAIB), with MoCo-v2 as the baseline.
Figure 5: Evolution of the incremental entropy $\Delta H(X)$ on the query side and of InfoNCE over training iterations.
...and 3 more figures
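
The link between a Jacobian determinant greater than 1 and an entropy gain, mentioned in the Figure 1 caption, can be checked numerically in the simplest case. The sketch below is not SAIB: it assumes a Gaussian input and a hypothetical linear map `A` standing in for the learnable transformation, for which the change-of-variables formula gives an entropy change of exactly $\log|\det A|$.

```python
import numpy as np

def gaussian_entropy(cov):
    """Differential entropy (in nats) of a zero-mean Gaussian with covariance `cov`."""
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2.0 * np.pi * np.e) + logdet)

rng = np.random.default_rng(0)
d = 4
B = rng.normal(size=(d, d))
cov_x = B @ B.T + np.eye(d)  # a valid (positive-definite) covariance for X

# Hypothetical linear "augmentation" g(x) = A x, rescaled so that |det A| = 2.
A = np.eye(d) + 0.3 * rng.normal(size=(d, d))
A *= (2.0 / abs(np.linalg.det(A))) ** (1.0 / d)

cov_gx = A @ cov_x @ A.T  # covariance of g(X)
delta_h = gaussian_entropy(cov_gx) - gaussian_entropy(cov_x)
log_abs_det = np.log(abs(np.linalg.det(A)))

print(f"entropy gain: {delta_h:.6f} nats, log|det A|: {log_abs_det:.6f}")
```

Because $|\det A| > 1$ here, the entropy gain is positive; a map with $|\det A| < 1$ would reduce entropy by the same rule.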

Theorems & Definitions (6)

Lemma 3.1: Equivalence between InfoNCE minimization and mutual information maximization
proof
Definition 3.2: Based on Shannon entropy, the change in the information entropy of a sample $X$ when a transformation $g$ maps it to $X'$ is called the Incremental Information Entropy
proof
Proposition 3.3: Principle of Constrained Incremental Entropy Maximization
proof : Theoretical Argument
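
The relation named in Lemma 3.1 can be illustrated with a small batch-level computation. The sketch below is a generic NumPy InfoNCE, not the authors' implementation; the function name `info_nce`, the temperature of 0.1, and the synthetic embeddings are all assumptions made for illustration. It evaluates the loss for well-aligned and for unrelated positives, together with the standard batch estimate $\log N - \mathcal{L}_{\mathrm{InfoNCE}}$ of a mutual information lower bound.

```python
import numpy as np

def info_nce(z, z_pos, temperature=0.1):
    """InfoNCE loss for a batch: row i of `z_pos` is the positive for row i of `z`;
    the remaining rows of `z_pos` act as negatives."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    z_pos = z_pos / np.linalg.norm(z_pos, axis=1, keepdims=True)
    logits = z @ z_pos.T / temperature                    # (N, N) cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
n, d = 256, 32
z = rng.normal(size=(n, d))
z_aligned = z + 0.05 * rng.normal(size=(n, d))  # positives close to their anchors
z_random = rng.normal(size=(n, d))              # "positives" unrelated to anchors

loss_aligned = info_nce(z, z_aligned)
loss_random = info_nce(z, z_random)

# Batch estimates of the lower bound on I(Z; Z^+), in nats; capped at log(n).
mi_bound_aligned = np.log(n) - loss_aligned
mi_bound_random = np.log(n) - loss_random

print(loss_aligned, loss_random, mi_bound_aligned, mi_bound_random)
```

The aligned pair gives the smaller loss and therefore the larger bound. Since the loss is non-negative, the estimate can never exceed $\log N$, which is one reason small batches limit what this bound can certify.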
