Momentum Contrastive Learning with Enhanced Negative Sampling and Hard Negative Filtering

Duy Hoang, Huy Ngo, Khoi Pham, Tri Nguyen, Gia Bao, Huy Phan

TL;DR

The paper tackles noisy negatives and underutilization of key-view embeddings in momentum-contrastive learning. It introduces a dual-view loss that balances optimization of query and key embeddings and a cosine-similarity-based hard negative filtering strategy to prioritize informative negatives. The extended loss formulations and filtering lead to improved downstream performance on CIFAR-10/100 with lower memory requirements, demonstrating robust, scalable unsupervised representations. This approach enhances the practicality of contrastive learning for cross-domain applications in computer vision and natural language processing.

Abstract

Contrastive learning has become pivotal in unsupervised representation learning, with frameworks like Momentum Contrast (MoCo) effectively utilizing large negative sample sets to extract discriminative features. However, traditional approaches often overlook the full potential of key embeddings and are susceptible to performance degradation from noisy negative samples in the memory bank. This study addresses these challenges by proposing an enhanced contrastive learning framework that incorporates two key innovations. First, we introduce a dual-view loss function, which ensures balanced optimization of both query and key embeddings, improving representation quality. Second, we develop a selective negative sampling strategy that emphasizes the most challenging negatives based on cosine similarity, mitigating the impact of noise and enhancing feature discrimination. Extensive experiments demonstrate that our framework achieves superior performance on downstream tasks, delivering robust and well-structured representations. These results highlight the potential of optimized contrastive mechanisms to advance unsupervised learning and extend its applicability across domains such as computer vision and natural language processing.
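The dual-view idea described in the abstract can be illustrated with a small sketch. The paper's exact loss is not reproduced in this section, so the code below assumes one plausible form: a standard InfoNCE term computed once with the query as anchor and once with the key as anchor, then averaged. The function names (`info_nce`, `dual_view_loss`), the equal 0.5 weighting, and the temperature of 0.07 are illustrative assumptions, not the authors' implementation.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchor, positive, negatives, temperature=0.07):
    """InfoNCE loss for one anchor, one positive, and a list of negatives.

    All inputs are assumed to be unit-normalized already.
    """
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over the positive and negative logits.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]


def dual_view_loss(q, k, negatives, temperature=0.07):
    """Assumed symmetric form: average of the query-anchored and
    key-anchored InfoNCE terms (the 0.5 weighting is hypothetical)."""
    q, k = normalize(q), normalize(k)
    negatives = [normalize(n) for n in negatives]
    loss_q = info_nce(q, k, negatives, temperature)  # query view as anchor
    loss_k = info_nce(k, q, negatives, temperature)  # key view as anchor
    return 0.5 * (loss_q + loss_k)
```

In this sketch both views contribute to the objective, whereas the original MoCo loss uses only the query-anchored term. In a real training loop the key encoder is updated by momentum rather than by gradients, so how the key-anchored term is back-propagated is a design decision this sketch does not address.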

Paper Structure

This paper contains 18 sections, 5 equations, 2 figures, 3 tables, 1 algorithm.

Figures (2)

  • Figure 1: Illustration of the enhanced negative sampling mechanism in MoCo. The query image ($x_{query}$) and the key image ($x_{key}$) are processed through the query encoder and momentum encoder, respectively, to generate feature embeddings $q$ and $k$. Negative samples are selected from the memory bank based on cosine distance, with the farthest samples prioritized for the key view. This approach balances both query and key views in the contrastive loss computation.
  • Figure 2: Impact of Symmetry Adjustments on Model Performance: Accuracy and Loss Curves Across Trials
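The selection step described in the Figure 1 caption can be sketched as a ranking of memory-bank entries by cosine similarity to an anchor embedding. The abstract speaks of emphasizing the most challenging negatives, while the caption mentions prioritizing the farthest samples for the key view, so the direction of the ranking is left as a parameter here. The function name `select_negatives`, the `top_k` cutoff, and the `most_similar` flag are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_negatives(anchor, memory_bank, top_k, most_similar=True):
    """Return indices of top_k memory-bank entries ranked by cosine similarity.

    most_similar=True keeps the entries closest to the anchor (hard negatives);
    most_similar=False keeps the farthest entries instead.
    """
    scored = [(cosine_similarity(anchor, n), i) for i, n in enumerate(memory_bank)]
    scored.sort(key=lambda pair: pair[0], reverse=most_similar)
    return [index for _, index in scored[:top_k]]


# Toy memory bank of 2-D embeddings (illustrative values only).
bank = [[1.0, 0.1], [0.0, 1.0], [-1.0, 0.0], [0.9, 0.5]]
anchor = [1.0, 0.0]
hard = select_negatives(anchor, bank, top_k=2)                      # [0, 3]
far = select_negatives(anchor, bank, top_k=2, most_similar=False)   # [2, 1]
```

Only the selected indices would then feed the contrastive loss, which is one way a filtered subset could reduce the number of negatives held per step; whether the paper filters exactly this way is not stated in this section.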