Non-negative Contrastive Learning
Yifei Wang, Qi Zhang, Yaoyu Guo, Yisen Wang
TL;DR
Non-negative Contrastive Learning (NCL) imposes non-negativity on contrastive features to yield interpretable, sparse, and disentangled representations. The authors show that NCL is theoretically equivalent to a non-negative matrix factorization (NMF) objective and provide identifiability and downstream-generalization guarantees. The constraint is implemented by a simple reparameterization (e.g., ReLU) that preserves contrastive learning (CL) performance. Empirically, NCL improves feature disentanglement, enables effective feature selection, and enhances downstream classification, including out-of-distribution robustness. It also extends naturally to supervised and multi-modal settings via Non-negative Cross Entropy (NCE) and MMNCL. Overall, NCL offers a principled, scalable path to interpretable representations across self-supervised, supervised, and multi-modal learning.
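The ReLU reparameterization mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the random arrays stand in for encoder outputs, and the InfoNCE loss, the temperature, and the names `non_negative` and `info_nce` are assumptions chosen for illustration only.

```python
import numpy as np


def non_negative(raw_features):
    """Reparameterize encoder outputs with ReLU so every feature is >= 0."""
    return np.maximum(raw_features, 0.0)


def info_nce(z1, z2, temperature=0.5, eps=1e-8):
    """InfoNCE loss between two views (assumed loss; row i of z1 pairs with row i of z2)."""
    # L2-normalize; eps guards against an all-zero row after ReLU.
    z1 = z1 / (np.linalg.norm(z1, axis=1, keepdims=True) + eps)
    z2 = z2 / (np.linalg.norm(z2, axis=1, keepdims=True) + eps)
    logits = z1 @ z2.T / temperature
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


# Hypothetical stand-ins for encoder outputs of two augmented views.
rng = np.random.default_rng(0)
raw_view1 = rng.normal(size=(8, 16))
raw_view2 = raw_view1 + 0.1 * rng.normal(size=(8, 16))

feat1 = non_negative(raw_view1)
feat2 = non_negative(raw_view2)

loss = info_nce(feat1, feat2)
sparsity = float(np.mean(feat1 == 0.0))  # fraction of exactly-zero activations
print(f"loss={loss:.4f}, sparsity={sparsity:.2f}")
```

Only the final `non_negative` call differs from a standard contrastive pipeline in this sketch, which is why the reparameterization is cheap to add to an existing setup.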
Abstract
Deep representations have shown promising performance when transferred to downstream tasks in a black-box manner. Yet, their inherent lack of interpretability remains a significant challenge, as these features are often opaque to human understanding. In this paper, we propose Non-negative Contrastive Learning (NCL), a renaissance of Non-negative Matrix Factorization (NMF) aimed at deriving interpretable features. The power of NCL lies in its enforcement of non-negativity constraints on features, reminiscent of NMF's capability to extract features that align closely with sample clusters. NCL not only aligns mathematically with an NMF objective but also preserves NMF's interpretability attributes, resulting in a sparser and more disentangled representation than standard contrastive learning (CL). Theoretically, we establish guarantees on the identifiability and downstream generalization of NCL. Empirically, we show that these advantages enable NCL to outperform CL significantly on feature disentanglement, feature selection, and downstream classification tasks. Finally, we show that NCL can be easily extended to other learning scenarios and can benefit supervised learning as well. Code is available at https://github.com/PKU-ML/non_neg.
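The alignment with an NMF objective can be sketched as follows. This is a hedged reconstruction based on the spectral contrastive loss, a common choice for this kind of analysis. The notation ($f_+$, $\bar{A}$, $F_+$, $\mathcal{P}$) is assumed here and may differ from the paper's.

```latex
% Spectral contrastive loss on non-negative features f_+(x) = \mathrm{ReLU}(f(x)) \ge 0:
\mathcal{L}(f_+) = -2\,\mathbb{E}_{x,x^+}\!\left[f_+(x)^\top f_+(x^+)\right]
                 + \mathbb{E}_{x,x'}\!\left[\left(f_+(x)^\top f_+(x')\right)^2\right]

% With \bar{A}_{x,x'} = \frac{\mathcal{P}(x,x')}{\sqrt{\mathcal{P}(x)\,\mathcal{P}(x')}}
% (normalized co-occurrence of positive pairs) and rows (F_+)_x = \sqrt{\mathcal{P}(x)}\, f_+(x):
\mathcal{L}(f_+) = \left\lVert \bar{A} - F_+ F_+^\top \right\rVert_F^2 + \text{const},
\qquad F_+ \ge 0
```

Under these assumptions, minimizing the contrastive loss over non-negative features is a symmetric NMF of $\bar{A}$; without the constraint $F_+ \ge 0$ it reduces to an ordinary low-rank matrix factorization.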
