Dissecting Representation Misalignment in Contrastive Learning via Influence Function
Lijie Hu, Chenyang Ren, Huanyi Xie, Khouloud Saadi, Shu Yang, Zhen Tan, Jingfeng Zhang, Di Wang
TL;DR
This work tackles robustness and hallucination issues in contrastive learning arising from misaligned web-sourced multimodal data by introducing ECIF, the Extended Influence Function for Contrastive Loss. ECIF provides a closed-form, retraining-free data-valuation framework that accounts for both positive and negative samples, enabling precise data attribution in CLIP-style embeddings. Building on ECIF, the authors develop algorithms for misalignment detection, misprediction trace-back, and data cleaning, with empirical results showing substantial runtime savings while preserving or improving accuracy. The approach enhances dataset transparency and offers practical tools for targeted fine-tuning and robust data curation in large-scale multimodal models.
Abstract
Contrastive learning, commonly applied in large-scale multimodal models, often relies on data from diverse and unreliable sources, which can include misaligned or mislabeled text-image pairs. This frequently leads to robustness issues and hallucinations, ultimately causing performance degradation. Data valuation is an effective way to detect and trace these misalignments. Nevertheless, existing methods are computationally expensive for large-scale models. Classical influence functions, although computationally efficient, are inadequate for contrastive learning models because they were originally designed for pointwise losses. Furthermore, contrastive learning minimizes the distance between the modalities of positive samples while maximizing the distance between the modalities of negative samples, so the influence of a sample must be evaluated from both perspectives. To tackle these challenges, we introduce the Extended Influence Function for Contrastive Loss (ECIF), an influence function crafted for contrastive loss. ECIF considers both positive and negative samples and provides a closed-form approximation of contrastive learning models, eliminating the need for retraining. Building upon ECIF, we develop a series of algorithms for data evaluation, misalignment detection, and misprediction trace-back tasks. Experimental results demonstrate that ECIF advances the transparency and interpretability of CLIP-style embedding models by offering a more accurate assessment of data impact and model alignment than traditional baseline methods.
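To make the positive/negative distinction concrete, the following is a minimal sketch of a CLIP-style symmetric contrastive loss in plain Python. It is not the authors' implementation and it does not compute ECIF; the function names, the toy embeddings, and the temperature of 0.07 are assumptions chosen for illustration. It only shows why a pointwise view is insufficient: each pair's loss depends on the other pairs in the batch, because they serve as its negatives.

```python
import math


def _normalize(vector):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def _neg_log_softmax(scores, target):
    """Cross-entropy of `scores` against index `target` (numerically stable)."""
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_z - scores[target]


def clip_style_loss(images, texts, temperature=0.07):
    """Return the per-pair symmetric contrastive loss for one batch.

    Pair i is the positive for itself; every other pair in the batch
    acts as a negative in both the image-to-text and text-to-image terms.
    """
    images = [_normalize(v) for v in images]
    texts = [_normalize(v) for v in texts]
    n = len(images)
    logits = [
        [sum(a * b for a, b in zip(images[i], texts[j])) / temperature
         for j in range(n)]
        for i in range(n)
    ]
    losses = []
    for i in range(n):
        image_to_text = _neg_log_softmax(logits[i], i)
        text_to_image = _neg_log_softmax([logits[k][i] for k in range(n)], i)
        losses.append(0.5 * (image_to_text + text_to_image))
    return losses


# Hypothetical toy batch: three well-aligned image/text embedding pairs.
images = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
texts = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
clean = clip_style_loss(images, texts)

# Misalign pair 0 by giving it pair 1's caption.
corrupted_texts = [texts[1], texts[1], texts[2]]
corrupted = clip_style_loss(images, corrupted_texts)

print("clean:    ", clean)
print("corrupted:", corrupted)
```

In this toy batch, corrupting the caption of pair 0 raises the loss of pair 0 (its positive no longer matches) and also the loss of pair 1 (it gains a hard negative). This coupling across samples is the effect that an influence function for contrastive loss has to account for.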
