PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning

Xiaoqi Qiu, Yongjie Wang, Xu Guo, Zhiwei Zeng, Yue Yu, Yuhong Feng, Chunyan Miao

TL;DR

This work tackles the tendency of models trained on counterfactually augmented data (CAD) to overfit to edited features. It introduces PairCFR, a training framework that pairs original samples with their counterfactuals and optimizes a joint loss combining cross-entropy with contrastive learning (CL). The theoretical analysis and gradient insights show that the contrastive loss encourages the model to use a broader set of features beyond the edited components, which improves out-of-distribution (OOD) generalization. Empirical results on two human-edited CAD datasets, for sentiment analysis (SA) and natural language inference (NLI), show that PairCFR achieves superior OOD performance across multiple backbone models and remains effective in few-shot scenarios; ablations highlight the importance of the pairing strategy and of CL. The method offers a principled way to harness CAD for robust generalization while mitigating overreliance on counterfactual edits, with practical implications for building more reliable NLP systems.

Abstract

Counterfactually Augmented Data (CAD) involves creating new data samples by applying minimal yet sufficient modifications to flip the label of existing data samples to other classes. Training with CAD enhances model robustness against spurious features that happen to correlate with labels by spreading the causal relationships across different classes. Yet, recent research reveals that training with CAD may lead models to overly focus on modified features while ignoring other important contextual information, inadvertently introducing biases that may impair performance on out-of-distribution (OOD) datasets. To mitigate this issue, we employ contrastive learning to promote global feature alignment in addition to learning counterfactual clues. We theoretically prove that contrastive loss can encourage models to leverage a broader range of features beyond those modified ones. Comprehensive experiments on two human-edited CAD datasets demonstrate that our proposed method outperforms the state-of-the-art on OOD datasets.
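
To make the idea of a joint objective concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a supervised-contrastive-style loss over cosine similarities and a convex combination `lam * CL + (1 - lam) * CE`; the exact loss form, the function names, and the toy batch are illustrative assumptions. The names `lam` and `tau` only mirror the hyperparameters $\lambda$ and $\tau$ that the paper tunes, and their precise roles here are assumed.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy, using a stable log-sum-exp."""
    total = 0.0
    for row, y in zip(logits, labels):
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[y]
    return total / len(labels)

def sup_contrastive(embeddings, labels, tau):
    """Supervised-contrastive-style loss (an assumed stand-in for the paper's CL term).

    Same-label samples in the batch act as positives; all other samples,
    including the paired counterfactual, appear in the denominator.
    """
    n = len(labels)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        sims = {j: cosine(embeddings[i], embeddings[j]) / tau
                for j in range(n) if j != i}
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

def joint_loss(logits, embeddings, labels, lam=0.5, tau=0.1):
    """Assumed combination: lam weights the contrastive term, (1 - lam) the CE term."""
    cl = sup_contrastive(embeddings, labels, tau)
    ce = cross_entropy(logits, labels)
    return lam * cl + (1.0 - lam) * ce

# Toy batch: each original is kept next to its counterfactual (hypothetical values).
# Order: original_0, counterfactual_0, original_1, counterfactual_1.
labels = [1, 0, 1, 0]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0]]
logits = [[0.2, 1.5], [1.1, 0.3], [0.1, 0.9], [2.0, 0.4]]
loss = joint_loss(logits, embeddings, labels, lam=0.5, tau=0.1)
```

Keeping each original and its counterfactual in the same batch means the contrastive term always sees the pair as a negative for one another, which is one plausible reading of why the pairing strategy matters in the ablations.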

Paper Structure (29 sections, 14 equations, 5 figures, 8 tables)

This paper contains 29 sections, 14 equations, 5 figures, 8 tables.

Figures (5)

  • Figure 1: The overall learning framework.
  • Figure 2: Few-shot learning results of BERT$_\mathbf{base}$ on NLI. The $x$-axis shows the number of training samples and the $y$-axis shows the average accuracy and standard deviation on the in-distribution (ID) and OOD datasets.
  • Figure 3: Test results for fine-tuning BERT$_\mathbf{base}$ on IMDb augmented data (left) and T5$_\mathbf{base}$ on SNLI augmented data (right) with respect to the batch size.
  • Figure 4: The ID and OOD performance of the BERT$_\mathbf{base}$ models trained on full CAD for IMDb and SNLI tasks. Grey areas indicate the best hyperparameter settings for $\lambda$ or $\tau$.
  • Figure 5: Few-shot learning results of BERT$_\mathbf{base}$ on SA. The $x$-axis shows the number of training samples and the $y$-axis shows the average accuracy and standard deviation on the in-distribution (ID) and OOD datasets.