Elastic weight consolidation for better bias inoculation

James Thorne, Andreas Vlachos

TL;DR

Biases in sentence-pair classification datasets can cause models to rely on spurious cues. The paper proposes elastic weight consolidation (EWC) to inoculate models against these biases during fine-tuning while limiting catastrophic forgetting, using a Fisher-information-based penalty. On FEVER and MultiNLI, fine-tuning with EWC (FT+EWC) improves performance on the bias-mitigating data while preserving original-task accuracy, and it combines well with bias-modeling approaches such as PoE and DFL. The results show robust bias mitigation with limited trade-offs, suggesting broad applicability to debiasing in sentence-pair tasks.

Abstract

The biases present in training datasets have been shown to affect models for sentence pair classification tasks such as natural language inference (NLI) and fact verification. While fine-tuning models on additional data has been used to mitigate them, a common issue is that of catastrophic forgetting of the original training dataset. In this paper, we show that elastic weight consolidation (EWC) allows fine-tuning of models to mitigate biases while being less susceptible to catastrophic forgetting. In our evaluation on fact verification and NLI stress tests, we show that fine-tuning with EWC dominates standard fine-tuning, yielding models with lower levels of forgetting on the original (biased) dataset for equivalent gains in accuracy on the fine-tuning (unbiased) dataset.
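
The Fisher-information-based penalty mentioned in the TL;DR can be illustrated with a minimal sketch. It assumes the standard diagonal form of EWC from the continual-learning literature, in which the fine-tuning loss is augmented with (lambda / 2) * sum_i F_i * (theta_i - theta*_i)^2, where theta* holds the parameters learned on the original dataset and F_i is a diagonal Fisher estimate. The function names, the use of flat NumPy parameter vectors, and the empirical Fisher estimate from squared per-example gradients are assumptions made for illustration; the paper's exact formulation and hyperparameters may differ.

```python
import numpy as np


def diagonal_fisher(per_example_grads):
    """Empirical diagonal Fisher estimate (illustrative assumption).

    per_example_grads: array of shape (n_examples, n_params) holding the
    gradient of the log-likelihood for each original-task example.
    Returns the mean of the squared gradients, one value per parameter.
    """
    grads = np.asarray(per_example_grads, dtype=float)
    return np.mean(np.square(grads), axis=0)


def ewc_penalty(theta, theta_star, fisher_diag, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2."""
    diff = np.asarray(theta, dtype=float) - np.asarray(theta_star, dtype=float)
    return 0.5 * lam * float(np.sum(np.asarray(fisher_diag) * diff ** 2))


def ewc_loss(fine_tune_loss, theta, theta_star, fisher_diag, lam):
    """Fine-tuning loss plus the EWC penalty anchoring theta near theta_star."""
    return fine_tune_loss + ewc_penalty(theta, theta_star, fisher_diag, lam)


if __name__ == "__main__":
    # Toy numbers, not taken from the paper.
    fisher = diagonal_fisher([[1.0, 2.0], [3.0, 0.0]])
    theta_star = np.array([1.0, 2.0])   # parameters after original training
    theta = np.array([2.0, 2.0])        # parameters during fine-tuning
    print(ewc_loss(0.5, theta, theta_star, fisher, lam=2.0))
```

Parameters with a large Fisher value are expensive to move, so fine-tuning is steered toward changing weights that mattered less for the original dataset. The weight `lam` controls the trade-off between gains on the fine-tuning data and forgetting on the original data.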

Paper Structure

This paper contains 31 sections, 1 equation, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Hypothesis-only bias in FEVER contributes to low accuracy when testing against counterfactual evidence. This is mitigated by fine-tuning on counterfactual evidence. Catastrophic forgetting from fine-tuning is reduced when using elastic weight consolidation (EWC), preserving the original task accuracy.
  • Figure 2: Pareto frontiers of fine-tuning BERT and ESIM models, showing that FT+EWC dominates FT.
  • Figure 3: Training curves for fine-tuning MultiNLI models with stress-test data. Solid lines indicate challenge dataset accuracy. Dashed lines indicate MultiNLI accuracy.