A Survey on Natural Language Counterfactual Generation

Yongjie Wang, Xiaoqi Qiu, Yu Yue, Xu Guo, Zhiwei Zeng, Yuhong Feng, Zhiqi Shen

TL;DR

This survey proposes a new taxonomy that systematically categorizes counterfactual generation methods into four groups, summarizes the metrics for evaluating generation quality, discusses ongoing research challenges, and outlines promising directions for future work.

Abstract

Natural language counterfactual generation aims to minimally modify a given text such that the modified text will be classified into a different class. The generated counterfactuals provide insight into the reasoning behind a model's predictions by highlighting which words significantly influence the outcomes. Additionally, they can be used to detect model fairness issues and augment the training data to enhance the model's robustness. A substantial amount of research has been conducted to generate counterfactuals for various NLP tasks, employing different models and methodologies. With the rapid growth of studies in this field, a systematic review is crucial to guide future researchers and developers. To bridge this gap, this survey provides a comprehensive overview of textual counterfactual generation methods, particularly those based on Large Language Models. We propose a new taxonomy that systematically categorizes the generation methods into four groups and summarizes the metrics for evaluating the generation quality. Finally, we discuss ongoing research challenges and outline promising directions for future work.

Paper Structure (20 sections, 6 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Use cases of counterfactual generation in various NLP tasks.
  • Figure 2: Demonstration of the Identify-and-then-Generate CFE generation.
  • Figure 3: Proportion of papers in each task among all collected papers. The term 'CLASS' refers to papers applicable to general text classification tasks, including SA and NLI.
  • Figure 4: The complete taxonomy proposed for existing literature on natural language counterfactual generation.