Learning from Mistakes: Self-correct Adversarial Training for Chinese Unnatural Text Correction

Xuan Feng, Tianlong Gu, Xiaoli Liu, Liang Chang

TL;DR

This work tackles the exposure bias and robustness challenges in unnatural text correction (UTC) by introducing LIMIT, a self-correct adversarial training framework. LIMIT combines a generative correction mechanism with self-generated adversarial examples and a decoding intervention that preserves semantic consistency, enabling robust correction of multiple error types across Chinese and English data. Empirical results show LIMIT achieving state-of-the-art or near-state-of-the-art performance on Chinese UTC, Chinese and English natural language understanding (NLU), and Chinese natural language generation (NLG) tasks, with strong transferability to new models and datasets and improved resistance to adversarial perturbations. The method offers a practical, plug-in defense for diverse NLP tasks, improving reliability and trust in real-world text processing systems.

Abstract

Unnatural text correction aims to automatically detect and correct spelling errors or adversarial perturbation errors in sentences. Existing methods typically rely on fine-tuning or adversarial training to correct errors, and they have achieved significant success. However, these methods generalize poorly because the data distribution seen during training differs from that of real-world scenarios, a problem known as exposure bias. In this paper, we propose LearnIng from MIsTakes (LIMIT), a self-correct adversarial training framework that is task- and model-independent and corrects unnatural errors or mistakes. Specifically, we fully utilize errors generated by the model that are actively exposed during the inference phase, i.e., predictions that are inconsistent with the target. This training method not only simulates potential errors in real application scenarios, but also mitigates the exposure bias of the traditional training process. Meanwhile, we design a novel decoding intervention strategy to maintain semantic consistency. Extensive experimental results on Chinese unnatural text error correction datasets show that our proposed method can correct multiple forms of errors and outperforms state-of-the-art text correction methods. In addition, extensive results on Chinese and English datasets validate that LIMIT can serve as a plug-and-play defense module and can extend to new models and datasets without further training.
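
The idea of reusing predictions that are inconsistent with the target can be illustrated with a minimal sketch. This is not the authors' implementation: the function `collect_self_mistakes`, the stand-in model `toy_corrector`, and the choice to pair each wrong prediction with its target as a new training example are all assumptions made for illustration, and English toy sentences are used only for readability.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (noisy source sentence, clean target sentence)


def collect_self_mistakes(
    correct: Callable[[str], str],
    pairs: List[Pair],
) -> List[Pair]:
    """Return (wrong prediction, target) pairs for every miss of `correct`."""
    mistakes = []
    for source, target in pairs:
        prediction = correct(source)
        if prediction != target:
            # Assumption: the model's own wrong output is reused as a new
            # noisy input paired with the original clean target.
            mistakes.append((prediction, target))
    return mistakes


def toy_corrector(sentence: str) -> str:
    """Hypothetical stand-in for a trained corrector; fixes a single typo."""
    return sentence.replace("teh", "the")


train_pairs = [
    ("teh cat sat", "the cat sat"),
    ("teh dog rann", "the dog ran"),
]

# Self-generated examples are appended to the original training data.
self_mistakes = collect_self_mistakes(toy_corrector, train_pairs)
augmented = train_pairs + self_mistakes
```

In this toy run the stand-in corrector repairs the first sentence but leaves "rann" in the second, so only the second prediction is kept as a self-generated example. A real setup would rerun this collection step as the model changes during training.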

Paper Structure

This paper contains 37 sections, 12 equations, 3 figures, 10 tables.

Figures (3)

  • Figure 1: Examples of various unnatural text error types. Red characters are erroneous; blue characters are correct. For easier understanding, pinyin errors in Chinese are represented by phonetic symbols in English (two $\to$ tu:).
  • Figure 2: The overall correction process of LIMIT. For easier understanding, pinyin errors in Chinese are represented by phonetic symbols in English.
  • Figure 3: Relationship between $\alpha$ and BLEU under different training losses on the Chinese unnatural text correction dataset (Hybrid).