Alirector: Alignment-Enhanced Chinese Grammatical Error Corrector

Haihui Yang, Xiaojun Quan

TL;DR

This work tackles overcorrection in Chinese grammatical error correction by introducing Alirector, an alignment-enhanced corrector that combines an initial correction with bidirectional alignment and distills alignment knowledge back into the correction model. The approach applies to both Seq2Seq and decoder-only LLMs, using forward and reverse alignment to mitigate overcorrection and a KL-divergence-based distillation to transfer guidance. Across three CGEC datasets, Alirector achieves consistent improvements in F0.5, with notable precision gains and reduced overcorrection, though decoder-only LLMs still lag behind strong Seq2Seq baselines. The method offers a practical, architecture-agnostic solution to controllability in GEC and highlights the importance of alignment in enhancing robustness and fidelity of automated corrections.
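The F0.5 metric mentioned above is the standard F-beta score with beta = 0.5, which weights precision more heavily than recall. That is why reducing overcorrection (fewer unnecessary edits, hence higher precision) tends to show up clearly in this metric. A minimal sketch of the formula follows; the precision and recall values are hypothetical and chosen only to illustrate the asymmetry, not taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 weights precision more than recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical values (not results from the paper): swapping precision and
# recall changes F0.5, and the high-precision system scores higher.
high_precision = f_beta(0.60, 0.40)
high_recall = f_beta(0.40, 0.60)
print(high_precision > high_recall)
```

With beta = 1 the same function reduces to the familiar harmonic mean of precision and recall.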

Abstract

Chinese grammatical error correction (CGEC) faces serious overcorrection challenges when employing autoregressive generative models such as sequence-to-sequence (Seq2Seq) models and decoder-only large language models (LLMs). While previous methods aim to address overcorrection in Seq2Seq models, they are difficult to adapt to decoder-only LLMs. In this paper, we propose an alignment-enhanced corrector for the overcorrection problem that applies to both Seq2Seq models and decoder-only LLMs. Our method first trains a correction model to generate an initial correction of the source sentence. Then, we combine the source sentence with the initial correction and feed the pair through an alignment model for another round of correction, so that the alignment model focuses on potential overcorrection. Moreover, to enhance the model's ability to identify nuances, we further explore the reverse alignment of the source sentence and the initial correction. Finally, we transfer the alignment knowledge from the two alignment models to the correction model, instructing it on how to avoid overcorrection. Experimental results on three CGEC datasets demonstrate the effectiveness of our approach in alleviating overcorrection and improving overall performance. Our code has been made publicly available.
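The final step, transferring knowledge from the two alignment models to the correction model, can be illustrated with a small KL-divergence sketch for a single output token. This is not the authors' implementation: the function names, the equal weighting of the two teachers, and the direction KL(teacher || student) are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one token's vocabulary logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def distillation_loss(
    student_logits: Sequence[float],
    forward_teacher_logits: Sequence[float],
    reverse_teacher_logits: Sequence[float],
) -> float:
    """Average KL from each alignment teacher to the correction student.

    Assumption: both teachers are weighted equally (0.5 each).
    """
    q = softmax(student_logits)
    p_fwd = softmax(forward_teacher_logits)
    p_rev = softmax(reverse_teacher_logits)
    return 0.5 * (kl_divergence(p_fwd, q) + kl_divergence(p_rev, q))


# Toy logits over a 3-token vocabulary (hypothetical values).
loss = distillation_loss([0.1, 0.2, 0.3], [2.0, 0.5, 0.1], [1.5, 0.7, 0.2])
print(loss >= 0.0)
```

In practice such a term would be summed over all output positions and combined with the usual cross-entropy loss; how the two are balanced is a design choice this sketch leaves open.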

Paper Structure (42 sections, 9 equations, 6 figures, 10 tables)

Figures (6)

  • Figure 1: An illustration of addressing overcorrection through alignment of the source sentence and the initial correction. Overcorrected characters and their error-free counterparts are highlighted in red and orange, respectively. Correct edits are highlighted in blue.
  • Figure 2: Preliminary results of predict-and-align on the NaCGEC [ma-etal-2022-linguistic] and FCGEC [xu-etal-2022-fcgec] datasets with the Baichuan2-7B model [baichuan2].
  • Figure 3: An overview of our proposed framework, which comprises three main steps. First, we train a correction model to generate an initial correction of the source sentence. Second, we perform bidirectional alignment by combining the source sentence with the initial correction forward and backward respectively, and passing each combination through an alignment model for another round of correction. Third, we employ knowledge distillation to transfer the knowledge from the two alignment models to the correction model.
  • Figure 4: Preliminary results of predict-and-align on NaCGEC and FCGEC datasets with Transformer and BART.
  • Figure 5: Results of precision for different error types, including missing (M), redundant (R), substitution (S), and word-order (W), on the FCGEC-Dev test set.
  • ...and 1 more figure
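The bidirectional alignment step described in the Figure 3 caption can be sketched as building two inputs from the same pair of sentences, one per alignment model. The helper name `build_alignment_inputs`, the `[SEP]` separator, and the exact ordering within each direction are assumptions for illustration; the paper's actual input template may differ.

```python
from typing import Tuple

# Hypothetical separator; the real template is not specified in this summary.
SEP = " [SEP] "


def build_alignment_inputs(
    source: str, initial_correction: str, sep: str = SEP
) -> Tuple[str, str]:
    """Return (forward, reverse) inputs for the two alignment models.

    Assumed ordering: forward places the source first, reverse places the
    initial correction first.
    """
    forward = source + sep + initial_correction
    reverse = initial_correction + sep + source
    return forward, reverse


# Placeholder strings stand in for a source sentence and its initial correction.
forward_input, reverse_input = build_alignment_inputs("X", "Y")
print(forward_input)
print(reverse_input)
```

Each alignment model would then generate its own correction from its input, and those outputs serve as the teacher signals for the distillation step.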