Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study
Tao Ge, Furu Wei, Ming Zhou
TL;DR
This study tackles data scarcity and single-pass limitations in grammatical error correction by introducing fluency boost learning and fluency boost inference. It presents back-boost, self-boost, and dual-boost strategies to augment training with diverse, fluent–disfluent sentence pairs, and extends to large-scale native data. It further proposes multi-round and round-way inference using left-to-right and right-to-left decoders to incrementally improve fluency and recall. The resulting convolutional seq2seq GEC system achieves state-of-the-art results and reaches human-level performance on CoNLL-2014 and JFLEG, demonstrating the practical impact of data-augmented training and iterative editing for GEC.
Abstract
Neural sequence-to-sequence (seq2seq) approaches have proven to be successful in grammatical error correction (GEC). Based on the seq2seq framework, we propose a novel fluency boost learning and inference mechanism. Fluency boosting learning generates diverse error-corrected sentence pairs during training, enabling the error correction model to learn how to improve a sentence's fluency from more instances, while fluency boosting inference allows the model to correct a sentence incrementally with multiple inference steps. Combining fluency boost learning and inference with convolutional seq2seq models, our approach achieves the state-of-the-art performance: 75.72 (F_{0.5}) on CoNLL-2014 10 annotation dataset and 62.42 (GLEU) on JFLEG test set respectively, becoming the first GEC system that reaches human-level performance (72.58 for CoNLL and 62.37 for JFLEG) on both of the benchmarks.
