ArbESC+: Arabic Enhanced Edit Selection System Combination for Grammatical Error Correction

Resolving conflict and improving system combination in Arabic GEC

Ahlam Alrehili, Areej Alhothali

TL;DR

This work tackles Arabic Grammatical Error Correction (GEC) by introducing ArbESC+, a two-stage system that fuses outputs from nine Arabic-focused models and text-editing systems. It casts system combination as edit-level binary classification, employing agreement boosting, dual-threshold filtering, and Non-Maximum Suppression on 1D spans to resolve conflicts and select high-quality edits. Empirical results on QALB-2014/2015 show state-of-the-art performance (e.g., $F_{0.5}$ improvements over strong single models and ensemble baselines), confirming the value of targeted conflict resolution in Arabic GEC. The approach demonstrates stability across datasets and suggests wider applicability to low-resource languages where data scarcity and complex morphology pose challenges.
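The selection stage summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Edit` fields, the linear boost formula, and the values of `boost`, `high`, `low`, and `min_votes` are all assumptions chosen for illustration, since this summary does not specify them. The sketch boosts classifier scores by cross-system agreement, filters with two thresholds, and then runs greedy Non-Maximum Suppression over 1D token spans so that no two accepted edits overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """A candidate correction (hypothetical structure, for illustration only)."""
    start: int        # index of the first source token replaced (inclusive)
    end: int          # index after the last source token replaced (exclusive)
    replacement: str  # replacement text; empty string means deletion
    score: float      # classifier probability that the edit is correct
    votes: int        # number of systems that proposed this exact edit


def overlaps(a: Edit, b: Edit) -> bool:
    """Return True when two edits touch the same region of the source."""
    if a.start == a.end and b.start == b.end:
        # Two insertions conflict only if they target the same position.
        return a.start == b.start
    # Half-open spans overlap when each starts before the other ends.
    return a.start < b.end and b.start < a.end


def select_edits(edits, n_systems, boost=0.1, high=0.7, low=0.4, min_votes=2):
    """Agreement boosting, dual-threshold filtering, then 1D NMS.

    All numeric defaults are assumed values, not taken from the paper.
    """
    candidates = []
    for e in edits:
        # Agreement boosting: raise the score in proportion to extra votes.
        agreement = (e.votes - 1) / max(1, n_systems - 1)
        boosted = min(1.0, e.score + boost * agreement)
        # Dual threshold: confident edits pass outright; borderline edits
        # pass only when enough systems agree on them.
        if boosted >= high or (boosted >= low and e.votes >= min_votes):
            candidates.append((boosted, e))

    # Greedy NMS: visit edits from highest to lowest boosted score and keep
    # an edit only if it does not overlap one that was already kept.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    kept = []
    for _, e in candidates:
        if not any(overlaps(e, k) for k in kept):
            kept.append(e)
    return sorted(kept, key=lambda e: (e.start, e.end))


def apply_edits(tokens, edits):
    """Apply non-overlapping edits right to left so indices stay valid."""
    out = list(tokens)
    for e in sorted(edits, key=lambda e: e.start, reverse=True):
        out[e.start:e.end] = e.replacement.split()
    return out
```

Calling `select_edits(candidates, n_systems=9)` and passing the result to `apply_edits(tokens, ...)` yields the corrected token list. Sorting by boosted score before suppression means that when two systems propose conflicting rewrites of the same span, the one with stronger classifier and agreement support wins.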

Abstract

Grammatical Error Correction (GEC) is an important task in natural language processing. Arabic has a complex morphological and syntactic structure, which makes the task more challenging than for other languages. Although modern neural models have improved greatly in recent years, most previous work relied on individual models and did not consider the potential benefits of combining different systems. In this paper, we present one of the first multi-system approaches to correcting grammatical errors in Arabic, the Arabic Enhanced Edit Selection System Combination (ArbESC+). Correction proposals are collected from several models and represented as numerical features within the framework. A classifier then uses these features to decide which corrections to apply. To improve output quality, the framework uses supporting techniques to filter overlapping corrections and estimate decision reliability. A combination of AraT5, ByT5, mT5, AraBART, AraBART+Morph+GEC, and text-editing systems outperformed any single model, with an $F_{0.5}$ of 82.63% on the QALB-14 test data, 84.64% on QALB-15 L1 data, and 65.55% on QALB-15 L2 data. One of the most significant contributions of this work is that it is the first attempt to integrate error-correction systems for Arabic. By improving on existing models, it provides a practical step towards developing advanced tools that will benefit users and researchers of Arabic text processing.
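For reference, the $F_{0.5}$ scores quoted above follow the standard $F_\beta$ definition with $\beta = 0.5$, which weights precision more heavily than recall. The symbols $P$ and $R$ (precision and recall of the proposed corrections) are introduced here for illustration and do not appear in the abstract:

$$
F_\beta = \frac{(1 + \beta^2)\, P\, R}{\beta^2 P + R},
\qquad
F_{0.5} = \frac{1.25\, P\, R}{0.25\, P + R}
$$

Because precision counts more than recall under this metric, a combiner that rejects doubtful edits can raise $F_{0.5}$ even if it lets some errors go uncorrected.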

Paper Structure

This paper contains 42 sections, 7 equations, 9 figures, 14 tables, 1 algorithm.

Figures (9)

  • Figure 1: Macro average $F_{0.5}$ across all QALB test sets. Labels are rotated to avoid overlap; the overall mean is shown as a dashed line with a legend.
  • Figure 2: An illustration of the general sequence of the ArbESC+ system combination, starting with the original sentence and hypotheses. The process then involves aligning, extracting modifications, evaluating them through the trained model, and applying the accepted modifications to generate the final corrected sentence.
  • Figure 3: Best vs. ArbESC+ by ARETA class (QALB-2014).
  • Figure 4: Best vs. ArbESC+ by ARETA class (QALB-2015 L1).
  • Figure 5: Best vs. ArbESC+ by ARETA class (QALB-2015 L2).
  • ...and 4 more figures