Integration of LLM Quality Assurance into an NLG System
Ching-Yi Chen, Johanna Heininger, Adela Schneider, Christian Eckard, Andreas Madsack, Robert Weißgraeber
TL;DR
This work addresses scalable quality assurance for NLG outputs by incorporating a large language model to detect and correct grammar and spelling errors while tracing error sources to the underlying rule-based generation. It implements a human-in-the-loop framework where the LLM suggests edits and a human editor approves or rejects them, with corrections potentially informing future text generation. The evaluation on multilingual basketball reports (English→French, German, Spanish) shows high precision and strong suggestion quality but language-dependent recall and authenticity challenges, underscoring the need for language-specific prompts and cautious reliance on automated revisions. Overall, the paper demonstrates the practicality of integrating LLM-driven QA into NLG pipelines and outlines concrete directions for improving robustness and expanding QA dimensions.
Abstract
In this paper, we present a system that uses a Large Language Model (LLM) to perform grammar and spelling correction as a component of Quality Assurance (QA) for texts generated by natural language generation (NLG) systems, a step that is important for text production in real-world scenarios. Evaluating the system on work-in-progress sports news texts in three languages, we show that it is able to deliver acceptable corrections.
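The human-in-the-loop workflow summarized above (the LLM proposes corrections, a human editor approves or rejects each one, and accepted corrections point back to the generation rule that produced the error) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Suggestion` structure, the `rule_id` field, the `review` and `rules_to_fix` functions, and the example sentence are all hypothetical, and the LLM call is replaced by a hard-coded list of suggestions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Suggestion:
    """One proposed correction (hypothetical structure, not from the paper)."""
    original: str   # erroneous span as it appears in the generated text
    corrected: str  # replacement proposed by the LLM
    rule_id: str    # assumed identifier of the rule that generated the span


def review(
    text: str,
    suggestions: Iterable[Suggestion],
    approve: Callable[[Suggestion], bool],
) -> Tuple[str, List[Suggestion], List[Suggestion]]:
    """Apply only the suggestions that the human editor approves.

    Suggestions whose span is not found in the text are treated as rejected.
    """
    accepted: List[Suggestion] = []
    rejected: List[Suggestion] = []
    for s in suggestions:
        if s.original in text and approve(s):
            # Replace the first occurrence of the erroneous span only.
            text = text.replace(s.original, s.corrected, 1)
            accepted.append(s)
        else:
            rejected.append(s)
    return text, accepted, rejected


def rules_to_fix(accepted: Iterable[Suggestion]) -> List[str]:
    """Collect the rule identifiers behind accepted corrections."""
    return sorted({s.rule_id for s in accepted})


# Stand-in for LLM output: in a real system these would come from a model.
draft = "The Lakers has won there third game."
proposed = [
    Suggestion("has won", "have won", "verb_agreement"),
    Suggestion("there third", "their third", "possessive"),
    Suggestion("Lakers", "Laker", "team_name"),
]

# Stand-in for the human editor: rejects the change to the team name.
def editor(s: Suggestion) -> bool:
    return s.rule_id != "team_name"

final_text, accepted, rejected = review(draft, proposed, editor)
```

Keeping the approval decision as a separate callable reflects the idea that the LLM only suggests edits; the list returned by `rules_to_fix` is one possible way to trace accepted corrections back to the rule-based generation.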
