"When Data is Scarce, Prompt Smarter"... Approaches to Grammatical Error Correction in Low-Resource Settings
Somsubhra De, Harsh Kumar, Arun Prakash A
TL;DR
Framing GEC for low-resource Indic languages as a sequence-to-sequence task, the study compares zero-shot and few-shot prompting of state-of-the-art LLMs (GPT-4.1, Gemini-2.5, LLaMA-4) against a LoRA-fine-tuned Sarvam-M 24B Hindi model on the five-language Indic-GEC dataset from Bhasha. Prompting-based approaches generally outperform the fine-tuned baseline, achieving leading GLEU scores in Tamil and Hindi and competitive results in Malayalam, Bengali, and Telugu, while tokenizer choices significantly affect evaluation reliability. The work highlights the strong multilingual generalization of modern LLMs for GEC and shows that careful prompt design and lightweight adaptation can bridge resource gaps. It proposes cross-lingual transfer, multilingual joint fine-tuning, and tokenizer optimization as future directions for improving grammatical correction across diverse Indic scripts.
Abstract
Grammatical error correction (GEC) is an important task in Natural Language Processing that aims to automatically detect and correct grammatical mistakes in text. While recent advances in transformer-based models and large annotated datasets have greatly improved GEC performance for high-resource languages such as English, this progress has not extended equally to other languages. For most Indic languages, GEC remains a challenging task due to limited resources, linguistic diversity and complex morphology. In this work, we explore prompting-based approaches using state-of-the-art large language models (LLMs), such as GPT-4.1, Gemini-2.5 and LLaMA-4, combined with a few-shot strategy to adapt them to low-resource settings. We observe that even basic prompting strategies, such as zero-shot and few-shot approaches, enable these LLMs to substantially outperform fine-tuned Indic-language models like Sarvam-22B, thereby illustrating the exceptional multilingual generalization capabilities of contemporary LLMs for GEC. Our experiments show that carefully designed prompts and lightweight adaptation significantly enhance correction quality across multiple Indic languages. We achieved leading results in the shared task, ranking 1st in Tamil (GLEU: 91.57) and Hindi (GLEU: 85.69), 2nd in Telugu (GLEU: 85.22), 4th in Bangla (GLEU: 92.86), and 5th in Malayalam (GLEU: 92.97). These findings highlight the effectiveness of prompt-driven NLP techniques and underscore the potential of large-scale LLMs to bridge resource gaps in multilingual GEC.
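To make the zero-shot versus few-shot distinction concrete, the following is a minimal sketch of how such prompts might be assembled. It is not the authors' prompt: the instruction wording, the `Input:`/`Output:` layout, the names `Example` and `build_gec_prompt`, and the Hindi sentences are all assumptions made for illustration, and the sentences are not drawn from the Indic-GEC dataset.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Example:
    """One (erroneous, corrected) sentence pair used as a demonstration."""
    source: str
    target: str


def build_gec_prompt(language: str, sentence: str,
                     examples: Sequence[Example] = ()) -> str:
    """Assemble a GEC prompt; with no examples it is a zero-shot prompt."""
    # Hypothetical instruction text, not the wording used in the paper.
    lines = [
        f"Correct the grammatical errors in the following {language} sentence.",
        "Return only the corrected sentence; if it is already correct, "
        "return it unchanged.",
        "",
    ]
    # Few-shot: each demonstration is shown as an Input/Output pair.
    for ex in examples:
        lines += [f"Input: {ex.source}", f"Output: {ex.target}", ""]
    # The sentence to correct comes last, with the output left open.
    lines += [f"Input: {sentence}", "Output:"]
    return "\n".join(lines)


# Illustrative demonstration pair (made up, auxiliary-agreement error).
demos = [Example(source="मैं स्कूल जाता है", target="मैं स्कूल जाता हूँ")]
# Made-up input sentence containing an agreement error.
query = "हम कल दिल्ली गया"

zero_shot = build_gec_prompt("Hindi", query)
few_shot = build_gec_prompt("Hindi", query, demos)
print(few_shot)
```

The resulting string would be sent to the chosen LLM as the user message; the only difference between the two settings in this sketch is whether demonstration pairs precede the sentence to be corrected.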
