LLMsAgainstHate @ NLU of Devanagari Script Languages 2025: Hate Speech Detection and Target Identification in Devanagari Languages via Parameter Efficient Fine-Tuning of LLMs
Rushendra Sidibomma, Pransh Patwa, Parth Patwa, Aman Chadha, Vinija Jain, Amitava Das
TL;DR
This work tackles hate speech detection and target identification in Devanagari-script languages (Hindi and Nepali), a low-resource setting. It proposes a parameter-efficient fine-tuning framework that uses LoRA to adapt multiple LLMs to both tasks on the CHiPSAL dataset, with 4-bit quantization to reduce resource use. Nemo consistently yields the strongest performance (F1 ≈ 90.05% for hate speech detection and ≈ 71.47% for target identification), though minority classes remain challenging due to data imbalance. The study demonstrates the viability of PEFT for Devanagari content and outlines avenues for improvement, such as data augmentation and ensemble strategies to enhance robustness.
Abstract
The detection of hate speech has become increasingly important in combating online hostility and its real-world consequences. Despite recent advancements, there is limited research addressing hate speech detection in Devanagari-scripted languages, where resources and tools are scarce. While large language models (LLMs) have shown promise in language-related tasks, traditional fine-tuning approaches are often infeasible given the size of the models. In this paper, we propose a Parameter-Efficient Fine-Tuning (PEFT) based solution for hate speech detection and target identification. We evaluate multiple LLMs on the Devanagari dataset provided by Thapa et al. (2025), which contains annotated instances in two languages: Hindi and Nepali. The results demonstrate the efficacy of our approach in handling Devanagari-scripted content.
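
To illustrate why parameter-efficient fine-tuning makes large models tractable, the following is a minimal sketch of the general LoRA idea: the pretrained weight matrix W stays frozen, and only two small matrices A and B of rank r are trained. The function names, matrix sizes, rank, and scaling value below are illustrative assumptions, not settings reported in the paper, and the sketch uses plain Python lists rather than any particular training library.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning updates all d_out * d_in entries of W.
    LoRA trains A (rank x d_in) and B (d_out x rank) instead.
    """
    full = d_out * d_in
    lora = rank * (d_in + d_out)
    return full, lora


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, alpha, rank):
    """Compute y = W x + (alpha / rank) * B (A x).

    W is the frozen pretrained weight; only A and B would receive gradients.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


# Hypothetical sizes: a 4096 x 4096 projection adapted with rank 8.
full, lora = lora_param_counts(4096, 4096, 8)
fraction = lora / full  # share of the matrix's parameters that LoRA trains
```

With these assumed sizes, the adapter trains 65,536 parameters in place of 16,777,216 for that one matrix, which is under half a percent. Initializing B to zeros, as is standard for LoRA, makes the adapted layer reproduce the frozen model's output exactly at the start of training.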
