LLMsAgainstHate @ NLU of Devanagari Script Languages 2025: Hate Speech Detection and Target Identification in Devanagari Languages via Parameter Efficient Fine-Tuning of LLMs

Rushendra Sidibomma, Pransh Patwa, Parth Patwa, Aman Chadha, Vinija Jain, Amitava Das

TL;DR

This work addresses hate speech detection and hate speech target identification in Devanagari-script languages (Hindi and Nepali), a low-resource setting. It proposes a parameter-efficient fine-tuning (PEFT) framework that uses LoRA to adapt multiple LLMs to both tasks on the CHiPSAL dataset, with 4-bit quantization to reduce resource use. Nemo consistently yields the strongest performance (F1 ≈ 90.05% for hate speech detection and ≈ 71.47% for target identification), though minority classes remain challenging because of data imbalance. The study demonstrates the viability of PEFT for Devanagari content and outlines avenues for improvement, such as data augmentation and ensemble strategies.

Abstract

The detection of hate speech has become increasingly important in combating online hostility and its real-world consequences. Despite recent advancements, there is limited research addressing hate speech detection in Devanagari-scripted languages, where resources and tools are scarce. While large language models (LLMs) have shown promise in language-related tasks, traditional fine-tuning approaches are often infeasible given the size of the models. In this paper, we propose a Parameter-Efficient Fine-Tuning (PEFT) based solution for hate speech detection and target identification. We evaluate multiple LLMs on the Devanagari dataset provided by Thapa et al. (2025), which contains annotated instances in two languages, Hindi and Nepali. The results demonstrate the efficacy of our approach in handling Devanagari-scripted content.
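To make the size argument concrete, the sketch below compares the number of trainable parameters in a single weight matrix under full fine-tuning and under a LoRA-style low-rank update, where the frozen weight W is augmented by a product of two small matrices B and A. The layer dimensions and rank are hypothetical values chosen for illustration; they are not taken from the paper.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in the low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


# Hypothetical sizes for illustration only; not the paper's configuration.
d_in, d_out, rank = 4096, 4096, 16

full = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
fraction = lora / full

print(f"full fine-tuning: {full} parameters")
print(f"low-rank update:  {lora} parameters")
print(f"trainable share:  {fraction:.4%}")
```

Under these assumed sizes, the low-rank factors hold well under 1% of the parameters of the full matrix, which is the kind of saving that makes adapting a large model feasible on limited hardware.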

Paper Structure

This paper contains 12 sections, 1 equation, 2 figures, 6 tables.

Figures (2)

  • Figure 1: Confusion matrix of Nemo on the test set for hate speech detection.
  • Figure 2: Confusion matrix of Nemo on the test set for hate speech target identification.
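As a reading aid for confusion matrices like these, the sketch below derives per-class F1 and an unweighted (macro) average from a matrix of counts. The counts are invented toy values, not the paper's results, and the row and column convention (rows are true classes, columns are predicted classes) and the use of macro averaging are assumptions; the source does not state how its F1 scores are averaged.

```python
def per_class_f1(cm):
    """Per-class F1 from a square confusion matrix.

    Assumed convention: cm[i][j] counts examples of true class i predicted as class j.
    """
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true class k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, true class differs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(cm):
    """Unweighted mean of per-class F1, so each class counts equally."""
    scores = per_class_f1(cm)
    return sum(scores) / len(scores)


# Toy, imbalanced two-class example (invented numbers, not from the paper).
toy_cm = [
    [90, 10],  # majority class: 100 examples
    [5, 5],    # minority class: 10 examples
]

scores = per_class_f1(toy_cm)
print("per-class F1:", [round(s, 3) for s in scores])
print("macro F1:", round(macro_f1(toy_cm), 3))
```

In this toy case the minority class scores far lower than the majority class, and because a macro average weights classes equally, that weak class pulls the overall score down sharply. This mirrors the imbalance effect the TL;DR describes.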