HateTinyLLM : Hate Speech Detection Using Tiny Large Language Models

Tanmay Sen; Ansuman Das; Mrinmay Sen

TL;DR

This paper addresses efficient hate speech detection with fine-tuned decoder-only tiny LLMs. It introduces HateTinyLLM and evaluates LoRA- and adapter-based fine-tuning across several tiny models. The results show that small, resource-friendly LLMs can outperform a pretrained baseline (Mixtral-7B) on the English datasets DynaHate and HateEval. LoRA gives the largest gains, with accuracy of $0.80$ or higher on both datasets (for OPT-1.3B: DynaHate $0.82$, HateEval $0.80$). The study highlights the practicality of LoRA for parameter-efficient adaptation of tiny LLMs and suggests potential for real-world deployment in content moderation. Future work includes exploring further fine-tuning techniques and multilingual generalization.

Abstract

Hate speech encompasses verbal, written, or behavioral communication that directs derogatory or discriminatory language at individuals or groups based on sensitive characteristics. Automated hate speech detection plays a crucial role in curbing its propagation, especially across social media platforms. Various methods, including recent advances in deep learning, have been devised to address this challenge. In this study, we introduce HateTinyLLM, a novel framework based on fine-tuned decoder-only tiny large language models (tinyLLMs) for efficient hate speech detection. Our experimental findings demonstrate that the fine-tuned HateTinyLLM outperforms the pretrained mixtral-7b model by a significant margin. We explored various tiny LLMs, including PY007/TinyLlama-1.1B-step-50K-105b, microsoft/phi-2, and facebook/opt-1.3b, and fine-tuned them using LoRA and adapter methods. Our observations indicate that all LoRA-based fine-tuned models achieved over 80% accuracy.
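
To make the LoRA idea mentioned above concrete, the following is a minimal, dependency-free sketch of a single linear layer with a low-rank update: the pretrained weight $W$ stays frozen and only two small factors $A$ and $B$ are trained, giving an effective weight $W + (\alpha / r) B A$. The class name `LoRALinear`, the layer sizes, and the values of `r` and `alpha` are illustrative assumptions; this page does not state the paper's actual hyperparameters or implementation, and in practice a library such as PEFT would typically be used instead.

```python
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable update (alpha / r) * B @ A.

    Hypothetical illustration only; sizes and hyperparameters are assumptions.
    """

    def __init__(self, d_in, d_out, r=4, alpha=8, seed=0):
        rng = random.Random(seed)
        # Stand-in for a pretrained weight; it would stay frozen during fine-tuning.
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable low-rank factors: A (r x d_in) is small random noise,
        # B (d_out x r) starts at zero so the layer initially equals the frozen one.
        self.A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def forward(self, x):
        base = matvec(self.W, x)                   # frozen path: W x
        delta = matvec(self.B, matvec(self.A, x))  # low-rank path: B (A x)
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_params(self):
        return len(self.W) * len(self.W[0])


layer = LoRALinear(d_in=64, d_out=64, r=4)
x = [1.0] * 64
y = layer.forward(x)
print(layer.trainable_params(), "trainable vs", layer.frozen_params(), "frozen")
```

In this toy configuration the trainable factors hold $2 \cdot 4 \cdot 64$ values against $64 \cdot 64$ frozen ones, which illustrates why this style of fine-tuning is attractive for small models on limited hardware.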

Paper Structure (8 sections, 2 figures, 8 tables)

Introduction
Datasets
Methodology
Experiments, Results and Analysis
Baselines Setup
Experimental Setup and Hyperparameters
Results and Discussion
Conclusion and Future Work

Figures (2)

Figure 1: Adapter architecture
Figure 2: LoRA architecture
