Retriv at BLP-2025 Task 1: A Transformer Ensemble and Multi-Task Learning Approach for Bangla Hate Speech Identification

Sourav Saha; K M Nafi Asib; Mohammed Moshiul Hoque

TL;DR

This work tackles fine-grained Bangla hate speech detection in a low-resource setting by combining transformer ensembles for hate-type and target prediction with a weighted multitask framework for joint type, severity, and target classification. Three Bangla-capable transformers (BanglaBERT, MuRIL, IndicBERTv2) form the backbone. Soft voting drives subtasks 1A and 1B, while a weighted multitask ensemble drives subtask 1C, with each model's contribution weighted by its development-set performance. Across subtasks, the ensembles outperform baselines and single models, and error analyses reveal the impact of label imbalance and the difficulty of short, figurative, or sarcastic text. The results show that ensemble and multitask strategies are feasible and useful for fine-grained hate speech detection in Bangla, offering a practical path for improving low-resource NLP systems and informing future cross-lingual and data-augmentation efforts.
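As a rough illustration of weighting ensemble members by development-set performance, the sketch below combines per-task class probabilities from several models. It is not the authors' implementation: the function names, the choice to normalise dev scores into weights, and the choice to apply the weights to probabilities (rather than to hard votes) are all assumptions made for this example.

```python
import numpy as np

def dev_weights(dev_scores):
    """Normalise development-set scores so they sum to 1 (assumed weighting scheme)."""
    w = np.asarray(dev_scores, dtype=float)
    return w / w.sum()

def weighted_vote(probs_per_model, dev_scores):
    """Combine class probabilities from several models for one task.

    probs_per_model: list of arrays, each of shape (n_samples, n_classes)
    dev_scores: one development-set score per model
    Returns the predicted class index for each sample.
    """
    weights = dev_weights(dev_scores)
    stacked = np.stack(probs_per_model)            # (n_models, n_samples, n_classes)
    combined = np.tensordot(weights, stacked, axes=1)  # (n_samples, n_classes)
    return combined.argmax(axis=1)

def multitask_weighted_vote(task_probs, task_dev_scores):
    """Apply weighted voting independently to each task head (hypothetical layout).

    task_probs: dict mapping task name -> list of per-model probability arrays
    task_dev_scores: dict mapping task name -> list of per-model dev scores
    """
    return {
        task: weighted_vote(task_probs[task], task_dev_scores[task])
        for task in task_probs
    }
```

With three models and the task names `type`, `severity`, and `target` as dictionary keys, `multitask_weighted_vote` would return one array of predicted labels per task. A model with a stronger dev score pulls the combined distribution toward its own prediction, which is the intended effect of the weighting.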

Abstract

This paper addresses the problem of Bangla hate speech identification, a socially impactful yet linguistically challenging task. As part of the "Bangla Multi-task Hate Speech Identification" shared task at the BLP Workshop, IJCNLP-AACL 2025, our team "Retriv" participated in all three subtasks: (1A) hate type classification, (1B) target group identification, and (1C) joint detection of type, severity, and target. For subtasks 1A and 1B, we employed a soft-voting ensemble of transformer models (BanglaBERT, MuRIL, IndicBERTv2). For subtask 1C, we trained three multitask variants and aggregated their predictions through a weighted voting ensemble. Our systems achieved micro-F1 scores of 72.75% (1A) and 72.69% (1B), and a weighted micro-F1 score of 72.62% (1C). On the shared task leaderboard, these scores placed 9th, 10th, and 7th, respectively. These results highlight the promise of transformer ensembles and weighted multitask frameworks for advancing Bangla hate speech detection in low-resource contexts. We have made our experimental scripts publicly available to the community.
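Soft voting, as used for subtasks 1A and 1B, can be sketched as averaging each model's class probabilities and taking the most probable class. The snippet below is a minimal illustration with plain NumPy arrays standing in for the transformer outputs; the function names and the toy logits are assumptions for this example, not the authors' code.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def soft_vote(logits_per_model):
    """Average per-model class probabilities and return the argmax class.

    logits_per_model: list of arrays, each of shape (n_samples, n_classes)
    """
    probs = np.stack([softmax(l) for l in logits_per_model])  # (n_models, n_samples, n_classes)
    return probs.mean(axis=0).argmax(axis=1)

# Toy logits for three hypothetical models, two samples, three classes.
model_a = np.array([[4.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
model_b = np.array([[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]])
model_c = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
predictions = soft_vote([model_a, model_b, model_c])
```

For the first toy sample, two models weakly prefer class 1 while one model strongly prefers class 0. Averaging probabilities lets the confident model win, which a simple majority vote over hard labels would not do.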
