Gradient Masters at BLP-2025 Task 1: Advancing Low-Resource NLP for Bengali using Ensemble-Based Adversarial Training for Hate Speech Detection
Syed Mohaiminul Hoque, Naimur Rahman, Md Sakhawat Hossain
TL;DR
The paper tackles fine-grained Bangla hate speech detection in a low-resource setting with an ensemble-based fine-tuning pipeline that combines orthography normalization (N), K-fold cross-validation (KF), and FGSM adversarial training for two subtasks: hate-type classification (1A) and target-group classification (1B). MuRIL-large with KF scores 74.96% on the development set and 73.44% on the test set for Subtask 1B, and BanglaBERT with KF+FGSM+N scores 74.88% (dev) and 72.33% (test) for Subtask 1A, suggesting good generalization under class imbalance. External data augmentation improved development scores but not test scores, which is attributed to domain mismatch and underscores the importance of data compatibility in low-resource regimes. Error analyses reveal persistent difficulty with minority classes and notable confusion between the None class and targeted hate, which points to future work on adversarial robustness for Bangla NLP. Overall, the approach is shown to be practical for automatic moderation of Bangla YouTube comments and offers a scalable framework for low-resource hate speech detection using ensemble and adversarial techniques.
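To make the FGSM step concrete, here is a minimal sketch of the perturbation on a toy logistic classifier, written in plain Python. It is an illustration only, not the authors' implementation: the weights, input vector, and `epsilon` value are hypothetical, and in transformer fine-tuning the perturbation is usually applied to the input embeddings rather than to a raw feature vector. This section does not say which layer the authors perturb.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_input_grad(x, w, b, y):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad_x = [(p - y) * wi for wi in w]  # dL/dx for logistic regression
    return loss, grad_x

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_perturb(x, grad_x, epsilon):
    """FGSM: move each input coordinate by epsilon in the direction that increases the loss."""
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad_x)]

# Hypothetical toy values standing in for an embedding vector and a trained classifier.
x = [0.5, -1.0, 2.0]
w = [1.0, -2.0, 0.5]
b = 0.1
y = 1

clean_loss, grad = loss_and_input_grad(x, w, b, y)
x_adv = fgsm_perturb(x, grad, epsilon=0.1)
adv_loss, _ = loss_and_input_grad(x_adv, w, b, y)
```

In adversarial training, a step of this kind is typically followed by minimizing a combination of `clean_loss` and `adv_loss`, so the model learns to classify both the original and the perturbed input correctly.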
Abstract
This paper introduces the approach of "Gradient Masters" for BLP-2025 Task 1, the "Bangla Multitask Hate Speech Identification Shared Task". We present an ensemble-based fine-tuning strategy for subtasks 1A (hate-type classification) and 1B (target-group classification) on YouTube comments. We propose a hybrid approach built on a Bangla language model; it outperformed the baseline models and placed sixth in subtask 1A with a micro F1 score of 73.23% and third in subtask 1B with 73.28%. We conducted extensive experiments to evaluate the robustness of the model throughout the development and evaluation phases, including comparisons with other language model variants, to measure generalization in low-resource Bangla hate speech scenarios and dataset coverage. In addition, we provide a detailed analysis of our findings, exploring misclassification patterns in hate speech detection.
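As an illustration of how per-fold models can be combined and then scored with micro F1, the sketch below averages class probabilities across folds (soft voting) and computes micro F1 from pooled counts. Soft voting is an assumption here, since this section does not state how the ensemble members are combined. The probabilities and labels are made-up toy values.

```python
def soft_vote(fold_probs):
    """Average per-class probabilities across fold models, then take the argmax per example."""
    n_folds = len(fold_probs)
    n_examples = len(fold_probs[0])
    preds = []
    for i in range(n_examples):
        n_classes = len(fold_probs[0][i])
        avg = [sum(fp[i][c] for fp in fold_probs) / n_folds for c in range(n_classes)]
        preds.append(max(range(n_classes), key=avg.__getitem__))
    return preds

def micro_f1(y_true, y_pred):
    """Micro F1: pool true positives, false positives, and false negatives over all classes."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for c in labels:
        for t, p in zip(y_true, y_pred):
            tp += (t == c and p == c)
            fp += (t != c and p == c)
            fn += (t == c and p != c)
    return 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0

# Hypothetical outputs of three fold models on four comments with three classes.
fold_probs = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7], [0.4, 0.5, 0.1]],
    [[0.5, 0.4, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6], [0.6, 0.3, 0.1]],
    [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.3, 0.3, 0.4], [0.5, 0.4, 0.1]],
]
y_true = [0, 1, 2, 1]

y_pred = soft_vote(fold_probs)
score = micro_f1(y_true, y_pred)
```

For single-label multi-class classification, as in this toy setup, micro F1 equals plain accuracy, because every misclassified example counts once as a false positive and once as a false negative.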
