Not All Pretraining are Created Equal: Threshold Tuning and Class Weighting for Imbalanced Polarization Tasks in Low-Resource Settings

Abass Oguntade

Abstract

This paper describes my submission to the Polarization Shared Task at SemEval-2025, which addresses polarization detection and classification in social media text. I develop Transformer-based systems for English and Swahili across three subtasks: binary polarization detection, multi-label target type classification, and multi-label manifestation identification. The approach leverages multilingual and African language-specialized models (mDeBERTa-v3-base, SwahBERT, AfriBERTa-large), class-weighted loss functions, iterative stratified data splitting, and per-label threshold tuning to handle severe class imbalance. The best configuration, mDeBERTa-v3-base, achieves 0.8032 macro-F1 on validation for binary detection, with competitive performance on multi-label tasks (up to 0.556 macro-F1). Error analysis reveals persistent challenges with implicit polarization, code-switching, and distinguishing heated political discourse from genuine polarization.
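As a rough illustration of two of the techniques named above, the following is a minimal sketch of per-label threshold tuning and of positive-class weights for a multi-label task. It is not the submission's actual code: the function names, the 0.05-step threshold grid, and the negatives-to-positives weighting ratio are all assumptions made for the example.

```python
# Illustrative sketch only; names, grid, and weighting scheme are assumptions,
# not details taken from the paper.

def f1_binary(y_true, y_pred):
    """F1 for one label, given parallel lists of 0/1 values."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_thresholds(y_true, probs, grid=None):
    """Pick, for each label, the grid threshold that maximizes validation F1.

    y_true and probs are lists of rows with one entry per label.
    Ties are resolved in favor of the lowest threshold on the grid.
    """
    if grid is None:
        grid = [i / 20 for i in range(1, 20)]  # assumed grid: 0.05 .. 0.95
    thresholds = []
    for j in range(len(y_true[0])):
        col_true = [row[j] for row in y_true]
        col_prob = [row[j] for row in probs]
        best_t, best_f1 = 0.5, -1.0
        for t in grid:
            preds = [1 if p >= t else 0 for p in col_prob]
            score = f1_binary(col_true, preds)
            if score > best_f1:
                best_t, best_f1 = t, score
        thresholds.append(best_t)
    return thresholds


def positive_class_weights(y_true):
    """Per-label weight = negatives / positives (one common convention)."""
    n = len(y_true)
    weights = []
    for j in range(len(y_true[0])):
        pos = sum(row[j] for row in y_true)
        weights.append((n - pos) / pos if pos else 1.0)
    return weights


# Toy validation set with two labels; the second label is rarer.
val_labels = [[1, 0], [0, 0], [1, 1], [0, 0]]
val_probs = [[0.9, 0.2], [0.4, 0.1], [0.8, 0.3], [0.3, 0.05]]
thresholds = tune_thresholds(val_labels, val_probs)
weights = positive_class_weights(val_labels)
```

In this toy example the rarer label ends up with a lower threshold and a larger weight than the more frequent one. Weights of this form could be passed to a weighted binary cross-entropy loss, and the thresholds should be tuned on held-out validation data only.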

Paper Structure

This paper contains 26 sections, 3 equations, 19 figures, 2 tables, and 1 algorithm.

Figures (19)

  • Figure 1: English dataset polarization class distribution for Subtask 1. The dataset shows moderate imbalance with 64% non-polarized instances.
  • Figure 2: Swahili dataset polarization class distribution for Subtask 1. Nearly balanced with 50% representation in each class.
  • Figure 3: Word cloud for English dataset showing dominance of political terms (trump, ukraine, gaza) and geopolitical discourse markers.
  • Figure 4: Word cloud for Swahili dataset revealing frequent vulgar and offensive terms indicating different polarization expression patterns.
  • Figure 5: Subtask 2 label distribution across five target types showing severe imbalance. Political dominates while gender/sexual remains extremely rare.
  • ...and 14 more figures