Harnessing the Power of Beta Scoring in Deep Active Learning for Multi-Label Text Classification

Wei Tan, Ngoc Dang Nguyen, Lan Du, Wray Buntine

TL;DR

The paper tackles data scarcity and severe label imbalance in multi-label text classification (MLTC) by introducing BESRA, an acquisition strategy that embeds the Beta family of proper scoring rules in the Expected Loss Reduction (ELR) active-learning framework. BESRA applies Beta scoring rules per label to quantify the expected change in score from labeling a candidate, then selects informative and diverse samples using ensemble-based Bayesian updates. It is evaluated with TextCNN, TextRNN, and BERT backbones on synthetic data and six real multi-label text benchmarks. The approach generalizes the prior BEMPS framework, comes with a convergence guarantee, and improves on a suite of seven baselines across domains, particularly under imbalanced label distributions. These results indicate that tunable, imbalance-aware Beta scoring rules can substantially improve data efficiency in deep active learning for MLTC, with practical implications for high-stakes domains and diverse architectures.

Abstract

Within the scope of natural language processing, the domain of multi-label text classification is uniquely challenging due to its expansive and uneven label distribution. The complexity deepens due to the demand for an extensive set of annotated data for training an advanced deep learning model, especially in specialized fields where the labeling task can be labor-intensive and often requires domain-specific knowledge. Addressing these challenges, our study introduces a novel deep active learning strategy, capitalizing on the Beta family of proper scoring rules within the Expected Loss Reduction framework. It computes the expected increase in scores using the Beta Scoring Rules, which are then transformed into sample vector representations. These vector representations guide the diverse selection of informative samples, directly linking this process to the model's expected proper score. Comprehensive evaluations across both synthetic and real datasets reveal our method's capability to often outperform established acquisition techniques in multi-label text classification, presenting encouraging outcomes across various architectural and dataset scenarios.
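
To make the "Beta family of proper scoring rules" concrete, the following is a minimal sketch of that family in its standard loss form for a single binary label. It is an illustration under stated assumptions, not the paper's implementation: the function names `beta_loss` and `expected_loss` are hypothetical, the parameterization follows the common loss-form definition (the paper's own equation may differ in sign or scaling, for example by working with scores rather than losses), and the integrals are approximated with a plain midpoint rule. With alpha = beta = 1 the loss reduces to half the Brier score, and with alpha = beta = 0 to the logarithmic loss; unequal alpha and beta penalize the two error types asymmetrically.

```python
import math


def beta_loss(q, y, alpha, beta, steps=20000):
    """Beta-family loss for predicting probability q of the positive label
    when the observed binary label is y (0 or 1).

    Assumed loss-form definition (not taken from the paper):
      y = 1: integral over [q, 1] of t^(alpha-1) * (1-t)^beta
      y = 0: integral over [0, q] of t^alpha * (1-t)^(beta-1)
    This sketch assumes alpha >= 0 and beta >= 0.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if y == 1:
        lo, hi = q, 1.0
        integrand = lambda t: t ** (alpha - 1) * (1 - t) ** beta
    else:
        lo, hi = 0.0, q
        integrand = lambda t: t ** alpha * (1 - t) ** (beta - 1)
    # Midpoint rule: never evaluates the integrand at the interval endpoints.
    h = (hi - lo) / steps
    return h * sum(integrand(lo + (i + 0.5) * h) for i in range(steps))


def expected_loss(p, q, alpha, beta):
    """Expected loss of predicting q when the true positive probability is p."""
    return p * beta_loss(q, 1, alpha, beta) + (1 - p) * beta_loss(q, 0, alpha, beta)


if __name__ == "__main__":
    # alpha = beta = 1: half the Brier score, (1 - q)^2 / 2 for y = 1.
    print(beta_loss(0.3, 1, 1, 1))
    # alpha = beta = 0: logarithmic loss, -log(q) for y = 1.
    print(beta_loss(0.3, 1, 0, 0), -math.log(0.3))
    # Properness: for any alpha, beta the expected loss is minimized at q = p.
    grid = [i / 20 for i in range(1, 20)]
    print(min(grid, key=lambda q: expected_loss(0.3, q, 2.0, 0.5)))
```

The last check illustrates why the rules are called proper: whatever the asymmetry, reporting the true probability minimizes the expected loss, so the tuning parameters change how errors are weighted without rewarding miscalibrated predictions.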

Paper Structure

This paper contains 17 sections, 3 equations, 6 figures, 2 tables, and 1 algorithm.

Figures (6)

  • Figure 1: The graph depicts the Expected Score gp (green) and the Scoring Functions from Eq. (\ref{eq-Sbeta}) for the Beta family: sp(., 0) in blue (when $y=0$) and sp(., 1) in orange (when $y=1$). It covers six scenarios: three specific to the Brier Score, the Logarithmic Score, and total error approximations, and three emphasizing asymmetry with varied Beta values.
  • Figure 2: The average Micro F1-score of AL models with acquisition size 100 on BERT, which were run with 5 different random seeds on various synthetic datasets.
  • Figure 3: The average Micro F1-score of AL models with acquisition size 100 on BERT, which were run with 5 different random seeds on various datasets.
  • Figure 4: The average Micro F1-score of AL models with acquisition size 100 on TextCNN and TextRNN, which were run with 5 different random seeds on RCV1.
  • Figure 5: The average Micro F1-score of AL models with acquisition size 100 on BERT, which were run with 5 different random seeds on Bibtex and Yahoo (health).
  • ...and 1 more figure