Classification of Hope in Textual Data using Transformer-Based Models

Chukwuebuka Fortunate Ijezue, Tania-Amanda Fredrick Eneye, Maaz Amjad

TL;DR

This work tackles automatic detection and fine-grained classification of hope expressions in text using transformer-based models. Comparing BERT, GPT-2, and DeBERTa on binary and five-class hope classification, the study finds that BERT achieves the best balance of accuracy and efficiency, while larger models incur higher costs without commensurate gains. Error analysis highlights challenges in contextual disambiguation, boundaries between categories, and sarcasm detection, with GPT-2 showing a notable strength in sarcasm recall. The findings inform model selection and deployment in affective computing applications, and point to future directions such as ensembles, domain-specific classifiers, and cross-linguistic analyses.

Abstract

This paper presents a transformer-based approach for classifying hope expressions in text. We developed and compared three architectures (BERT, GPT-2, and DeBERTa) for both binary classification (Hope vs. Not Hope) and multiclass categorization (five hope-related categories). Our initial BERT implementation achieved 83.65% binary and 74.87% multiclass accuracy. In the extended comparison, BERT demonstrated superior performance (84.49% binary, 72.03% multiclass accuracy) while requiring significantly fewer computational resources (443s vs. 704s training time) than newer architectures. GPT-2 showed the lowest overall accuracy (79.34% binary, 71.29% multiclass), while DeBERTa achieved moderate results (80.70% binary, 71.56% multiclass) but at substantially higher computational cost (947s for multiclass training). Error analysis revealed architecture-specific strengths in detecting nuanced hope expressions, with GPT-2 excelling at sarcasm detection (92.46% recall). This study provides a framework for the computational analysis of hope, with applications in mental health and social media analysis, and demonstrates that architectural suitability may outweigh model size for specialized emotion detection tasks.
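
The abstract reports two kinds of metric: overall accuracy for each task and per-class recall (for example, sarcasm recall). As a minimal sketch of how these quantities are conventionally computed, assuming plain lists of gold and predicted labels, the snippet below uses only the Python standard library. The function names, label strings, and toy data are illustrative placeholders, not the paper's evaluation code or dataset.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the gold label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each gold class: correctly predicted examples / all examples of that class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)  # number of gold examples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}


# Toy data with placeholder label names (assumption: not taken from the paper).
y_true = ["sarcasm", "sarcasm", "hope", "not_hope", "hope", "not_hope"]
y_pred = ["sarcasm", "sarcasm", "not_hope", "not_hope", "hope", "hope"]

overall = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)
print(f"accuracy = {overall:.4f}")
for label, value in sorted(recalls.items()):
    print(f"recall[{label}] = {value:.4f}")
```

A model can have high recall on a single class while its overall accuracy stays comparatively low, which is consistent with the pattern the abstract describes for GPT-2.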

Paper Structure

This paper contains 34 sections, 9 figures, 1 table.

Figures (9)

  • Figure 1: Accuracy comparison across models for binary and multiclass hope classification tasks.
  • Figure 2: Training time comparison across models.
  • Figure 3: Trade-off comparison between model size, training time, and classification accuracy for binary and multiclass classification tasks. Bubble size represents model size.
  • Figure 4: BERT Binary
  • Figure 5: GPT-2 Binary
  • ...and 4 more figures