Exploring Ordinality in Text Classification: A Comparative Study of Explicit and Implicit Techniques
Siva Rajesh Kasa, Aniket Goel, Karan Gupta, Sumegh Roychowdhury, Anish Bhanushali, Nikhil Pattisapu, Prasanna Srinivasa Murthy
TL;DR
This work tackles ordinal classification (OC) in NLP by comparing explicit ordinal-loss methods with implicit PLM-based approaches. It develops a unified framework around four properties, Proper Scoring Rule (PSR), Unimodality (UM), Convexity (Cx), and Ordinality (Ord), to analyze explicit losses such as CE, OLL, SOFT, EMD, CORAL, WKL, and VS-SL. It also introduces a hybrid Multi-task Log Loss (MLL) that blends CE and OLL via a tunable parameter $\lambda$ to balance nominal and ordinal metrics. On the implicit side, it investigates two approaches: an encoder-based entailment-style approach (ENT) using verbaliser templates and an image-like data-augmentation scheme, and a decoder-based generative approach with GPT-2 small and Llama-Adapter prompts, highlighting the role of informative versus uninformative verbalisers. Empirically, MLL offers balanced performance in high-data settings, ENT excels in few-shot scenarios with lower variance, and large decoder models (especially Llama-7B-Adapter) outperform the other methods in full-data conditions, albeit at higher compute cost and with a risk of hallucination. The paper provides practical recommendations for selecting OC strategies by data regime and emphasizes the importance of label semantics and unimodality as diagnostic signals for ordinal behavior.
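To make the idea of blending CE and OLL concrete, the following is a minimal sketch rather than the paper's implementation. It assumes that OLL weights $-\log(1 - p_k)$ by the label distance $|k - y|^{\alpha}$, and that MLL takes the additive form CE $+ \lambda \cdot$ OLL; the exact form used in the paper may differ. The function names, the `alpha` exponent, and the `eps` clipping constant are illustrative choices.

```python
import math

def cross_entropy(probs, y):
    """Nominal log loss: depends only on the probability of the true class y."""
    return -math.log(probs[y])

def ordinal_log_loss(probs, y, alpha=1.0, eps=1e-12):
    """Assumed OLL form: penalise probability mass on class k in
    proportion to its distance |k - y|**alpha from the true class."""
    loss = 0.0
    for k, p in enumerate(probs):
        if k == y:
            continue  # distance is zero, so the true class contributes nothing
        # Clip so that log(1 - p) stays finite when p is (numerically) 1.
        loss -= abs(k - y) ** alpha * math.log(max(1.0 - p, eps))
    return loss

def multitask_log_loss(probs, y, lam=0.5, alpha=1.0):
    """Assumed blend: CE + lam * OLL. lam = 0 recovers plain CE."""
    return cross_entropy(probs, y) + lam * ordinal_log_loss(probs, y, alpha)

# Two predictions over 5 ordered classes with the same mass on the true class (y = 1).
# `near` puts the remaining mass next to the true class, `far` puts it three steps away.
near = [0.1, 0.6, 0.3, 0.0, 0.0]
far = [0.1, 0.6, 0.0, 0.0, 0.3]

ce_near, ce_far = cross_entropy(near, 1), cross_entropy(far, 1)
mll_near, mll_far = multitask_log_loss(near, 1), multitask_log_loss(far, 1)
```

In this sketch CE cannot distinguish `near` from `far`, because both assign 0.6 to the true class, whereas the blended loss is larger for `far`. Under the stated assumptions, increasing `lam` shifts the emphasis from nominal accuracy toward ordinal error.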
Abstract
Ordinal Classification (OC) is a widely encountered challenge in Natural Language Processing (NLP), with applications such as sentiment analysis and rating prediction. Previous approaches to OC have primarily focused on modifying existing loss functions or creating novel ones that \textbf{explicitly} account for the ordinal nature of labels. However, with the advent of Pretrained Language Models (PLMs), it has become possible to tackle ordinality through the \textbf{implicit} semantics of the labels as well. This paper provides a comprehensive theoretical and empirical examination of both approaches. We also offer recommendations on which approach to adopt in a given setting.
