UPLME: Uncertainty-Aware Probabilistic Language Modelling for Robust Empathy Regression

Md Rakibul Hasan, Md Zakir Hossain, Aneesh Krishna, Shafin Rahman, Tom Gedeon

TL;DR

This work tackles noisy self-reported empathy labels in regression tasks by introducing UPLME, an uncertainty-aware probabilistic language modelling framework that predicts both empathy scores and input-dependent (heteroscedastic) uncertainty. The method uses a cross-encoder backbone with two parallel regression heads and applies variational model ensembling (Monte Carlo dropout) to capture epistemic uncertainty. It is trained with three losses: a beta-NLL objective, a variance penalty, and an alignment loss that ties the representations of paired texts to their empathic similarity. On NewsEmp21, NewsEmp24, and EmpStories, UPLME reports state-of-the-art regression performance and better uncertainty calibration than recent baselines, including UCVME, without requiring external data cleaning or dual-model consistency constraints. The results indicate that explicit uncertainty modelling can downweight noisy labels, improve calibration, and reveal a latent structure related to empathy, which is relevant to robust NLP for social-psychology tasks.
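The beta-NLL objective named in the summary is not spelled out on this page. The following is a minimal sketch, assuming the commonly used formulation in which each sample's Gaussian negative log-likelihood is weighted by its predicted variance raised to the power beta. The function name `beta_nll` and the default `beta=0.5` are illustrative assumptions, and UPLME's exact objective may differ.

```python
import math


def beta_nll(mu, var, y, beta=0.5):
    """Mean beta-weighted Gaussian NLL over a batch (illustrative sketch).

    mu, var, y: equal-length sequences of predicted means, predicted
    variances (must be > 0), and target scores.
    beta=0 gives the plain Gaussian NLL; beta=1 weights each sample by its
    variance, which makes the gradient for mu behave like a squared error.
    """
    total = 0.0
    for m, v, t in zip(mu, var, y):
        # Gaussian negative log-likelihood without the constant term.
        nll = 0.5 * (math.log(v) + (t - m) ** 2 / v)
        # In an autodiff framework this weight would be detached
        # (stop-gradient), so it rescales the loss without being optimised.
        weight = v ** beta
        total += weight * nll
    return total / len(mu)


# Hypothetical values: one prediction with mean 0, variance 4, target 2.
plain = beta_nll([0.0], [4.0], [2.0], beta=0.0)
weighted = beta_nll([0.0], [4.0], [2.0], beta=0.5)
```

Under the plain NLL, a high predicted variance shrinks the error term, so samples that are hard to fit can end up under-trained. The variance weight counteracts that effect, which is one plausible reason to pair it with a separate penalty against degenerate variance estimates.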

Abstract

Noisy self-reported empathy scores challenge supervised learning for empathy regression. While many algorithms have been proposed for learning with noisy labels in textual classification problems, the regression counterpart is relatively under-explored. We propose UPLME, an uncertainty-aware probabilistic language modelling framework to capture label noise in empathy regression tasks. One of the novelties in UPLME is a probabilistic language model that predicts both empathy scores and heteroscedastic uncertainty, and is trained using Bayesian concepts with variational model ensembling. We further introduce two novel loss components: one penalises degenerate Uncertainty Quantification (UQ), and another enforces similarity between the input pairs on which empathy is being predicted. UPLME improves on the best performance reported in the literature on two public benchmarks with label noise (Pearson correlation coefficient: $0.558\rightarrow0.580$ and $0.629\rightarrow0.634$). Through synthetic label noise injection, we demonstrate that UPLME is effective in distinguishing between noisy and clean samples based on the predicted uncertainty. UPLME further outperforms a recent variational model ensembling-based UQ method designed for regression problems (calibration error: $0.571\rightarrow0.376$). Code is publicly available at https://github.com/hasan-rakibul/UPLME.
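To illustrate how variational model ensembling can turn several stochastic forward passes into one prediction with an uncertainty estimate, here is a minimal sketch. It assumes the standard decomposition in which the mean of the predicted variances approximates aleatoric uncertainty and the spread of the predicted means approximates epistemic uncertainty. The function name `aggregate_passes` and the example numbers are hypothetical, and the aggregation used in UPLME may differ.

```python
from statistics import fmean, pvariance


def aggregate_passes(means, variances):
    """Combine T stochastic forward passes for a single input.

    means, variances: per-pass outputs of the two regression heads,
    i.e. the predicted empathy score and its predicted variance.
    Returns (prediction, total_variance, aleatoric, epistemic).
    """
    prediction = fmean(means)       # ensemble-averaged empathy score
    aleatoric = fmean(variances)    # average predicted (data) noise
    epistemic = pvariance(means)    # disagreement between passes
    return prediction, aleatoric + epistemic, aleatoric, epistemic


# Hypothetical outputs from three passes with dropout kept active.
pred, total, alea, epis = aggregate_passes(
    means=[4.0, 5.0, 6.0],
    variances=[0.5, 0.5, 0.5],
)
```

In a real model, each pass would come from running the same network with dropout enabled at inference time. The lists here stand in for those outputs.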

Paper Structure

This paper contains 26 sections, 8 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: (a) Demonstration of empathy regression tasks and (b) uncertainty learned with our proposed penalty loss, which captures label noise by assigning higher uncertainty estimates to noisy samples and lower ones to clean samples.
  • Figure 2: Overview of our proposed UPLME framework. Input pairs are first concatenated and fed to a probabilistic Pre-trained Language Model (PLM) that predicts both empathy scores and heteroscedastic uncertainty. Uncertainty quantification is stabilised through variational model ensembling that includes multiple forward passes through the same model.
  • Figure 3: Our proposed penalty loss component ensures that predicted uncertainty captures label noise. Left and Middle: comparison of predictive uncertainty between noisy and clean samples, showing that UPLME estimates higher uncertainty for noisy samples. Right: relationship between absolute prediction error and estimated uncertainty on both noisy and clean subsets, showing a strong, statistically significant positive correlation (Spearman's $\rho = 0.72$).
  • Figure 4: 3D t‑SNE (left) and UMAP (right) projections of UPLME's learned representations on the NewsEmp training set. Each point corresponds to an input essay and is coloured by its ground-truth empathy score (1–7).
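The Spearman correlation quoted for Figure 3 measures whether larger prediction errors tend to come with larger estimated uncertainty. A minimal sketch of that check follows, assuming Spearman's coefficient is computed as the Pearson correlation of average ranks. The helper names and sample values are hypothetical and are not taken from the paper.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical absolute errors and predicted uncertainties for four samples.
abs_errors = [0.1, 0.5, 0.9, 2.0]
uncertainties = [0.2, 0.3, 0.8, 1.5]
rho = spearman(abs_errors, uncertainties)
```

Because only ranks matter, the coefficient is insensitive to the scale of the uncertainty estimates and reflects only whether they order the samples consistently with their errors.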