AdUE: Improving uncertainty estimation head for LoRA adapters in LLMs

Artem Zabolotnyi, Roman Makarov, Mile Mitrovic, Polina Proskura, Oleg Travkin, Roman Alferov, Alexey Zaytsev

TL;DR

Uncertainty estimates from adapter-based NLP models are prone to miscalibration. AdUE introduces a lightweight post-hoc uncertainty head, trained after LoRA fine-tuning, that replaces the hard max over class probabilities with a differentiable surrogate, $\mathrm{SmoothMax}(\mathbf{p}) = \frac{1}{\lambda} \log \sum_{i=1}^C e^{\lambda p_i}$, and couples it with L2-SP regularization. The head is trained by minimizing a three-term loss $\mathcal{L} = \mathcal{L}_{\mathrm{BCE}} + \alpha \mathcal{L}_{\mathrm{reg}} + \beta \mathcal{L}_{\mathrm{L2SP}}$, where $\mathcal{L}_{\mathrm{reg}}$ keeps the head close to the softmax-based baseline and $\mathcal{L}_{\mathrm{L2SP}}$ anchors its parameters to their initial values. At inference, the uncertainty score is $U^{\mathrm{AdUE}} = 1 - \mathrm{SmoothMax}(\mathbf{p})$. Evaluation across five NLP datasets and four model families shows consistent ROC-AUC gains over the Mahalanobis and Softmax Response baselines. This lightweight approach, which leaves the base model unchanged, improves uncertainty calibration without sacrificing task performance, enabling more reliable deployment of parameter-efficient fine-tuned LLMs in risk-sensitive settings.
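To make the SmoothMax definition concrete, here is a minimal sketch in plain Python. The function names `smooth_max` and `adue_uncertainty`, the example probability vector, and the values of $\lambda$ are illustrative assumptions, not taken from the paper. The max-shift inside the log-sum-exp is a standard numerical-stability trick and does not change the value of the formula.

```python
import math

def smooth_max(probs, lam=10.0):
    """Compute (1/lam) * log(sum_i exp(lam * p_i)) in a numerically stable way.

    `lam` plays the role of lambda in the formula; its default is an
    illustrative assumption, not a value from the paper.
    """
    m = max(probs)
    # Factor out the largest term so the exponentials cannot overflow.
    total = sum(math.exp(lam * (p - m)) for p in probs)
    return m + math.log(total) / lam

def adue_uncertainty(probs, lam=10.0):
    """Uncertainty score U = 1 - SmoothMax(p)."""
    return 1.0 - smooth_max(probs, lam)

# Hypothetical class probabilities for a 3-class problem.
probs = [0.7, 0.2, 0.1]
for lam in (1.0, 10.0, 100.0):
    # Print SmoothMax next to the hard max for comparison.
    print(lam, smooth_max(probs, lam), max(probs))
```

SmoothMax always lies between $\max_i p_i$ and $\max_i p_i + \log(C)/\lambda$, so larger $\lambda$ gives a tighter approximation of the hard max, while small $\lambda$ can push the surrogate above 1 and the resulting score below 0.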

Abstract

Uncertainty estimation remains a critical challenge in adapting pre-trained language models to classification tasks, particularly under parameter-efficient fine-tuning approaches such as adapters. We introduce AdUE, an efficient post-hoc uncertainty estimation (UE) method, to enhance softmax-based estimates. Our approach (1) uses a differentiable approximation of the maximum function and (2) applies additional regularization through L2-SP, anchoring the fine-tuned head weights and regularizing the model. Evaluations on five NLP classification datasets across four language models (RoBERTa, ELECTRA, LLaMA-2, Qwen) demonstrate that our method consistently outperforms established baselines such as Mahalanobis distance and softmax response. Our approach is lightweight (no base-model changes) and produces better-calibrated confidence.

Paper Structure

This paper contains 29 sections, 10 equations, 1 figure, 6 tables.

Figures (1)

  • Figure 1: $U^{\text{AdUE}}$ head training scheme. We initialize the new uncertainty head with the original classifier’s weights $\theta_{\mathrm{init}}$ and fine-tune it with a three-term loss (binary cross-entropy, softmax regularization, L2-SP). The hard max is replaced by a differentiable SmoothMax during training.
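
The three-term loss from the training scheme can be sketched for a single example as follows. Only the overall structure (binary cross-entropy plus $\alpha$ times a softmax-regularization term plus $\beta$ times an L2-SP term) comes from the text. The specific forms here are assumptions made for illustration: the BCE target is taken to be whether the classifier's prediction was wrong, the regularizer is a squared difference from the softmax-response uncertainty `u_sr`, and L2-SP is the squared distance to $\theta_{init}$. The function names and the default `alpha` and `beta` are hypothetical.

```python
import math

def l2sp_penalty(theta, theta_init):
    """Squared L2 distance between current and initial head weights."""
    return sum((w - w0) ** 2 for w, w0 in zip(theta, theta_init))

def adue_loss(u_head, u_sr, is_error, theta, theta_init,
              alpha=1.0, beta=0.01, eps=1e-7):
    """Single-example sketch of L = L_BCE + alpha * L_reg + beta * L_L2SP.

    u_head:   uncertainty from the new head, e.g. 1 - SmoothMax(p)
    u_sr:     softmax-response uncertainty of the original classifier
    is_error: True if the classifier's prediction was wrong (assumed target)
    """
    # Clip before taking logs so the BCE term stays finite.
    u = min(max(u_head, eps), 1.0 - eps)
    target = 1.0 if is_error else 0.0
    bce = -(target * math.log(u) + (1.0 - target) * math.log(1.0 - u))
    # Assumed regularizer: stay close to the softmax-based baseline.
    reg = (u_head - u_sr) ** 2
    # Anchor the head parameters to their starting point.
    l2sp = l2sp_penalty(theta, theta_init)
    return bce + alpha * reg + beta * l2sp

# Hypothetical values: a correct prediction with a small weight drift.
print(adue_loss(0.3, 0.25, False, [1.1, 2.0], [1.0, 2.0]))
```

In a real training loop these terms would be computed on batches with an autodiff framework; the scalar version only shows how the three terms combine.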