Uncertainty quantification in fine-tuned LLMs using LoRA ensembles

Oleksandr Balabanov, Hampus Linander

TL;DR

This work treats fine-tuning of large language models as a Bayesian update around a pre-trained prior and develops principled posterior approximations using ensembles of low-rank adapters (LoRA). By analyzing predictive entropy $H(t^*|s^*,\mathcal{D})$ and mutual information $\text{MI}(\theta,t^*|s^*,\mathcal{D})$ across LoRA ensemble members, the authors quantify epistemic versus aleatoric uncertainty and track how knowledge from the pre-trained model is retained or replaced during domain adaptation. They implement LoRA deep ensembles on Mistral-7B fine-tuned with CommonsenseQA and evaluate on CQA, MMLU STEM, and MMLU SS, showing that small ensemble sizes ($M=5$) yield nearly as good posterior quality as larger ones while reducing overfitting and enabling detection of uncertain predictions. The findings reveal distinct uncertainty dynamics across in-domain and out-of-domain data, including unexpected retention of acquired knowledge in the overfitting regime, and demonstrate a practical framework for uncertainty-aware deployment and active learning in fine-tuned LLMs.

Abstract

Fine-tuning large language models can improve task-specific performance, but a general understanding of what the fine-tuned model has learned, what it has forgotten, and how far its predictions can be trusted is still missing. We derive principled uncertainty quantification for fine-tuned LLMs with posterior approximations using computationally efficient low-rank adaptation ensembles. We analyze three common multiple-choice datasets using low-rank adaptation ensembles based on Mistral-7B, and draw quantitative and qualitative conclusions on their perceived complexity and on the balance between retained prior knowledge and domain-specific adaptation during and after fine-tuning. We identify unexpected retention of acquired knowledge during fine-tuning in the overfitting regime.
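The predictive entropy and mutual information named in the TL;DR can be estimated from the answer probabilities of the ensemble members. The sketch below is a minimal illustration, not the authors' implementation: it assumes the standard decomposition in which total uncertainty is the entropy of the ensemble-mean distribution, the aleatoric part is the average entropy of the individual members, and the mutual information (epistemic part) is their difference. The function name `ensemble_uncertainty`, the array layout `(M, N, C)`, and the smoothing constant `eps` are hypothetical choices made for this example.

```python
import numpy as np


def ensemble_uncertainty(probs, eps=1e-12):
    """Estimate uncertainty measures from ensemble predictions.

    probs: array of shape (M, N, C) holding, for each of M ensemble
           members and N questions, a probability vector over C answers.
           (Layout is an assumption made for this sketch.)
    Returns three arrays of shape (N,): total predictive entropy,
    mean per-member entropy, and mutual information, all in nats.
    """
    probs = np.asarray(probs, dtype=float)

    # Ensemble-mean predictive distribution, shape (N, C).
    mean_probs = probs.mean(axis=0)

    # Total uncertainty: entropy of the ensemble-mean distribution.
    total = -np.sum(mean_probs * np.log(mean_probs + eps), axis=-1)

    # Aleatoric part: entropy of each member, averaged over members.
    member_entropy = -np.sum(probs * np.log(probs + eps), axis=-1)  # (M, N)
    aleatoric = member_entropy.mean(axis=0)

    # Epistemic part: mutual information between parameters and prediction.
    mutual_info = total - aleatoric
    return total, aleatoric, mutual_info


# Two members, two questions, two answer options.
# Question 0: both members are unsure in the same way (little disagreement).
# Question 1: each member is confident, but they disagree with each other.
example = np.array([
    [[0.5, 0.5], [1.0, 0.0]],
    [[0.5, 0.5], [0.0, 1.0]],
])
total, aleatoric, mutual_info = ensemble_uncertainty(example)
```

In this toy input both questions have the same total entropy, but only the second has high mutual information, which is the kind of separation between aleatoric and epistemic uncertainty that the summary describes.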

Paper Structure

This paper contains 30 sections, 5 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Performance of LoRA ensembles trained on CQA and evaluated on the CQA, MMLU STEM, and MMLU SS datasets. Metrics (computed over the ensemble mean distribution) include accuracy, negative log-likelihood (NLL) loss, and expected calibration error (ECE). We show ensemble sizes $M \in \{1,5,10,20\}$; for $M=1,5,10$, results are averaged over 20, 4, and 2 independent realizations, respectively, while $M=20$ uses a single run. Metric values at epoch 1 (underfitting), epoch 3 (optimal), and epoch 6 (overfitting) are highlighted.
  • Figure 2: In-distribution evolution of uncertainty during fine-tuning. Histograms of predictive entropy and mutual information for a LoRA ensemble trained and evaluated on the CQA dataset. The ensemble consists of $M=5$ members, with a LoRA rank of 8, $\alpha=32$, and an L2 LoRA loss of 1. This figure illustrates the evolution of uncertainty measures for in-domain data across training epochs (columns), differentiated by correct (top row) and incorrect (bottom row) predictions. We also depict the corresponding mean (red) and median (green) entropy and mutual information values. Color indicates the number of samples in each bin.
  • Figure 3: Out-of-distribution evolution of uncertainty during fine-tuning. Histograms of predictive entropy (Entropy) and mutual information (MI) for a LoRA ensemble trained on CQA and evaluated on the MMLU STEM (top rows) and MMLU SS (bottom rows) datasets. The ensemble consists of $M=5$ members, with a LoRA rank of 8, $\alpha=32$, and an L2 LoRA loss of 1. These panels illustrate the evolution of uncertainty measures for out-of-training-domain data across training epochs (columns, left to right), differentiated by correct (first and third rows) and incorrect (second and fourth rows) predictions. We also depict the corresponding mean (red) and median (green) entropy and mutual information values. Color indicates the number of samples in each bin.
  • Figure 4: Late overfitting regime (epoch 10). Histograms of predictive entropy (Entropy) and mutual information (MI) for a LoRA ensemble trained on CQA and evaluated on the CQA (left column), MMLU SS (middle column), and MMLU STEM (right column) datasets. The ensemble consists of $M=5$ members, with a LoRA rank of 8, $\alpha=32$, and an L2 LoRA loss of 1. A significant fraction of MMLU STEM samples (lower right panel) still shows low mutual information and high total uncertainty.
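
The Figure 1 metrics (accuracy, NLL, and ECE computed over the ensemble mean distribution) can be sketched as follows. This is an illustrative example under stated assumptions rather than the paper's evaluation code: the caption does not give the ECE binning scheme, so the sketch assumes 10 equal-width confidence bins, and the function name `ensemble_metrics`, the `(M, N, C)` array layout, and the `eps` constant are hypothetical.

```python
import numpy as np


def ensemble_metrics(probs, labels, n_bins=10, eps=1e-12):
    """Accuracy, NLL and ECE of the ensemble-mean prediction.

    probs:  array of shape (M, N, C) with per-member answer probabilities
            (layout assumed for this sketch).
    labels: integer array of shape (N,) with the correct answer indices.
    n_bins: number of equal-width confidence bins for ECE (an assumption;
            the source does not state the binning used).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)

    # Average the members' distributions before scoring.
    mean_probs = probs.mean(axis=0)                      # (N, C)
    preds = mean_probs.argmax(axis=-1)
    confidence = mean_probs.max(axis=-1)
    correct = (preds == labels).astype(float)

    accuracy = correct.mean()

    # Negative log-likelihood of the true answer under the mean distribution.
    true_probs = mean_probs[np.arange(len(labels)), labels]
    nll = -np.mean(np.log(true_probs + eps))

    # Expected calibration error: weighted gap between accuracy and
    # confidence within each confidence bin (lower, upper].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(confidence, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidence[mask].mean())
            ece += mask.mean() * gap

    return accuracy, nll, ece


# Toy usage: two members, two questions, two answer options.
toy_probs = np.array([
    [[0.9, 0.1], [0.9, 0.1]],
    [[0.7, 0.3], [0.7, 0.3]],
])
toy_labels = np.array([0, 1])
accuracy, nll, ece = ensemble_metrics(toy_probs, toy_labels)
```

Averaging the probabilities first and scoring afterwards is what "computed over the ensemble mean distribution" suggests; scoring each member separately and averaging the metrics would generally give different numbers.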