Learning Conformal Abstention Policies for Adaptive Risk Management in Large Language and Vision-Language Models

Sina Tayebati, Divake Kumar, Nastaran Darabi, Dinithi Jayasuriya, Ranganath Krishnan, Amit Ranjan Trivedi

TL;DR

The paper addresses the rigidity of static conformal prediction thresholds in risk-sensitive LLM/VLM applications. It introduces Learnable Conformal Abstention (CAP), which couples reinforcement learning with conformal prediction to adapt two thresholds and produce single-label, set-valued, or abstention outputs, all while maintaining conformal coverage guarantees. Across ten MCQA benchmarks and diverse model families, CAP yields higher accuracy, better hallucination-detection reliability (AUROC), improved uncertainty-guided generation (AUARC), and significantly reduced calibration error (ECE), while preserving at least 90% coverage and producing more informative prediction sets. This approach offers a principled, scalable framework for robust decision-making in safety-critical foundation models, with code and results enabling reproducibility and further exploration in risk-management workflows.
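
The three output types mentioned above (single label, set, abstention) can be illustrated with a small two-threshold rule. This is only a hypothetical sketch: the names `decide`, `tau_single`, and `tau_set` are assumptions introduced here, the rule itself is not taken from the paper, and the reinforcement-learning policy that would choose the thresholds is not shown. It makes no coverage claim on its own.

```python
def decide(probs, tau_single, tau_set):
    """Hypothetical three-way rule (not the paper's): expects tau_single >= tau_set.

    probs: list of per-option probabilities for one multiple-choice question.
    Returns a (kind, labels) pair where kind is "single", "set", or "abstain".
    """
    # Option indices ordered from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda k: probs[k], reverse=True)
    top = ranked[0]
    if probs[top] >= tau_single:
        # Confident enough to commit to one answer.
        return ("single", [top])
    # Otherwise keep every option that clears the looser threshold.
    candidates = [k for k in ranked if probs[k] >= tau_set]
    if candidates:
        return ("set", candidates)
    # No option is plausible enough: decline to answer.
    return ("abstain", [])


if __name__ == "__main__":
    print(decide([0.90, 0.05, 0.05], tau_single=0.8, tau_set=0.3))
    print(decide([0.45, 0.40, 0.15], tau_single=0.8, tau_set=0.3))
    print(decide([0.25, 0.25, 0.25, 0.25], tau_single=0.8, tau_set=0.3))
```

In an adaptive scheme of the kind the summary describes, the two thresholds would be selected by a learned policy rather than fixed by hand as they are here.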

Abstract

Large Language and Vision-Language Models (LLMs/VLMs) are increasingly used in safety-critical applications, yet their opaque decision-making complicates risk assessment and reliability. Uncertainty quantification (UQ) helps assess prediction confidence and enables abstention when uncertainty is high. Conformal prediction (CP), a leading UQ method, provides statistical guarantees but relies on static thresholds, which fail to adapt to task complexity and evolving data distributions, leading to suboptimal trade-offs in accuracy, coverage, and informativeness. To address this, we propose learnable conformal abstention, integrating reinforcement learning (RL) with CP to optimize abstention thresholds dynamically. By treating CP thresholds as adaptive actions, our approach balances multiple objectives, minimizing prediction set size while maintaining reliable coverage. Extensive evaluations across diverse LLM/VLM benchmarks show our method outperforms Least Ambiguous Classifiers (LAC) and Adaptive Prediction Sets (APS), improving accuracy by up to 3.2%, boosting AUROC for hallucination detection by 22.19%, enhancing uncertainty-guided selective generation (AUARC) by 21.17%, and reducing calibration error by 70%-85%. These improvements hold across multiple models and datasets while consistently meeting the 90% coverage target, establishing our approach as a more effective and flexible solution for reliable decision-making in safety-critical applications. The code is available at: https://github.com/sinatayebati/vlm-uncertainty
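
For readers unfamiliar with the calibration error reported above, the following is a minimal sketch of the standard equal-width-bin Expected Calibration Error (ECE). The function name, the choice of 10 bins, and the toy inputs are assumptions for illustration; the paper's exact binning and confidence definition may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |avg confidence - accuracy| per bin.

    confidences: predicted confidence in [0, 1] for each example.
    correct: 1 if the prediction was right, 0 otherwise.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Toy data: two confident correct answers, two uncertain answers (one wrong).
    print(expected_calibration_error([0.95, 0.95, 0.55, 0.55], [1, 1, 1, 0]))
```

A lower value means that stated confidence tracks empirical accuracy more closely, which is the sense in which the abstract reports a 70%-85% reduction.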

Paper Structure

This paper contains 19 sections, 1 theorem, 23 equations, 13 figures, 12 tables, 1 algorithm.

Key Result

Theorem 1

Let $\{(X_i, Y_i)\}_{i=1}^{n+1}$ be i.i.d. samples from an unknown distribution, partitioned into: Suppose a nonconformity score function $s(\cdot,\cdot)$ assigns a real-valued score $s(X_i, Y_i)$ to each calibration sample, capturing how "atypical" or "nonconforming" the pair $(X_i, Y_i)$ appears relative to a prediction model. Denoting let $\hat{q}$ be the $(1-\alpha)$-quantile of these calibra
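
The theorem statement is truncated here, so the sketch below falls back on the standard split conformal recipe rather than the paper's exact formulation. It assumes the usual finite-sample rank $\lceil (n+1)(1-\alpha) \rceil$ for $\hat{q}$ and the LAC-style score $s(x, y) = 1 - p(y \mid x)$; the function names and toy numbers are illustrative only.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n+1)(1-alpha))-th smallest calibration score.

    Standard split conformal threshold (an assumption here, since the
    theorem text above is truncated).
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: no finite threshold
        # exists, so every label must be included.
        return math.inf
    return sorted(scores)[rank - 1]


def lac_score(probs, label):
    """LAC-style nonconformity: one minus the probability of the given label."""
    return 1.0 - probs[label]


def prediction_set(probs, q_hat):
    """Keep every label whose nonconformity score is at most q_hat."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= q_hat]


if __name__ == "__main__":
    # Nine toy calibration scores; with alpha = 0.1 the rank is 9.
    cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    q_hat = conformal_quantile(cal_scores, alpha=0.1)
    print(q_hat)
    print(prediction_set([0.7, 0.2, 0.1], q_hat=0.85))
```

Under exchangeability of calibration and test points, sets built this way contain the true label with probability at least $1-\alpha$, which is the kind of guarantee Theorem 1 is named for.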

Figures (13)

  • Figure 1: Accuracy vs. Expected Calibration Error (ECE) comparison of CAP, APS, and LAC across various VLMs and five datasets: MMBench, ScienceQA, OODCV, SEEDBench, and AI2D. An ideal model has high accuracy and low ECE (upper-left). CAP shows significant ECE improvement over baselines. See the appendix for the complete list of figures.
  • Figure 2: Accuracy vs. Expected Calibration Error (ECE) comparison of CAP, APS, and LAC across different LLMs and five datasets: CosmosQA, HaluDial, HaluSum, HellaSwag, and MMLU. An ideal model has high accuracy and low ECE (upper-left of the plot), indicating accurate predictions with well-calibrated uncertainty quantification. CAP shows significant ECE improvement over the baseline methods. See the appendix for the complete list of figures.
  • Figure 3: Performance comparison of CAP (Ours), APS, and LAC on Llava-v1.6-34B (VLM) and Yi-34B (LLM) across four metrics: (i) accuracy, (ii) set size, (iii) AUROC, and (iv) AUARC. Each figure shows model performance across ten benchmark datasets, illustrating the impact of the conformal method on uncertainty metrics.
  • Figure 4: Accuracy vs. Expected Calibration Error (ECE) comparison of CAP, APS, and LAC across various VLMs and five datasets: MMBench, ScienceQA, OODCV, SEEDBench, and AI2D. An ideal model has high accuracy and low ECE (upper-left). CAP shows significant ECE improvement over baselines.
  • Figure 5: Accuracy vs. Expected Calibration Error (ECE) comparison of CAP, APS, and LAC across different VLMs and five datasets: MMBench, ScienceQA, OODCV, SEEDBench, and AI2D. An ideal model has high accuracy and low ECE (upper-left of the plot), indicating accurate predictions with well-calibrated uncertainty quantification. CAP shows significant ECE improvement over the baseline methods.
  • ...and 8 more figures

Theorems & Definitions (2)

  • Theorem 1: Conformal Coverage Guarantee
  • proof