Exploring the Potential of Bilevel Optimization for Calibrating Neural Networks

Gabriele Sanguin; Arjun Pakrashi; Marco Viola; Francesco Rinaldi

TL;DR

Neural networks often produce overconfident predictions, which harms decision-making in critical domains. The paper introduces BO4SC, a bilevel optimization framework that jointly learns predictions and calibrated confidence for a dual-output network: the inner level minimizes a weighted cross-entropy objective, the outer level minimizes a binary cross-entropy (BCE) calibration objective, and the outer problem is optimized via hypergradients. On toy datasets (Blobs, Spirals) and the simulated Blood Alcohol Concentration (BAC) dataset, BO4SC achieves lower expected calibration error (ECE) while maintaining or improving accuracy, and it reveals interpretable weight dynamics that downweight ambiguous samples. This integrated self-calibration approach reduces the need for post-hoc calibration and offers a practical path toward more reliable uncertainty estimates in neural classifiers.
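The two-level structure described above can be written as a generic bilevel program. The formulation below is only a sketch of one plausible reading of this summary, not the paper's stated equations: the symbols $w$ (per-sample weights as outer variables), $\theta$ (network parameters), $f_\theta$ (class-probability head), $g_\theta$ (confidence head), $c_j$ (correctness indicator), and the use of a separate outer sample of size $m$ are all assumptions introduced here for illustration.

```latex
% Outer level (assumed): BCE between predicted confidence and correctness
\min_{w}\; F\bigl(w, \theta^{*}(w)\bigr)
  = -\frac{1}{m}\sum_{j=1}^{m}\Bigl[
      c_j \log g_{\theta^{*}(w)}(x_j)
      + (1 - c_j)\log\bigl(1 - g_{\theta^{*}(w)}(x_j)\bigr)\Bigr],
\qquad
c_j = \mathbb{1}\bigl[\arg\max_k f_{\theta^{*}(w)}(x_j)_k = y_j\bigr]

% Inner level (assumed): weighted cross-entropy over n training samples
\text{s.t.}\quad
\theta^{*}(w) \in \arg\min_{\theta}\; L(w, \theta)
  = -\frac{1}{n}\sum_{i=1}^{n} w_i \log f_{\theta}(x_i)_{y_i}

% Hypergradient via the implicit function theorem
% (valid when \nabla^2_{\theta\theta} L is invertible at \theta^*(w))
\frac{\mathrm{d}F}{\mathrm{d}w}
  = \nabla_{w} F
  - \nabla^{2}_{w\theta} L \,
    \bigl[\nabla^{2}_{\theta\theta} L\bigr]^{-1}
    \nabla_{\theta} F
```

Under this reading, a sample whose weight $w_i$ shrinks contributes less to the inner loss, which is consistent with the reported downweighting of ambiguous samples. In practice the inverse Hessian is usually approximated rather than formed explicitly.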

Abstract

Handling uncertainty is critical for ensuring reliable decision-making in intelligent systems. Modern neural networks are known to be poorly calibrated, so their predicted confidence scores are difficult to use directly. This article explores how bilevel optimization, a framework for solving hierarchical problems with interdependent optimization levels, can improve confidence estimation and calibration. It introduces a self-calibrating bilevel neural-network training approach that improves a model's predicted confidence scores. The effectiveness of the proposed framework is analyzed on toy datasets, such as Blobs and Spirals, as well as on more practical simulated datasets, such as Blood Alcohol Concentration (BAC). The approach is compared with isotonic regression, a well-known and widely used calibration strategy. The reported experimental results show that the proposed bilevel optimization approach reduces the calibration error while preserving accuracy.
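To make the notion of calibration error concrete, the following is a minimal sketch of the commonly used equal-width binned expected calibration error (ECE). The function name, the default of 10 bins, and the left-closed binning convention are assumptions chosen for illustration; the paper's exact evaluation protocol may differ.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[int],
    n_bins: int = 10,
) -> float:
    """Equal-width binned ECE.

    Sums, over non-empty bins, (bin size / N) * |bin accuracy - bin mean confidence|.
    `confidences` are values in [0, 1]; `correct` holds 1 for a correct
    prediction and 0 otherwise.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one sample is required")

    conf_sum = [0.0] * n_bins  # sum of confidences per bin
    hit_sum = [0.0] * n_bins   # number of correct predictions per bin
    count = [0] * n_bins       # number of samples per bin

    for conf, hit in zip(confidences, correct):
        # Bin k covers [k / n_bins, (k + 1) / n_bins); confidence 1.0 goes to the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        conf_sum[idx] += conf
        hit_sum[idx] += hit
        count[idx] += 1

    ece = 0.0
    for k in range(n_bins):
        if count[k] == 0:
            continue  # empty bins contribute nothing
        accuracy = hit_sum[k] / count[k]
        mean_conf = conf_sum[k] / count[k]
        ece += (count[k] / n) * abs(accuracy - mean_conf)
    return ece
```

For example, four predictions each made with confidence 0.9, of which three are correct, fall into a single bin with accuracy 0.75 and mean confidence 0.9, giving an ECE of about 0.15. A lower value means that stated confidence tracks observed accuracy more closely.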
