Calibration of Continual Learning Models

Lanpei Li, Elia Piccoli, Andrea Cossu, Davide Bacciu, Vincenzo Lomonaco

TL;DR

The paper tackles calibration in continual learning (CL), where non-stationary data streams cause forgetting and unreliable confidence estimates. It evaluates post-processing and self-calibration methods and introduces Replayed Calibration (RC), which replays validation data from previous experiences during calibration to maintain reliability across experiences. Experiments across four benchmarks (Split MNIST, Split CIFAR100, EuroSAT, Atari) and multiple CL strategies show that RC improves calibration by large margins and often supports better accuracy, with additional gains when combined with DER++. The work argues that calibrated CL models are more trustworthy in real-world deployments and outlines future directions such as self-calibration and extensions to reinforcement learning and natural language processing.
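
The idea of replaying validation data during calibration can be illustrated with a small sketch. The summary does not say how RC selects or sizes the replayed data, so what follows is only one plausible reading, not the authors' implementation. Temperature scaling stands in for any post-processing calibrator, and the names `ReplayedCalibrator`, `per_experience`, `toy_model`, and the grid search over temperatures are assumptions made for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of one logit vector scaled by 1/temperature."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mean_nll(scored, temperature):
    """Mean negative log-likelihood over (logits, label) pairs."""
    losses = [-math.log(softmax(logits, temperature)[label] + 1e-12)
              for logits, label in scored]
    return sum(losses) / len(losses)


def fit_temperature(scored, low=0.05, high=20.0, steps=200):
    """Return the temperature on a log-spaced grid with the lowest NLL.

    Gradient-based optimizers are the usual choice; a grid keeps this
    sketch dependency-free.
    """
    grid = [low * (high / low) ** (i / (steps - 1)) for i in range(steps)]
    return min(grid, key=lambda t: mean_nll(scored, t))


class ReplayedCalibrator:
    """Hypothetical helper (assumption): keeps past validation samples for replay."""

    def __init__(self, per_experience=5, seed=0):
        self.per_experience = per_experience
        self.buffer = []  # (input, label) pairs from past validation sets
        self.temperature = 1.0
        self._rng = random.Random(seed)

    def calibrate(self, model, current_val):
        """Fit the temperature at the end of an experience.

        `model` maps an input to a list of logits; `current_val` is a list
        of (input, label) pairs from the current experience.
        """
        # Re-score current and replayed inputs with the current model,
        # because the model's logits change after every experience.
        scored = [(model(x), y) for x, y in current_val + self.buffer]
        self.temperature = fit_temperature(scored)
        # Store a random subset of this experience's validation data for replay.
        k = min(self.per_experience, len(current_val))
        self.buffer.extend(self._rng.sample(current_val, k))
        return self.temperature


# Toy binary model that is always very confident (logit gap of 5).
def toy_model(x):
    return [5.0, 0.0] if x == 0 else [0.0, 5.0]


# Validation data on which the toy model is right only 70% of the time.
val = [(0, 0)] * 7 + [(0, 1)] * 3 + [(1, 1)] * 7 + [(1, 0)] * 3
calibrator = ReplayedCalibrator(per_experience=5)
t_first = calibrator.calibrate(toy_model, val)
t_second = calibrator.calibrate(toy_model, val)  # also uses 5 replayed samples
```

In this toy run the fitted temperature is larger than 1, which softens the overconfident predictions. Without the buffer, each calibration step would see only the current experience's validation data.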

Abstract

Continual Learning (CL) focuses on maximizing the predictive performance of a model across a non-stationary stream of data. Unfortunately, CL models tend to forget previous knowledge, thus often underperforming when compared with an offline model trained jointly on the entire data stream. Given that any CL model will eventually make mistakes, it is of crucial importance to build calibrated CL models: models that can reliably tell their confidence when making a prediction. Model calibration is an active research topic in machine learning, yet to be properly investigated in CL. We provide the first empirical study of the behavior of calibration approaches in CL, showing that CL strategies do not inherently learn calibrated models. To mitigate this issue, we design a continual calibration approach that improves the performance of post-processing calibration methods over a wide range of different benchmarks and CL strategies. CL does not necessarily need perfect predictive models, but rather it can benefit from reliable predictive models. We believe our study on continual calibration represents a first step towards this direction.
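
For readers unfamiliar with the term, "reliably tell their confidence" is usually formalized as follows. This is the standard formulation from the calibration literature; the symbols $\hat{y}$, $\hat{p}$, $B_m$, $M$ and $n$ are assumptions here, and the paper's own definitions may use different notation. A model with prediction $\hat{y}$ and confidence $\hat{p}$ is perfectly calibrated when

$$\mathbb{P}\left(\hat{y} = y \mid \hat{p} = p\right) = p \quad \text{for all } p \in [0, 1].$$

A common empirical measure of the deviation from this condition is the Expected Calibration Error, computed by grouping $n$ predictions into $M$ confidence bins $B_1, \dots, B_M$:

$$\mathrm{ECE} = \sum_{m=1}^{M} \frac{|B_m|}{n} \left| \mathrm{acc}(B_m) - \mathrm{conf}(B_m) \right|,$$

where $\mathrm{acc}(B_m)$ is the fraction of correct predictions in bin $B_m$ and $\mathrm{conf}(B_m)$ is the mean confidence in that bin.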

Paper Structure (23 sections, 24 figures, 5 tables)

This paper contains 23 sections, 24 figures, 5 tables.

Figures (24)

  • Figure 1: A CL model $f_{\theta}$ is trained on a sequence of $k$ experiences (or tasks). The model accuracy on the class "cat" decreases over time. Its confidence decreases much faster. Therefore, the model becomes less calibrated over the course of its learning phase. A calibrated CL model, which is the objective of this paper, should output a confidence which is equal to the average accuracy. A calibrated model knows what to expect, on average, as a result of its predictions.
  • Figure 2: Continual calibration is performed on a stream of experiences (top) by applying either self-calibration (bottom left) or post-processing calibration (bottom right). Self-calibration approaches like Entropy Regularization (HR) regularize the training loss at each minibatch. Post-processing calibration methods like Temperature Scaling (TS) and Matrix/Vector Scaling (MS/VS) are applied only at the end of each experience. Our Replayed Calibration approach is applicable alongside any post-processing method.
  • Figure 3: Accuracy of Naive on Split MNIST.
  • Figure 4: Calibration diagram for Naive on Split CIFAR100.
  • Figure 5: Calibration diagram for Replay on Atari.
  • ...and 19 more figures
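
The statement in Figure 1 that a calibrated model's confidence should equal its average accuracy, and the calibration diagrams listed above, can be made concrete with a short sketch. It bins predictions by confidence and compares mean confidence with accuracy per bin, which is the usual way such diagrams and the Expected Calibration Error are computed. The function names, the equal-width binning, and the choice of 10 bins are assumptions, not details taken from the paper.

```python
def reliability_bins(confidences, correct, n_bins=10):
    """Group predictions into equal-width confidence bins.

    Returns (mean_confidence, accuracy, count) for each non-empty bin.
    Bins are half-open [a, b), except that confidence 1.0 goes in the last bin.
    """
    stats = [[0.0, 0, 0] for _ in range(n_bins)]  # conf sum, n correct, n total
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        stats[idx][0] += conf
        stats[idx][1] += int(ok)
        stats[idx][2] += 1
    return [(s / n, c / n, n) for s, c, n in stats if n > 0]


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between accuracy and confidence over the bins."""
    bins = reliability_bins(confidences, correct, n_bins)
    total = sum(n for _, _, n in bins)
    return sum(n / total * abs(acc - conf) for conf, acc, n in bins)


# Calibrated toy case: 90% confidence and 9 out of 10 predictions correct.
ece_calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])

# Overconfident toy case: 90% confidence but only 5 out of 10 correct.
ece_overconfident = expected_calibration_error([0.9] * 10,
                                               [True] * 5 + [False] * 5)
```

In the first toy case the gap is zero up to floating-point error; in the second it is 0.4, the difference between the stated confidence (0.9) and the observed accuracy (0.5). Plotting the per-bin accuracy against the per-bin mean confidence gives a calibration diagram.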

Theorems & Definitions (3)

  • Definition 1
  • Definition 2
  • Definition 3