Efficient Bayesian Updates for Deep Learning via Laplace Approximations

Denis Huseljic, Marek Herde, Lukas Rauch, Paul Hahn, Zhixin Huang, Daniel Kottke, Stephan Vogt, Bernhard Sick

TL;DR

The paper tackles the challenge of updating deep neural networks with new data, drawn from a stationary distribution, without full retraining. It introduces a Bayesian update based on a last-layer Laplace approximation, which yields a Gaussian posterior and allows the inverse Hessian to be updated in closed form for fast online updates. Experiments across image and text tasks show that the update is competitive with retraining while offering substantial speedups, and that it improves deep active learning strategies. The approach is compatible with modern uncertainty-aware models like SNGP and points to broader use in online and resource-constrained settings.

Abstract

Since training deep neural networks takes significant computational resources, extending the training dataset with new data is difficult, as it typically requires complete retraining. Moreover, specific applications do not allow costly retraining due to time or computational constraints. We address this issue by proposing a novel Bayesian update method for deep neural networks based on a last-layer Laplace approximation. Concretely, we leverage second-order optimization techniques on the Gaussian posterior distribution of a Laplace approximation, computing the inverse Hessian matrix in closed form. This way, our method allows for fast and effective updates upon the arrival of new data in a stationary setting. A large-scale evaluation study across different data modalities confirms that our updates are a fast and competitive alternative to costly retraining. Furthermore, we demonstrate the applicability of our update in a deep active learning scenario by using it to improve existing selection strategies.
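
To make the idea of a closed-form inverse-Hessian update concrete, the following is a minimal sketch and not the authors' implementation. It assumes a binary logistic last layer on fixed features, a Gaussian posterior $\mathcal{N}(\mu, \Sigma)$ over the last-layer weights, and a single new instance per step. The function name `laplace_update` and the choice of the Sherman-Morrison identity for the rank-one covariance update are illustrative assumptions; the paper's exact update rule (for example, its use of $\gamma$) may differ.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def laplace_update(mu, cov, x, y):
    """One rank-one update of a Gaussian N(mu, cov) over last-layer weights.

    Hypothetical helper for illustration: x is a feature vector, y is 0 or 1.
    """
    p = sigmoid(x @ mu)
    h = p * (1.0 - p)  # curvature of the logistic loss at the current mean
    cov_x = cov @ x
    # Sherman-Morrison: equals inv(inv(cov) + h * x x^T) without any inversion.
    new_cov = cov - np.outer(cov_x, cov_x) * (h / (1.0 + h * (x @ cov_x)))
    grad = (p - y) * x  # gradient of the negative log-likelihood w.r.t. the weights
    new_mu = mu - new_cov @ grad  # Newton-style step using the updated inverse Hessian
    return new_mu, new_cov


# Usage: start from a standard-normal posterior and observe one positive instance.
x_new = np.array([1.0, 2.0])
mu1, cov1 = laplace_update(np.zeros(2), np.eye(2), x_new, 1.0)
```

Each step costs $O(d^2)$ in the number of last-layer weights $d$, because it only needs matrix-vector products and an outer product. Avoiding the explicit matrix inversion is what makes such an update cheap compared with retraining.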

Paper Structure (18 sections, 22 equations, 10 figures, 1 table, 1 algorithm)

This paper contains 18 sections, 22 equations, 10 figures, 1 table, 1 algorithm.

Figures (10)

  • Figure 1: Comparison of different BNNs on the two moons dataset. (a) The original model resulting from training on the original dataset. (b) The typical MC-based update applied to the original model [kirsch2022marginal, tan2021BEMPS]. (c) Our update applied to the original model. (d) The model resulting from retraining on both the original and the new dataset.
  • Figure 2: The left plot shows the predicted probabilities of the positive class for each hypothesis (colored lines) drawn from a BNN as well as the mean (black solid line) and standard deviation (black dashed line) of its predictive distribution. The right plot shows updated weights for each hypothesis and the predictive distribution after observing additional instances (green).
  • Figure 3: Accuracies after updating with different values for $\gamma$ in comparison to the baseline DNN and retraining.
  • Figure 4: Accuracy improvement curves for six benchmark datasets, showing the difference in accuracy between retrained and updated DNNs for varying sizes of $\mathcal{D}$.
  • Figure 5: Accuracy curves for three benchmark datasets after updating and retraining DNNs for varying sizes of $\mathcal{D}^{\oplus}$.
  • ...and 5 more figures