Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, José Miguel Hernández-Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

TL;DR

The paper argues that Bayesian deep learning (BDL) is essential in the age of large-scale AI because it provides principled uncertainty quantification, data efficiency, and adaptability to evolving domains, all of which are crucial for the safe and reliable deployment of foundation models. It surveys the strengths and current challenges of BDL (posterior inference, priors, scalability, and foundation-model integration) and outlines concrete future directions, including novel posterior samplers, hybrid Bayesian approaches, deep kernel methods, semi- and self-supervised Bayesian learning, probabilistic numerics, and compression. By linking these methodological advances to practical needs in uncertainty-aware decision-making, the authors advocate integrating BDL with large models to unlock robust, trustworthy AI across domains. The discussion emphasizes the potential of BDL to enhance reliability and interpretability, and calls for scalable tooling, benchmarks, and application-driven development, particularly for foundation-model workflows.

Abstract

In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learning (BDL) constitutes a promising avenue, offering advantages across these diverse settings. This paper posits that BDL can elevate the capabilities of deep learning. It revisits the strengths of BDL, acknowledges existing challenges, and highlights some exciting research avenues aimed at addressing these obstacles. Looking ahead, the discussion focuses on possible ways to combine large-scale foundation models with BDL to unlock their full potential.

Paper Structure

This paper contains 33 sections, 39 equations, and 2 figures.

Figures (2)

  • Figure 1: Popular LLM chat assistants, such as Bing Chat (using GPT-4) and LLAMA-2-70B, often produce wrong answers with very high confidence, indicating that their confidence is not calibrated. BDL has traditionally been used to overcome this kind of overconfidence, yet it is underutilized in the LLM era. Note that OS(=O)(=O)O is a textual (SMILES) representation of the well-known molecule H$_2$SO$_4$ and can easily be looked up on Wikipedia. Emphasis and ellipsis ours. Accessed on 2024-01-23.
  • Figure 2: Different BDL methods for approximating a posterior $p(\theta \mid \mathcal{D})$ on a parameter space $\Theta$. While Laplace and Gaussian-based variational approaches yield Gaussian approximations, they generally capture different local modes of the posterior. Ensemble methods use maximum a posteriori estimates as their samples.
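
The contrast drawn in the Figure 2 caption can be illustrated with a minimal sketch. It is not the authors' code. The bimodal one-dimensional log posterior, the step size, the starting points, and all function names are hypothetical choices made only for illustration. A Laplace approximation fits a Gaussian at one local mode (mean at the mode, variance from the negative inverse curvature), while a deep ensemble simply collects MAP estimates reached from different initializations.

```python
import math


def log_post(theta):
    """Unnormalized log posterior of a hypothetical bimodal toy model."""
    left = 0.7 * math.exp(-0.5 * ((theta + 2.0) / 0.5) ** 2) / 0.5
    right = 0.3 * math.exp(-0.5 * ((theta - 2.0) / 1.0) ** 2) / 1.0
    return math.log(left + right)


def grad(f, x, h=1e-4):
    """Central finite-difference estimate of the first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def hess(f, x, h=1e-3):
    """Central finite-difference estimate of the second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2


def find_map(theta0, lr=0.05, steps=2000):
    """Plain gradient ascent to a local maximum of log_post (a local MAP)."""
    theta = theta0
    for _ in range(steps):
        theta += lr * grad(log_post, theta)
    return theta


def laplace(theta0):
    """Gaussian approximation at the local mode reached from theta0.

    Returns (mean, variance), with variance = -1 / (second derivative
    of the log posterior at the mode).
    """
    mode = find_map(theta0)
    return mode, -1.0 / hess(log_post, mode)


# Two Laplace approximations, each capturing a different local mode.
left_fit = laplace(-3.0)
right_fit = laplace(3.0)

# An "ensemble": MAP estimates from several (assumed) initializations.
ensemble = [find_map(t0) for t0 in (-3.0, -1.5, 1.0, 3.0)]

print("Laplace (left mode):  mean=%.3f var=%.3f" % left_fit)
print("Laplace (right mode): mean=%.3f var=%.3f" % right_fit)
print("Ensemble members:", [round(m, 3) for m in ensemble])
```

In this toy setting each Laplace fit describes only the mode it was started near, whereas the ensemble members spread across both modes but carry no per-mode variance. This is the qualitative difference the caption points to. A Gaussian variational approximation would need an additional optimization over mean and variance and is omitted here.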