Epistemic Neural Networks

Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

TL;DR

The paper addresses reliable decision-making under uncertainty by evaluating joint predictions across multiple inputs rather than marginal predictions for each input alone. It introduces epistemic neural networks (ENNs) as a general interface for models that express epistemic uncertainty, and presents the epinet, a lightweight architecture that supplements any base neural network to produce high-quality joint predictions with far less computation than large ensembles. The theoretical results show that ENNs subsume BNNs while allowing richer uncertainty representations. Experiments on the Neural Testbed and ImageNet show that the epinet improves joint log-loss without sacrificing marginal performance, even when the base network is a pretrained model. The approach offers a practical route to uncertainty-aware models at scale, and public code is available to support adoption and evaluation.

Abstract

Intelligence relies on an agent's knowledge of what it does not know. This capability can be assessed based on the quality of joint predictions of labels across multiple inputs. In principle, ensemble-based approaches produce effective joint predictions, but the computational costs of training large ensembles can become prohibitive. We introduce the epinet: an architecture that can supplement any conventional neural network, including large pretrained models, and can be trained with modest incremental computation to estimate uncertainty. With an epinet, conventional neural networks outperform very large ensembles, consisting of hundreds or more particles, with orders of magnitude less computation. The epinet does not fit the traditional framework of Bayesian neural networks. To accommodate development of approaches beyond BNNs, such as the epinet, we introduce the epistemic neural network (ENN) as an interface for models that produce joint predictions.
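
The two objects named in the abstract can be written compactly. The block below is a sketch in notation assumed here, since this summary states no equations: $f_\theta$ is the ENN's network, $z$ is the epistemic index drawn from a reference distribution $P_Z$, $\mu_\zeta$ is the base network with features $\phi_\zeta$, $\sigma_\eta$ is the epinet, and $\operatorname{sg}[\cdot]$ denotes a stop-gradient.

```latex
% Joint prediction of an ENN over inputs x_1, ..., x_tau:
% average over the epistemic index z of the product of per-input class probabilities.
\hat{P}_{1:\tau}(y_{1:\tau})
  = \int_z P_Z(dz) \prod_{t=1}^{\tau}
    \operatorname{softmax}\big(f_\theta(x_t, z)\big)_{y_t}

% Epinet: a small additive network on top of a (possibly pretrained) base network.
f_\theta(x, z)
  = \mu_\zeta(x) + \sigma_\eta\big(\operatorname{sg}[\phi_\zeta(x)],\, z\big)
```

Under this reading, a conventional network is the special case in which the output ignores $z$, so its joint prediction is always the product of its marginals. The product sits inside the integral over $z$, which is what lets an ENN express dependence between labels.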

Paper Structure

This paper contains 30 sections, 10 theorems, 31 equations, 14 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

[informal] There exists a decision problem and an ENN that attains small expected marginal log loss such that actions generated using the ENN perform no better than random guessing.

Figures (14)

  • Figure 1: Conventional neural nets generate marginal predictions, which do not distinguish genuine ambiguity from insufficiency of data. Joint predictions can make this distinction.
  • Figure 2: Quality of marginal and joint predictions across models on ImageNet (see the ImageNet section).
  • Figure 3: An ENN can incorporate the epistemic index $z \sim P_Z$ into its joint predictions. This allows an ENN to differentiate inevitable ambiguity from data insufficiency.
  • Figure 4: Epinet network architecture.
  • Figure 5: Quality of marginal and joint predictions across models on the Neural Testbed.
  • ...and 9 more figures
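
The distinction drawn in the captions of Figures 1 and 3 can be illustrated with a toy calculation. The sketch below is not the paper's code. It assumes a uniform $P_Z$ over a finite set of epistemic indices, and the helper names `marginal_probs` and `joint_prob` are hypothetical. It builds two predictors with identical marginal predictions on two inputs: one reflects genuine ambiguity (every index agrees on 50/50), the other reflects insufficient data (each index is confident, but the indices disagree).

```python
import numpy as np

def marginal_probs(probs_per_z):
    """Per-input class probabilities, averaged over the epistemic index z."""
    return np.asarray(probs_per_z, dtype=float).mean(axis=0)

def joint_prob(probs_per_z, labels):
    """Joint probability of `labels`, one label per input.

    probs_per_z has shape [num_z, num_inputs, num_classes]; a uniform
    distribution over the num_z indices is assumed for illustration.
    """
    probs_per_z = np.asarray(probs_per_z, dtype=float)
    labels = np.asarray(labels)
    inputs = np.arange(probs_per_z.shape[1])
    # For each z, multiply the probabilities assigned to the given labels.
    per_z = probs_per_z[:, inputs, labels].prod(axis=1)
    # Average the per-index products over z.
    return float(per_z.mean())

# Genuine ambiguity: a single index, every input is a coin flip.
ambiguous = np.full((1, 2, 2), 0.5)

# Insufficient data: two indices, each certain, but they disagree.
uncertain = np.array([
    [[1.0, 0.0], [1.0, 0.0]],  # z = 0: both inputs are class 0
    [[0.0, 1.0], [0.0, 1.0]],  # z = 1: both inputs are class 1
])

print(marginal_probs(ambiguous))
print(marginal_probs(uncertain))
for labels in ([0, 0], [0, 1]):
    print(labels, joint_prob(ambiguous, labels), joint_prob(uncertain, labels))
```

Both predictors report a marginal probability of 0.5 for each class on each input, so marginal log-loss cannot tell them apart. Their joint predictions differ: the ambiguous predictor assigns 0.25 to every label pair, while the uncertain one assigns 0.5 to matching pairs and 0 to mixed pairs.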

Theorems & Definitions (14)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 1
  • proof
  • Theorem 2
  • Theorem 4
  • proof
  • Lemma 1
  • ...and 4 more