Sequential Bayesian Neural Subnetwork Ensembles

Sanket Jantre; Shrijita Bhattacharya; Nathan M. Urban; Byung-Jun Yoon; Tapabrata Maiti; Prasanna Balaprakash; Sandeep Madireddy

TL;DR

The paper addresses the high cost and limited flexibility of traditional deep ensembles by introducing SeBayS, a sequential ensembling framework for Bayesian neural networks that keeps the model sparse throughout training. It combines a single exploration phase with multiple exploitation phases to generate diverse Bayesian subnetworks in a single forward pass, using a prune-grow mechanism and a sparsity-aware variational inference (VI) objective. Empirically, SeBayS and its BayS variant outperform dense and sparse baselines in accuracy, calibration, out-of-distribution (OoD) detection, and adversarial robustness on CIFAR-10/100 with Wide ResNet-28-10, while reducing training and memory costs. The approach offers a scalable, robust ensemble technique for uncertainty estimation, with potential extensions to structured sparsity and energy-efficient uncertainty frameworks.

Abstract

Deep ensembles have emerged as a powerful technique for improving predictive performance and enhancing model robustness across various applications by leveraging model diversity. However, traditional deep ensemble methods are often computationally expensive and rely on deterministic models, which may limit their flexibility. Additionally, while sparse subnetworks of dense models have shown promise in matching the performance of their dense counterparts and even enhancing robustness, existing methods for inducing sparsity typically incur training costs comparable to those of training a single dense model, as they either gradually prune the network during training or apply thresholding post-training. In light of these challenges, we propose an approach for sequential ensembling of dynamic Bayesian neural subnetworks that consistently maintains reduced model complexity throughout the training process while generating diverse ensembles in a single forward pass. Our approach involves an initial exploration phase to identify high-performing regions within the parameter space, followed by multiple exploitation phases that take advantage of the compactness of the sparse model. These exploitation phases quickly converge to different minima in the energy landscape, corresponding to high-performing subnetworks that together form a diverse and robust ensemble. We empirically demonstrate that our proposed approach outperforms traditional dense and sparse deterministic and Bayesian ensemble models in terms of prediction accuracy, uncertainty estimation, out-of-distribution detection, and adversarial robustness.
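To make the idea of keeping model complexity reduced throughout training more concrete, the following is a minimal sketch of one prune-grow step on a binary weight mask. It is an illustration under stated assumptions, not the paper's criterion: the function name `prune_grow`, the `fraction` parameter, and the rule used here (drop the smallest-magnitude active weights, regrow the same number of inactive positions at random) are all hypothetical stand-ins. The only property it is meant to show is that the number of active weights stays constant across the update.

```python
import numpy as np

def prune_grow(weights, mask, fraction, rng):
    """Return a new boolean mask with the same number of active weights.

    Assumed rule (not taken from the paper): prune the smallest-magnitude
    active weights, then regrow the same number of positions chosen
    uniformly at random among those that were inactive before the step.
    """
    mask = np.asarray(mask, dtype=bool)
    flat_mask = mask.ravel().copy()
    active = np.flatnonzero(flat_mask)
    inactive = np.flatnonzero(~flat_mask)

    # Number of connections to swap, capped by the available inactive slots.
    k = min(int(fraction * active.size), inactive.size)
    if k == 0:
        return flat_mask.reshape(mask.shape)

    # Prune: the k active positions with the smallest absolute weight.
    order = np.argsort(np.abs(np.asarray(weights).ravel()[active]))
    pruned = active[order[:k]]

    # Grow: k previously inactive positions, sampled without replacement.
    grown = rng.choice(inactive, size=k, replace=False)

    flat_mask[pruned] = False
    flat_mask[grown] = True
    return flat_mask.reshape(mask.shape)

# Toy usage: an 8x8 layer with 16 of 64 weights active (75% sparsity).
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 8))
mask = (np.arange(64) < 16).reshape(8, 8)
new_mask = prune_grow(weights, mask, fraction=0.2, rng=rng)
print(int(mask.sum()), int(new_mask.sum()))
```

Because pruned positions come only from the active set and regrown positions only from the previously inactive set, the sparsity level before and after the step is identical, which is the property a fixed-budget sparse training scheme relies on.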

Paper Structure (19 sections, 12 equations, 3 figures, 6 tables, 1 algorithm)

Introduction
Related Work
Sequential Bayesian Neural Subnetwork Ensembles
Preliminaries
Bayesian Neural Subnetworks
Sequential Ensembling Strategy
Experimental Results
Ensemble Analysis
Function Space Analysis
Effect of Ensemble Size
Conclusion and Discussion
Prune-Grow Criterion
Reproducibility Considerations
Hyperparameters
Data Augmentation
...and 4 more sections

Figures (3)

Figure 1: Illustration of SeBayS ensemble: Our approach includes an exploration phase followed by multiple exploitation phases to create Bayesian subnetworks. SeBayS ensemble prediction is obtained by combining their predictions.
Figure 2: Training trajectories of base learners obtained by parallel and sequential ensembling of Bayesian subnetworks (BayS Ensemble and SeBayS Ensemble, respectively) with Wide ResNet-28-10 in the CIFAR-10 and CIFAR-100 experiments.
Figure 3: Performance of base learners and their ensembles as ensemble size M varies in CIFAR-10 experiment.
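The caption of Figure 1 says the ensemble prediction is obtained by combining the predictions of the individual subnetworks, and Figure 3 varies the ensemble size M. The combination rule is not stated on this page; a common choice, assumed here purely for illustration, is to average the members' softmax probabilities. The names `softmax` and `ensemble_predict` and the toy logits below are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def ensemble_predict(member_logits):
    """Average class probabilities over ensemble members.

    member_logits has shape (M, N, C): M members, N inputs, C classes.
    Assumption: members are combined by a plain mean of their softmax
    outputs; the source only says predictions are "combined".
    """
    probs = softmax(np.asarray(member_logits, dtype=float))
    mean_probs = probs.mean(axis=0)          # shape (N, C)
    labels = mean_probs.argmax(axis=-1)      # shape (N,)
    return mean_probs, labels

# Toy usage: M = 3 members, N = 2 inputs, C = 3 classes.
member_logits = np.array([
    [[5.0, 0.0, 0.0], [0.0, 0.0, 3.0]],
    [[4.0, 0.0, 0.0], [0.0, 0.0, 2.0]],
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
])
mean_probs, labels = ensemble_predict(member_logits)
print(mean_probs.shape, labels)
```

In the first toy input, one member disagrees with the other two; averaging probabilities lets the two confident members outweigh it while still lowering the ensemble's confidence, which is the kind of behaviour that makes ensembles useful for uncertainty estimation.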
