Compact Bayesian Neural Networks via pruned MCMC sampling

Ratneel Deo; Scott Sisson; Jody M. Webster; Rohitash Chandra

TL;DR

The paper tackles the computational burden of Bayesian neural networks (BNNs) by combining Langevin MCMC sampling with post-training pruning to produce compact BNNs that retain their uncertainty estimates. It introduces a pruning framework based on signal-to-noise and signal-plus-noise criteria, followed by a resampling step to recover performance lost to pruning, and validates the approach on benchmark regression and classification tasks and on reef-core lithology data. The key finding is that pruning followed by post-pruning resampling reduces network size by over 75% while retaining generalisation performance and uncertainty quantification, with convergence diagnostics supporting the results. The work advances efficient probabilistic modelling for real-world, resource-constrained settings and suggests extensions to CNNs, dynamic pruning, and knowledge distillation.

Abstract

Bayesian Neural Networks (BNNs) offer robust uncertainty quantification in model predictions, but training them presents a significant computational challenge. This is mainly due to the problem of sampling multimodal posterior distributions using Markov Chain Monte Carlo (MCMC) sampling and variational inference algorithms. Moreover, the number of model parameters grows rapidly with additional hidden layers, neurons, and features in the dataset. Typically, a significant portion of these densely connected parameters are redundant, and pruning a neural network not only improves portability but also has the potential for better generalisation. In this study, we address some of these challenges by combining MCMC sampling with network pruning to obtain compact probabilistic models with redundant parameters removed. We sample the posterior distribution of model parameters (weights and biases) and prune weights with low importance, resulting in a compact model. We ensure that the compact BNN retains its ability to estimate uncertainty via the posterior distribution while preserving training and generalisation accuracy through post-pruning resampling. We evaluate the effectiveness of our MCMC pruning strategy empirically on selected benchmark datasets for regression and classification problems. We also consider two coral reef drill-core lithology classification datasets to test the robustness of the pruning approach on complex real-world data. We further investigate whether refining the compact BNN can recover any lost performance. Our results demonstrate the feasibility of training and pruning BNNs using MCMC whilst retaining generalisation performance with over 75% reduction in network size. This paves the way for developing compact BNN models that provide uncertainty estimates for real-world applications.
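To make the idea of pruning "weights with low importance" from posterior samples concrete, here is a minimal sketch in Python. It assumes the commonly used signal-to-noise definition, |posterior mean| divided by posterior standard deviation, computed per parameter; the paper's exact criteria (including its signal-plus-noise variant) may differ. The function name `snr_prune_mask`, the `prune_fraction` argument, and the synthetic samples are all hypothetical and used only for illustration.

```python
import numpy as np


def snr_prune_mask(samples, prune_fraction):
    """Return a boolean keep-mask over parameters.

    samples: array of shape (n_samples, n_params) holding posterior draws
             (here synthetic; in practice they would come from an MCMC chain).
    prune_fraction: fraction of parameters to remove, lowest SNR first.
    Assumption: SNR = |mean| / std, which may not match the paper exactly.
    """
    mean = samples.mean(axis=0)
    std = samples.std(axis=0)
    snr = np.abs(mean) / (std + 1e-12)  # small constant avoids division by zero

    n_prune = int(prune_fraction * snr.size)
    order = np.argsort(snr)             # ascending: least important first
    keep = np.ones(snr.size, dtype=bool)
    keep[order[:n_prune]] = False
    return keep


# Synthetic "posterior": four well-determined weights, four noisy ones near zero.
rng = np.random.default_rng(0)
true_mean = np.array([2.0, -3.0, 1.5, -2.5, 0.0, 0.0, 0.0, 0.0])
true_std = np.array([0.1, 0.1, 0.1, 0.1, 1.0, 1.0, 1.0, 1.0])
samples = true_mean + true_std * rng.standard_normal((5000, 8))

keep = snr_prune_mask(samples, prune_fraction=0.5)

# Pruned parameters are fixed at zero in every posterior draw.
pruned_samples = samples * keep
```

In this sketch the mask would then be held fixed while the remaining parameters are resampled, which corresponds loosely to the post-pruning resampling step the abstract describes.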

Paper Structure (23 sections, 23 equations, 11 figures, 3 tables, 1 algorithm)

Introduction
Background and Related Work
Neural network pruning methods
Bayesian Inference for Neural Networks
Langevin Bayesian Neural Networks
Model and Likelihood in BNNs
Methodology
Pruning Algorithm
Signal-to-Noise ratio
Signal-plus-noise ratio
Datasets
Experiments and Results
Data Preprocessing and model selection
Evaluation Metrics
Results and Analysis
...and 8 more sections

Figures (11)

Figure 1: Framework for compact BNNs with network pruning post-sampling (training) where the weights/biases that do not contribute significantly to the posterior are removed. The compact BNN is later refined using the same training data to potentially regain the performance lost from pruning.
Figure 2: Lithology classification of drill core through visual analysis for a segment of Core 5R from Expedition 325 drill hole number M0033A. The core is taken at 43 meters depth below the seafloor.
Figure 3: Class distribution for Expedition 325 and 310 Datasets
Figure 4: Performance accuracy (RMSE) for different pruning methods and pruning levels for given datasets (Lazer, Sunspot, and Abalone). Each method is distinguished by a unique colour scheme, with darker shades representing the original network and lighter shades representing resampled networks. Error bars indicate the standard deviation, highlighting variability in RMSE.
Figure 5: Classification accuracy of Bayesian neural networks across different pruning methods and pruning levels. Each method is represented by a unique colour scheme, with darker shades indicating the original network and lighter shades indicating resampled networks. The error bars represent the standard deviation, highlighting variability in performance.
...and 6 more figures
