Active Learning with Fully Bayesian Neural Networks for Discontinuous and Nonstationary Data

Maxim Ziatdinov

TL;DR

This work tackles active learning in physical sciences where data are scarce and system behavior exhibits discontinuities and nonstationarity. It advocates Fully Bayesian Neural Networks (FBNNs) trained with No-U-Turn Sampling as a probabilistic surrogate, offering improved uncertainty quantification over traditional Gaussian Processes (GPs). Through a battery of 1D and 2D test problems modeling phase transitions and nonstationary dynamics, FBNNs demonstrate competitive or superior predictive accuracy and uncertainty estimates, particularly in regions with abrupt changes. The findings suggest FBNNs are a practical, more reliable option for autonomous experimentation and modeling in small-data, low-to-moderate dimensional problems, with potential for extension to higher-dimensional applications.

Abstract

Active learning optimizes the exploration of large parameter spaces by strategically selecting which experiments or simulations to conduct, thus reducing resource consumption and potentially accelerating scientific discovery. A key component of this approach is a probabilistic surrogate model, typically a Gaussian Process (GP), which approximates an unknown functional relationship between control parameters and a target property. However, conventional GPs often struggle when applied to systems with discontinuities and non-stationarities, prompting the exploration of alternative models. This limitation becomes particularly relevant in physical science problems, which are often characterized by abrupt transitions between different system states and rapid changes in physical property behavior. Fully Bayesian Neural Networks (FBNNs) serve as a promising substitute, treating all neural network weights probabilistically and leveraging advanced Markov Chain Monte Carlo techniques for direct sampling from the posterior distribution. This approach enables FBNNs to provide reliable predictive distributions, crucial for making informed decisions under uncertainty in the active learning setting. Although traditionally considered too computationally expensive for 'big data' applications, many physical sciences problems involve small amounts of data in relatively low-dimensional parameter spaces. Here, we assess the suitability and performance of FBNNs with the No-U-Turn Sampler for active learning tasks in the 'small data' regime, highlighting their potential to enhance predictive accuracy and reliability on test functions relevant to problems in physical sciences.
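
The selection step described in the abstract can be illustrated with a small sketch. It assumes a maximum-predictive-variance acquisition rule, which this excerpt does not confirm as the paper's strategy, and it takes posterior predictions as given rather than running the No-U-Turn Sampler. The function name `select_next_point` and the toy numbers are hypothetical.

```python
import statistics


def select_next_point(candidates, posterior_predictions):
    """Return the candidate with the largest predictive variance.

    candidates: input locations that have not been measured yet.
    posterior_predictions: one list per posterior sample, each holding
        that sample's prediction at every candidate (same order).
    """
    variances = []
    for j in range(len(candidates)):
        # Spread of the posterior samples at candidate j.
        preds_at_j = [sample[j] for sample in posterior_predictions]
        variances.append(statistics.pvariance(preds_at_j))
    best = max(range(len(candidates)), key=variances.__getitem__)
    return candidates[best], variances


# Toy data (assumed, not from the paper): three posterior samples that
# agree near x = 0.0 and x = 1.0 but disagree near x = 0.5, as might
# happen around a discontinuity.
candidates = [0.0, 0.5, 1.0]
posterior_predictions = [
    [1.0, 0.2, -1.0],
    [1.1, 0.9, -1.0],
    [0.9, -0.8, -1.1],
]
next_x, variances = select_next_point(candidates, posterior_predictions)
```

In a full loop, the chosen point would be measured, added to the training set, and the surrogate refit before the next selection.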

Paper Structure (11 sections, 7 equations, 6 figures, 1 algorithm)

Introduction
Related work
Methods
Fully Bayesian Neural Networks
Gaussian Process
Active Learning Strategy
Choice of hyperparameters
Model training
Test functions and datasets
Results and Discussion
Conclusions

Figures (6)

Figure 1: Fully Bayesian neural networks can outperform their Gaussian process counterparts on active learning tasks, particularly for systems exhibiting rapid changes in behavior and/or discontinuous transitions between different states. These types of behavior are often encountered in various physical science domains.
Figure 2: Fully Bayesian neural network (FBNN). The deterministic weights of traditional neural networks (a) are replaced with probabilistic distributions in FBNNs (b). (c) FBNN as probabilistic graphical model: colored circles depict observed variables while unobserved variables are in empty circles. (d) Schematic of active learning process.
Figure 3: Active learning results for 1D test functions. (a) Ground truth function values with uniformly initialized starting ('seed') points. (b, c) Results of active learning with Gaussian process (GP) and fully Bayesian neural network (FBNN). (d) Mean squared error (MSE) as a function of active learning steps.
Figure 4: Performance evaluation of GP and FBNN in terms of negative log predictive density (NLPD) over active learning steps for the 1D test functions.
Figure 5: Performance evaluation in terms of MSE and NLPD at the final active learning step across (a) different FBNN architectures and (b) different FBNN noise priors. The GP 'baseline' is shown by a green bar.
...and 1 more figure
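
The two metrics named in the captions above, MSE and NLPD, can be sketched as follows. This assumes a per-point Gaussian predictive density with a given mean and variance; the paper's exact NLPD definition is not stated in this excerpt, so the helper names `mse` and `gaussian_nlpd` and the averaging over test points are assumptions.

```python
import math


def mse(y_true, mean):
    """Mean squared error between targets and predictive means."""
    return sum((y - m) ** 2 for y, m in zip(y_true, mean)) / len(y_true)


def gaussian_nlpd(y_true, mean, var):
    """Average negative log density of y_true under N(mean, var).

    Assumes an independent Gaussian predictive distribution per point.
    """
    total = 0.0
    for y, m, v in zip(y_true, mean, var):
        total += 0.5 * math.log(2 * math.pi * v) + (y - m) ** 2 / (2 * v)
    return total / len(y_true)


# Same prediction error, different stated uncertainty: the overconfident
# prediction (small variance) is penalized more heavily by NLPD.
overconfident = gaussian_nlpd([1.0], [0.0], [0.01])
calibrated = gaussian_nlpd([1.0], [0.0], [1.0])
```

Unlike MSE, NLPD depends on the predicted variance, which is why it is useful for comparing the uncertainty estimates of two surrogates and not only their mean predictions.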
