Big Batch Bayesian Active Learning by Considering Predictive Probabilities
Sebastian W. Ober, Samuel Power, Tom Diethe, Henry B. Moss
TL;DR
The paper observes that BatchBALD can conflate epistemic and aleatoric uncertainty, which can lead to suboptimal batch selections. It introduces BBB-AL, an acquisition function driven by predictive probabilities: it selects the batch that maximizes the entropy of the predictive class probabilities across the batch. This entropy has a tractable closed form under Gaussian, independent-class assumptions, and a more general sample-based variant estimates it using Ledoit-Wolf shrinkage. On CIFAR-10 with a ResNet-8 surrogate, BBB-AL achieves better accuracy than BatchBALD and is significantly faster to evaluate at large batch sizes. The approach therefore offers a scalable alternative for batch Bayesian active learning in classification, one that targets epistemic uncertainty more directly.
Abstract
We observe that BatchBALD, a popular acquisition function for batch Bayesian active learning for classification, can conflate epistemic and aleatoric uncertainty, leading to suboptimal performance. Motivated by this observation, we propose to focus on the predictive probabilities, which only exhibit epistemic uncertainty. The result is an acquisition function that not only performs better, but is also faster to evaluate, allowing for larger batches than before.
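To make the idea of scoring a batch by the entropy of its predictive probabilities more concrete, the following is a minimal sketch and not the authors' implementation. It assumes posterior samples of class probabilities are available as an array `probs` of shape `(S, N, C)` (samples, pool points, classes), fits a Gaussian per class to the probabilities of the candidate batch, and sums the per-class Gaussian entropies, which is one reading of the Gaussian, independent-class assumptions mentioned in the TL;DR. The function names, the fixed `shrinkage` weight (a simple stand-in for a Ledoit-Wolf estimate), and the greedy selection loop are all assumptions made for illustration.

```python
import numpy as np


def gaussian_entropy(samples, shrinkage=0.1):
    """Differential entropy of a Gaussian fitted to samples of shape (S, D).

    The covariance is shrunk toward a scaled identity with a fixed weight.
    This is a hypothetical stand-in for a data-driven Ledoit-Wolf estimate.
    """
    d = samples.shape[1]
    cov = np.atleast_2d(np.cov(samples, rowvar=False))
    target = (np.trace(cov) / d) * np.eye(d)
    cov = (1.0 - shrinkage) * cov + shrinkage * target
    cov = cov + 1e-9 * np.eye(d)  # small jitter for numerical stability
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2.0 * np.pi * np.e) + logdet)


def batch_score(probs, idx, shrinkage=0.1):
    """Sum of per-class Gaussian entropies for the pool points in idx.

    Treating classes independently is an assumption of this sketch.
    """
    n_classes = probs.shape[2]
    return sum(
        gaussian_entropy(probs[:, idx, c], shrinkage) for c in range(n_classes)
    )


def greedy_select(probs, batch_size, shrinkage=0.1):
    """Greedily grow a batch, adding the point that most increases the score.

    Greedy selection is an illustrative choice, not taken from the paper.
    """
    chosen = []
    for _ in range(batch_size):
        remaining = [i for i in range(probs.shape[1]) if i not in chosen]
        best = max(
            remaining, key=lambda i: batch_score(probs, chosen + [i], shrinkage)
        )
        chosen.append(best)
    return chosen
```

Because the score depends on the joint covariance of the batch, two pool points whose predictive probabilities move together across posterior samples add little entropy beyond either one alone, so the sketch tends to prefer batches whose predictions vary independently. Each evaluation needs only a covariance estimate and a log-determinant per class, whatever the batch size.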
