Bayesian Topological Convolutional Neural Nets
Sarah Harkins Dayton, Hayden Everett, Ioannis Schizas, David L. Boothe, Vasileios Maroulas
TL;DR
The paper tackles calibration and uncertainty in image classification with limited or noisy data by proposing a Bayesian Topological Convolutional Neural Network (BTCNN). It unites topology-aware circle-based convolutional layers with Bayesian parameter learning, including a consistency-conditioned prior $p(\theta|\mathbf{X}) \propto p(\theta)\exp(-\frac{1}{|\mathbf{X}|}\sum_{\mathbf{x}\in\mathbf{X}}C(\theta,\mathbf{x}))$, and uses variational inference to learn posterior distributions $q_\phi(\theta)$. Empirical results on USPS and related perturbations show BTCNNs outperform standard CNNs, TCNNs, and BNNs in data-scarce and noisy settings, with improved calibration metrics and better uncertainty quantification, including more meaningful OOD detection via mutual information. The work highlights that integrating topology-guided inductive biases with Bayesian inference yields more efficient, robust, and uncertainty-aware image classification, particularly when data are limited or degraded. Future directions include exploring other topological manifolds and learning the manifold alongside Bayesian priors to further enhance calibration and robustness.
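To make the consistency-conditioned prior concrete, the following minimal sketch evaluates its unnormalized log density, $\log p(\theta) - \frac{1}{|\mathbf{X}|}\sum_{\mathbf{x}\in\mathbf{X}}C(\theta,\mathbf{x})$. The isotropic Gaussian base prior and the flip-based `consistency_cost` are assumptions chosen for illustration only; the summary does not specify the paper's actual $p(\theta)$ or $C$, and all function names here are hypothetical.

```python
import numpy as np

def log_gaussian_prior(theta, sigma=1.0):
    """Log density of an isotropic Gaussian base prior p(theta), up to a constant.
    The Gaussian form is an assumption for this sketch."""
    return -0.5 * np.sum(theta ** 2) / sigma ** 2

def consistency_cost(theta, x):
    """Hypothetical stand-in for C(theta, x): squared change in a linear score
    when the input is reversed. Not the cost used in the paper."""
    return float((x @ theta - x[::-1] @ theta) ** 2)

def log_consistency_prior(theta, X, sigma=1.0):
    """Unnormalized log p(theta | X) = log p(theta) - (1/|X|) * sum_x C(theta, x)."""
    mean_cost = np.mean([consistency_cost(theta, x) for x in X])
    return log_gaussian_prior(theta, sigma) - mean_cost

# Usage: parameters that violate the consistency condition receive lower prior mass.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
consistent = log_consistency_prior(np.array([1.0, 1.0]), X)
inconsistent = log_consistency_prior(np.array([1.0, -1.0]), X)
```

Both parameter vectors have the same norm, so the base prior treats them equally; only the averaged cost term separates them, which is the mechanism by which the data reshape the prior.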
Abstract
Convolutional neural networks (CNNs) have been established as the main workhorse in image data processing; nonetheless, they require large amounts of data to train, often produce overconfident predictions, and frequently lack the ability to quantify the uncertainty of their predictions. To address these concerns, we propose a new Bayesian topological CNN that promotes a novel interplay between topology-aware learning and Bayesian sampling. Specifically, it utilizes information from important manifolds to accelerate training while reducing calibration error by placing prior distributions on network parameters and properly learning appropriate posteriors. One important contribution of our work is the inclusion of a consistency condition in the learning cost, which can effectively modify the prior distributions to improve the performance of our novel network architecture. We evaluate the model on benchmark image classification datasets and demonstrate its superiority over conventional CNNs, Bayesian neural networks (BNNs), and topological CNNs. In particular, we supply evidence that our method provides an advantage in situations where training data is limited or corrupted. Furthermore, we show that the new model allows for better uncertainty quantification than standard BNNs since it can more readily identify examples of out-of-distribution data on which it has not been trained. Our results highlight the potential of our novel hybrid approach for more efficient and robust image classification.
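The TL;DR names mutual information as the score used for out-of-distribution detection. A common way to estimate it from a Bayesian network is to draw several parameter samples from the learned posterior, collect the class probabilities each sample predicts, and subtract the average per-sample entropy from the entropy of the averaged prediction. The sketch below assumes this standard Monte Carlo estimator; whether the paper computes it in exactly this way is not stated in the summary, and the function name and array layout are illustrative.

```python
import numpy as np

def mutual_information(probs, eps=1e-12):
    """Estimate mutual information between predictions and parameters.

    probs: array of shape (S, N, K) holding class probabilities for
           S posterior samples, N inputs, and K classes.
    Returns an array of shape (N,): H[mean_s p] - mean_s H[p].
    """
    mean_probs = probs.mean(axis=0)
    # Entropy of the averaged prediction (total predictive uncertainty).
    predictive_entropy = -np.sum(mean_probs * np.log(mean_probs + eps), axis=-1)
    # Average entropy of each sample's prediction (data uncertainty).
    expected_entropy = -np.mean(np.sum(probs * np.log(probs + eps), axis=-1), axis=0)
    return predictive_entropy - expected_entropy

# Usage: two posterior samples, two inputs, two classes.
# Input 0: the samples agree. Input 1: the samples confidently disagree.
probs = np.array([
    [[0.9, 0.1], [1.0, 0.0]],
    [[0.9, 0.1], [0.0, 1.0]],
])
mi = mutual_information(probs)
```

When the posterior samples agree, the score is near zero even if each prediction is uncertain; when they disagree confidently, the score approaches $\log K$. High values therefore flag inputs on which the model's parameters, rather than the data, are the source of uncertainty.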
