Boltzmann convolutions and Welford mean-variance layers with an application to time series forecasting and classification
Daniel Andrew Coulson, Martin T. Wells
TL;DR
The paper defines the ForeClassing problem, where the utility of a classification depends on future outcomes, and introduces ForeClassNet, an approximately Bayesian neural network that forecasts future time points and uses their means and uncertainties to inform classification. It introduces Boltzmann convolutional layers for multi-temporal representation and Welford mean-variance layers for online uncertainty updates, together enabling a forecasting-informed decision process. The approach is validated via simulations and real-world data (ECG and stock prices), showing substantial accuracy gains (up to ~30% in the financial example) and robustness to adversarial settings, with additional explainability via saliency plots. The work contributes new neural network layers, a motivating ForeClassing theorem, and demonstrates practical impact for sequential decision-making tasks that depend on future information.
Abstract
In this paper we propose a novel problem called the ForeClassing problem where the loss of a classification decision is only observed at a future time point after the classification decision has to be made. To solve this problem, we propose an approximately Bayesian deep neural network architecture called ForeClassNet for time series forecasting and classification. This network architecture forces the network to consider possible future realizations of the time series, by forecasting future time points and their likelihood of occurring, before making its final classification decision. To facilitate this, we introduce two novel neural network layers, Welford mean-variance layers and Boltzmann convolutional layers. Welford mean-variance layers allow networks to iteratively update their estimates of the mean and variance for the forecasted time points for each inputted time series to the network through successive forward passes, which the model can then consider in combination with a learned representation of the observed realizations of the time series for its classification decision. Boltzmann convolutional layers are linear combinations of approximately Bayesian convolutional layers with different filter lengths, allowing the model to learn multitemporal resolution representations of the input time series, and which resolutions to focus on within a given Boltzmann convolutional layer through a Boltzmann distribution. Through several simulation scenarios and two real world applications we demonstrate ForeClassNet achieves superior performance compared with current state of the art methods including a near 30% improvement in test set accuracy in our financial example compared to the second best performing model.
