Table of Contents
Fetching ...

Boltzmann convolutions and Welford mean-variance layers with an application to time series forecasting and classification

Daniel Andrew Coulson, Martin T. Wells

TL;DR

The paper defines the ForeClassing problem, where the utility of a classification depends on future outcomes, and introduces ForeClassNet, an approximately Bayesian neural network that forecasts future time points and uses their means and uncertainties to inform classification. It introduces Boltzmann convolutional layers for multi-temporal representation and Welford mean-variance layers for online uncertainty updates, together enabling a forecasting-informed decision process. The approach is validated via simulations and real-world data (ECG and stock prices), showing substantial accuracy gains (up to ~30% in the financial example) and robustness to adversarial settings, with additional explainability via saliency plots. The work contributes new neural network layers, a motivating ForeClassing theorem, and demonstrates practical impact for sequential decision-making tasks that depend on future information.

Abstract

In this paper we propose a novel problem called the ForeClassing problem where the loss of a classification decision is only observed at a future time point after the classification decision has to be made. To solve this problem, we propose an approximately Bayesian deep neural network architecture called ForeClassNet for time series forecasting and classification. This network architecture forces the network to consider possible future realizations of the time series, by forecasting future time points and their likelihood of occurring, before making its final classification decision. To facilitate this, we introduce two novel neural network layers, Welford mean-variance layers and Boltzmann convolutional layers. Welford mean-variance layers allow networks to iteratively update their estimates of the mean and variance for the forecasted time points for each inputted time series to the network through successive forward passes, which the model can then consider in combination with a learned representation of the observed realizations of the time series for its classification decision. Boltzmann convolutional layers are linear combinations of approximately Bayesian convolutional layers with different filter lengths, allowing the model to learn multitemporal resolution representations of the input time series, and which resolutions to focus on within a given Boltzmann convolutional layer through a Boltzmann distribution. Through several simulation scenarios and two real world applications we demonstrate ForeClassNet achieves superior performance compared with current state of the art methods including a near 30% improvement in test set accuracy in our financial example compared to the second best performing model.

Boltzmann convolutions and Welford mean-variance layers with an application to time series forecasting and classification

TL;DR

The paper defines the ForeClassing problem, where the utility of a classification depends on future outcomes, and introduces ForeClassNet, an approximately Bayesian neural network that forecasts future time points and uses their means and uncertainties to inform classification. It introduces Boltzmann convolutional layers for multi-temporal representation and Welford mean-variance layers for online uncertainty updates, together enabling a forecasting-informed decision process. The approach is validated via simulations and real-world data (ECG and stock prices), showing substantial accuracy gains (up to ~30% in the financial example) and robustness to adversarial settings, with additional explainability via saliency plots. The work contributes new neural network layers, a motivating ForeClassing theorem, and demonstrates practical impact for sequential decision-making tasks that depend on future information.

Abstract

In this paper we propose a novel problem called the ForeClassing problem where the loss of a classification decision is only observed at a future time point after the classification decision has to be made. To solve this problem, we propose an approximately Bayesian deep neural network architecture called ForeClassNet for time series forecasting and classification. This network architecture forces the network to consider possible future realizations of the time series, by forecasting future time points and their likelihood of occurring, before making its final classification decision. To facilitate this, we introduce two novel neural network layers, Welford mean-variance layers and Boltzmann convolutional layers. Welford mean-variance layers allow networks to iteratively update their estimates of the mean and variance for the forecasted time points for each inputted time series to the network through successive forward passes, which the model can then consider in combination with a learned representation of the observed realizations of the time series for its classification decision. Boltzmann convolutional layers are linear combinations of approximately Bayesian convolutional layers with different filter lengths, allowing the model to learn multitemporal resolution representations of the input time series, and which resolutions to focus on within a given Boltzmann convolutional layer through a Boltzmann distribution. Through several simulation scenarios and two real world applications we demonstrate ForeClassNet achieves superior performance compared with current state of the art methods including a near 30% improvement in test set accuracy in our financial example compared to the second best performing model.

Paper Structure

This paper contains 18 sections, 1 theorem, 24 equations, 7 figures, 11 tables.

Key Result

Theorem 1

Consider the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and the random variables $\boldsymbol{X}:\Omega \xrightarrow{} \mathbb{R}^{m}$, $\boldsymbol{X}_{*}:\Omega \xrightarrow{} \mathbb{R}^{k}$, and $Y:\Omega \xrightarrow{} \{0,1\}$, where $Y = f(\boldsymbol{X}, \boldsymbol{X}_{*})$ for s

Figures (7)

  • Figure 1: Plot of the Boltzmann distribution with 5 states for different values of temperature.
  • Figure 2: Graphic of our proposed network for time series forecasting and classification. The numbers in the brackets represents how many nodes are in a given layer for example Dense [$k$] represents a dense layer with $k$ nodes.
  • Figure 3: Saliency plot for the second time series in the test set which was in class 0 .
  • Figure 4: Saliency plot for the third time series in the test set which was in class 1 .
  • Figure 5: Saliency plot for the sixth time series in the test set which was in class 2 .
  • ...and 2 more figures

Theorems & Definitions (3)

  • Definition 1
  • Theorem 1
  • proof