Time Series Compression using Quaternion Valued Neural Networks and Quaternion Backpropagation

Johannes Pöppelbaum, Andreas Schwung

TL;DR

A novel quaternionic time series compression methodology: a long time series is divided into segments, the min, max, mean and standard deviation of each segment are extracted as representative features, and these four values are encapsulated in a quaternion, yielding a quaternion-valued time series.

Abstract

We propose a novel quaternionic time-series compression methodology in which we divide a long time-series into segments of data, extract the min, max, mean and standard deviation of each segment as representative features, and encapsulate them in a quaternion, yielding a quaternion-valued time-series. This time-series is processed using quaternion-valued neural network layers, where we aim to preserve the relation between these features through the use of the Hamilton product. To train this quaternion neural network, we derive quaternion backpropagation employing the GHR calculus, which is required for a valid product and chain rule in quaternion space. Furthermore, we investigate the connection between the derived update rules and automatic differentiation. We apply our proposed compression method to the Tennessee Eastman Dataset, where we perform fault classification using the compressed data in two settings: a fully supervised one and a semi-supervised, contrastive learning one. In both settings, we were able to outperform real-valued counterparts as well as two baseline models: one with the uncompressed time-series as the input and the other with a regular downsampling using the mean. Further, we could improve the classification benchmark set by SimCLR-TS from 81.43% to 83.90%.
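
The compression step and the Hamilton product described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the component order [min, max, mean, std], the use of the population standard deviation, and the dropping of an incomplete trailing segment are all assumptions made for this example.

```python
import numpy as np


def compress_to_quaternions(x, segment_len):
    """Compress a 1-D series into a quaternion-valued series.

    Each segment becomes one quaternion stored as [min, max, mean, std].
    Assumptions of this sketch: the component order, the population
    standard deviation, and dropping an incomplete trailing segment.
    """
    x = np.asarray(x, dtype=float)
    n_segments = len(x) // segment_len
    segments = x[: n_segments * segment_len].reshape(n_segments, segment_len)
    return np.stack(
        [
            segments.min(axis=1),
            segments.max(axis=1),
            segments.mean(axis=1),
            segments.std(axis=1),
        ],
        axis=1,
    )


def hamilton_product(p, q):
    """Hamilton product of quaternions stored as [..., 4] arrays (1, i, j, k)."""
    p0, p1, p2, p3 = np.moveaxis(np.asarray(p, dtype=float), -1, 0)
    q0, q1, q2, q3 = np.moveaxis(np.asarray(q, dtype=float), -1, 0)
    return np.stack(
        [
            p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p0 * q1 + p1 * q0 + p2 * q3 - p3 * q2,
            p0 * q2 - p1 * q3 + p2 * q0 + p3 * q1,
            p0 * q3 + p1 * q2 - p2 * q1 + p3 * q0,
        ],
        axis=-1,
    )


# Usage: 12 samples with segment length 4 give 3 quaternions.
series = np.arange(12.0)
q_series = compress_to_quaternions(series, segment_len=4)

# A single hypothetical quaternion weight applied to every time step.
weight = np.array([0.5, -0.1, 0.2, 0.3])
mixed = hamilton_product(weight, q_series)
```

Because every output component of the Hamilton product depends on all four input components, a quaternion weight mixes the min, max, mean and standard deviation of a segment jointly rather than treating them as four independent real channels.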

Paper Structure

This paper contains 59 sections, 17 theorems, 133 equations, 5 figures, and 6 tables.

Key Result

Proposition 1

For the derivative of a quaternion-valued function $f(\mathdutchcal{q})$, $\mathdutchcal{q} \in \mathbb{H}$, defined as in equ:naive_quaternion_derivation, the product rule does not hold.
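
A small numerical check can make this statement concrete. The referenced equation is not reproduced on this page, so the sketch below assumes the naive derivative is the component-wise sum $\sum_n \frac{\partial f}{\partial q_n} e_n$ with $e = (1, i, j, k)$; that definition, the helper names, and the choice of $f(q) = q$ and $g(q) = q^*$ are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def qmul(p, q):
    """Hamilton product of two quaternions stored as length-4 arrays."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return np.array(
        [
            p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p0 * q1 + p1 * q0 + p2 * q3 - p3 * q2,
            p0 * q2 - p1 * q3 + p2 * q0 + p3 * q1,
            p0 * q3 + p1 * q2 - p2 * q1 + p3 * q0,
        ]
    )


def conj(q):
    """Quaternion conjugate: negate the three imaginary parts."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])


BASIS = np.eye(4)  # rows represent 1, i, j, k


def naive_derivative(f, q, eps=1e-6):
    """ASSUMED naive derivative: sum over n of (df/dq_n) * e_n.

    The partial derivatives are approximated by central differences.
    """
    total = np.zeros(4)
    for n in range(4):
        step = eps * BASIS[n]
        partial = (f(q + step) - f(q - step)) / (2 * eps)
        total += qmul(partial, BASIS[n])
    return total


def f(q):
    return q


def g(q):
    return conj(q)


def fg(q):
    return qmul(f(q), g(q))  # equals |q|^2 as a real quaternion


q = np.array([1.0, 2.0, 3.0, 4.0])

# Derivative of the product, computed directly.
direct = naive_derivative(fg, q)

# What the product rule f'g + fg' would predict.
via_product_rule = qmul(naive_derivative(f, q), g(q)) + qmul(f(q), naive_derivative(g, q))

print(direct)            # approximately [2, 4, 6, 8], i.e. 2q
print(via_product_rule)  # approximately [2, 12, 18, 24], i.e. -2q* + 4q
```

Under the assumed definition, the direct derivative of $q q^*$ is $2q$, while the product rule yields $-2q^* + 4q$, so the two disagree in every imaginary component.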

Figures (5)

  • Figure 1: Visualization of the gradient flow
  • Figure 2: Illustration of the quaternionic time-series compression methodology.
  • Figure 3: Example illustrating the compression algorithm.
  • Figure 4: Results obtained when training 50 times with the best hyperparameters found and different random parameter initializations. Note that in subfigure subfig:3c4l_high, one outlier at $4.55\%$ for the real eq. features boxplot is not displayed.
  • Figure 5: Visualization of the gradient flow for parameter $w_0$ obtained with TorchViz

Theorems & Definitions (35)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof
  • ...and 25 more