TeVAE: A Variational Autoencoder Approach for Discrete Online Anomaly Detection in Variable-state Multivariate Time-series Data

Lucas Correia, Jan-Christoph Goos, Philipp Klein, Thomas Bäck, Anna V. Kononova

TL;DR

A temporal variational autoencoder that can detect anomalies with minimal false positives when trained on unlabelled data and has the potential to perform well with a smaller training and validation subset but requires a more sophisticated threshold estimation method.

Abstract

As attention to recorded data grows in the realm of automotive testing and manual evaluation reaches its limits, there is a growing need for automatic online anomaly detection. This real-world data is complex in many ways and requires the modelling of testee behaviour. To address this, we propose a temporal variational autoencoder (TeVAE) that can detect anomalies with minimal false positives when trained on unlabelled data. Our approach also avoids the bypass phenomenon and introduces a new method to remap individual windows to a continuous time series. Furthermore, we propose metrics to evaluate the detection delay and root-cause capability of our approach and present results from experiments on a real-world industrial data set. When properly configured, TeVAE wrongly flags anomalies only 6% of the time and detects 65% of the anomalies present. It also has the potential to perform well with a smaller training and validation subset but requires a more sophisticated threshold estimation method.

Paper Structure (23 sections, 27 equations, 5 figures, 8 tables, 1 algorithm)

Figures (5)

  • Figure 1: Features of an anomaly-free (black) and an anomalous (red) measurement plotted with respect to time. The anomalous measurement plotted represents a scenario where the wheel diameter has not been set correctly. The amplitude axis is z-score normalised to comply with confidentiality guidelines.
  • Figure 2: Example plot for the autocorrelation as a function of lags. The band shown represents the confidence interval, below which we assume the autocorrelation is no longer statistically significant.
  • Figure 3: An illustration of the proposed TeVAE model. Blue shapes designate trainable models, orange shapes deterministic tensors, and green shapes distribution parameters. The shape of each tensor is given below it. During training, $\mathbf{Z}$ is used as the value matrix, denoted by the solid arrow, whereas during inference $\boldsymbol{\mu}_\mathbf{Z}$ is used as the value matrix, denoted by the dashed arrow. Topologically, TeVAE resembles MA-VAE (Correia et al., 2023).
  • Figure 4: Mean-type reverse windowing process illustrated. The grey boxes represent individual windows.
  • Figure 5: Theoretical delay $\delta_\text{theory}$ plotted against time $t$. The last-type reverse-windowing is represented by a solid blue line and the first and mean-type by a solid green line. The red line represents the start of a hypothetical sub-sequence anomaly.
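
The mean-type reverse windowing illustrated in Figure 4 can be sketched roughly as below. This is a minimal illustration and not the authors' implementation: the function name `mean_reverse_window`, the `stride` parameter, and the assumption that window i starts at time step i * stride are all hypothetical choices made for the example. Each time step of the continuous series is taken as the average of every window value that overlaps it.

```python
def mean_reverse_window(windows, stride=1):
    """Average overlapping windows back into one continuous series.

    Assumption (not from the paper): all windows have equal length and
    window i starts at time step i * stride.
    """
    if not windows:
        return []
    width = len(windows[0])
    if stride < 1 or stride > width:
        # A stride larger than the window would leave uncovered time steps.
        raise ValueError("stride must be between 1 and the window length")

    length = (len(windows) - 1) * stride + width
    totals = [0.0] * length  # running sum of overlapping values per time step
    counts = [0] * length    # number of windows covering each time step

    for i, window in enumerate(windows):
        start = i * stride
        for j, value in enumerate(window):
            totals[start + j] += value
            counts[start + j] += 1

    return [total / count for total, count in zip(totals, counts)]


# Three overlapping windows of length 3 cut from the series 1..5.
series = mean_reverse_window([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

Under the same assumptions, a first-type or last-type variant would keep only the first or last value of each window instead of averaging, which is presumably why the theoretical delays in Figure 5 differ between the variants.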