
Adaptive Bernstein Change Detector for High-Dimensional Data Streams

Marco Heyden, Edouard Fouché, Vadim Arzamasov, Tanja Fenn, Florian Kalinke, Klemens Böhm

TL;DR

ABCD is an online change detector for high-dimensional data streams. It tracks the reconstruction loss $L_t$ of a learned encoder-decoder model over an adaptive window and applies a Bernstein-based test to that loss. For each detected change it reports the change point $t^*$, the affected subspace $D^*$, and a severity measure $\Delta$. It works with PCA, kernel PCA, or autoencoder models and maintains its window statistics online in constant time. Change-point detection, subspace identification, and severity quantification are thus handled in a single framework. On synthetic and real high-dimensional streams, ABCD achieves up to 20% higher F1-score and higher precision than competitors, locates change subspaces accurately, and produces severity estimates that correlate with the ground truth. This makes it suited to real-time monitoring and to triggering model updates in systems whose high-dimensional data evolves over time.
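To illustrate what "constant-time online statistics" can look like, here is a minimal sketch using Welford's update and the standard pairwise merge for mean and variance. It is an assumption for illustration, not the aggregation scheme from the paper; the class name `RunningStats` and its methods are hypothetical.

```python
class RunningStats:
    """Running mean and population variance with O(1) updates (Welford).

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self):
        self.n = 0        # number of observations seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation (e.g. a reconstruction loss) into the stats."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of everything seen so far (0.0 if empty)."""
        return self.m2 / self.n if self.n > 0 else 0.0

    def merged(self, other):
        """Return the stats of the union of two disjoint samples in O(1)."""
        out = RunningStats()
        out.n = self.n + other.n
        if out.n == 0:
            return out
        delta = other.mean - self.mean
        out.mean = self.mean + delta * other.n / out.n
        out.m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / out.n
        return out
```

Keeping one such aggregate per window segment would let the mean and variance on either side of a candidate split be obtained without rescanning the raw losses.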

Abstract

Change detection is of fundamental importance when analyzing data streams. Detecting changes both quickly and accurately enables monitoring and prediction systems to react, e.g., by issuing an alarm or by updating a learning algorithm. However, detecting changes is challenging when observations are high-dimensional. In high-dimensional data, change detectors should not only be able to identify when changes happen, but also in which subspace they occur. Ideally, one should also quantify how severe they are. Our approach, ABCD, has these properties. ABCD learns an encoder-decoder model and monitors its accuracy over a window of adaptive size. ABCD derives a change score based on Bernstein's inequality to detect deviations in terms of accuracy, which indicate changes. Our experiments demonstrate that ABCD outperforms its best competitor by up to 20% in F1-score on average. It can also accurately estimate changes' subspace, together with a severity measure that correlates with the ground truth.
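The idea of monitoring reconstruction loss with a Bernstein-based score can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes the losses are already computed, that each loss lies within a known bound `M` of its mean, that the union bound is split evenly (`kappa = 0.5`), and that sample variances stand in for the true ones. All function names and parameters are hypothetical.

```python
import math
from statistics import fmean, pvariance


def bernstein_tail(n, var, eps, M):
    """Two-sided Bernstein bound on P(|sample mean - true mean| >= eps)."""
    if eps <= 0:
        return 1.0
    bound = 2.0 * math.exp(-n * eps ** 2 / (2.0 * var + 2.0 * M * eps / 3.0))
    return min(1.0, bound)


def change_score(losses, split, M=1.0, kappa=0.5):
    """Bound on seeing the observed gap in mean loss if both sides share one mean.

    A small value suggests a change at index `split`.
    """
    left, right = losses[:split], losses[split:]
    eps = abs(fmean(left) - fmean(right))
    p = (bernstein_tail(len(left), pvariance(left), kappa * eps, M)
         + bernstein_tail(len(right), pvariance(right), (1.0 - kappa) * eps, M))
    return min(1.0, p)


def detect_change(losses, delta=0.05, M=1.0, min_size=5):
    """Scan all splits of the window; return (split, score) or None.

    The split with the smallest score below `delta` is reported.
    """
    best = None
    for k in range(min_size, len(losses) - min_size + 1):
        p = change_score(losses, k, M)
        if p < delta and (best is None or p < best[1]):
            best = (k, p)
    return best


if __name__ == "__main__":
    # Synthetic losses: the model reconstructs well, then poorly.
    window = [0.1] * 50 + [0.6] * 50
    print(detect_change(window))
```

This sketch rescans the whole window for every call, which is quadratic in the window length; an online detector would instead keep incremental aggregates and shrink the window after a detection.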

Paper Structure

This paper contains 25 sections, 1 theorem, 24 equations, 1 figure, 2 tables, 2 algorithms.

Key Result

Theorem 1

Consider two independent samples $X_1, X_2$ of sizes $n_1$ and $n_2$ from two random variables with unknown expected values $\mu_1, \mu_2$ and variances $\sigma^2_1, \sigma^2_2$. Let $\hat{\mu}_1, \hat{\mu}_2$ denote the sample means and let $\lvert\mu_1-x_i\rvert < M$ for all $x_i\in X_1$ and $\lvert\mu$…
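For orientation, the classical one-sample Bernstein inequality that bounds of this kind build on can be written as below. This is the textbook form, not the statement of Theorem 1 itself; here $\hat{\mu}$ is the mean of $n$ independent observations $x_i$ with common expected value $\mu$, variance $\sigma^2$, and $\lvert\mu - x_i\rvert < M$.

$$
\Pr\left(\lvert\hat{\mu} - \mu\rvert \geq \epsilon\right) \leq 2\exp\left(-\frac{n\epsilon^2}{2\sigma^2 + \tfrac{2}{3}M\epsilon}\right)
$$

The bound tightens as the variance shrinks, which is what makes it attractive for a loss that is stable between changes.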

Figures (1)

  • Figure 1: Overview of ABCD.

Theorems & Definitions (6)

  • Example: Biofuel production
  • Definition 1: Change
  • Definition 2: Change subspace
  • Definition 3: Change severity
  • Theorem 1: Bound on $\Pr\left(\lvert\hat{\mu}_1 - \hat{\mu}_2\rvert \geq \epsilon\right)$
  • Proof