Neural Network-Based Change Point Detection for Large-Scale Time-Evolving Data

Jialiang Geng, George Michailidis

TL;DR

This work develops a nonparametric, neural-network-based framework for offline change point detection in high-dimensional, time-evolving data. It employs a two-step procedure that trains a feed-forward NN on a moving window to form an error criterion $E(t)$ and then detects change points via moving-detection windows and a threshold, yielding consistent estimates of both the number and locations of multiple CPs. The authors provide rigorous theoretical guarantees under various data regimes (independent, subgaussian, dependent), showing that with appropriate windowing and signal strength, the estimator recovers $N$ change points with bounded localization error and high probability. Empirical results on synthetic and real-world datasets demonstrate robust performance and competitive or superior detection accuracy relative to established CPD methods. The approach offers a flexible, scalable framework for CPD in large-scale, nonparametric time-evolving settings, with clear directions for extending to RNN/CNN/LSTM architectures and online deployment.

Abstract

The paper studies the problem of detecting and locating change points in multivariate time-evolving data. The problem has a long history in statistics and signal processing, and various algorithms have been developed, primarily for simple parametric models. In this work, we focus on modeling the data through feed-forward neural networks and develop a detection strategy based on the following two-step procedure. In the first step, the neural network is trained over a prespecified window of the data, and its test error function is calibrated over another prespecified window. Then, the test error function is used over a moving window to identify the change point. Once a change point is detected, the procedure involving these two steps is repeated until all change points are identified. The proposed strategy yields consistent estimates for both the number and the locations of the change points under temporal dependence of the data-generating process. The effectiveness of the proposed strategy is illustrated on synthetic data sets, which provide insight into how to select the tuning parameters of the algorithm in practice, and on real data sets. Finally, we note that although the detection strategy is general and can work with different neural network architectures, the theoretical guarantees provided are specific to feed-forward neural architectures.
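
The train, calibrate, scan, and restart loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under loud assumptions, not the authors' implementation: a least-squares line (`fit_stand_in`) stands in for the feed-forward network so the example runs with the standard library alone, and the threshold rule (`factor` times the largest calibration-window error), the window sizes, and all function names are hypothetical choices made for this sketch.

```python
import random
from statistics import mean

def fit_stand_in(xs, ys):
    """Least-squares line y = a*x + b; a stand-in for the feed-forward network."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    b = my - a * mx
    return lambda x: a * x + b

def window_error(f_hat, xs, ys, t, width):
    """Sum of squared prediction errors over the window [t, t + width)."""
    return sum((f_hat(xs[i]) - ys[i]) ** 2 for i in range(t, t + width))

def detect_change_points(xs, ys, train=40, calib=40, width=10, factor=5.0):
    """Train, calibrate a threshold, scan forward, restart after each detection."""
    n, start, found = len(xs), 0, []
    while start + train + calib + width <= n:
        # Step 1a: fit the model on the training window.
        f_hat = fit_stand_in(xs[start:start + train], ys[start:start + train])
        c0 = start + train
        # Step 1b: calibrate on the next window, assumed free of changes.
        # The "factor * max error" rule is an assumption of this sketch.
        base = max(window_error(f_hat, xs, ys, t, width)
                   for t in range(c0, c0 + calib - width + 1))
        threshold = factor * base
        # Step 2: slide the detection window until the error exceeds the threshold.
        hit = None
        for t in range(c0 + calib, n - width + 1):
            if window_error(f_hat, xs, ys, t, width) > threshold:
                hit = t + width - 1  # last index of the first window over threshold
                break
        if hit is None:
            break
        found.append(hit)
        start = hit + 1  # repeat both steps on the data after the detection
    return found

# Synthetic example: the regression function changes at index 200.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(400)]
ys = [(2 * x if i < 200 else -2 * x + 3) + random.gauss(0, 0.1)
      for i, x in enumerate(xs)]
cps = detect_change_points(xs, ys)
```

Restarting from `hit + 1` mirrors the statement that the two steps are repeated after each detection. In practice the window lengths and threshold would be tuned along the lines the synthetic experiments suggest.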

Paper Structure

This paper contains 23 sections, 9 theorems, 59 equations, 5 figures, 5 tables, 2 algorithms.

Key Result

Theorem 4.1

Consider the single change point case in equation eq:problem, and suppose assumptions asp:signal and asp:smoothness hold. Further, consider the following error criterion: $E(t)=\sum_{i=t}^{t+T_0}\|\hat{f}(X_i)-Y_i\|^2$, with $T_0$ being the sample size and test size, and with the neural network $\hat{f}$ trained on […] with probability for some constant $C_l$ depending on $l$, if $\mathbb{E}[\|f_i(X)\|^l]<\infty$, $l\geq 2$.
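
To make the error criterion concrete, the following small sketch evaluates $E(t)$ exactly as the sum is written, with an inclusive upper limit $t+T_0$ and a squared Euclidean norm for vector-valued outputs. The function name `error_criterion`, the identity predictor, and the toy data are assumptions for illustration only; any fitted model mapping an input vector to an output vector could take the place of `f_hat`.

```python
from typing import Callable, Sequence

Vector = Sequence[float]

def error_criterion(f_hat: Callable[[Vector], Vector],
                    X: Sequence[Vector], Y: Sequence[Vector],
                    t: int, T0: int) -> float:
    """E(t) = sum over i = t, ..., t + T0 of ||f_hat(X_i) - Y_i||^2."""
    total = 0.0
    for i in range(t, t + T0 + 1):  # inclusive upper limit, as in the sum
        pred = f_hat(X[i])
        total += sum((p - y) ** 2 for p, y in zip(pred, Y[i]))
    return total

# Toy data: the identity map fits the first two points and misses the last two.
X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
Y = [[0.0, 0.0], [1.0, 1.0], [2.0, 5.0], [3.0, 4.0]]
identity = lambda x: list(x)  # hypothetical stand-in for the trained network

e0 = error_criterion(identity, X, Y, t=0, T0=1)  # window covers indices 0 and 1
e2 = error_criterion(identity, X, Y, t=2, T0=1)  # window covers indices 2 and 3
```

Here `e0` is zero because the predictor matches the responses on the first window, while `e2` is positive, which is the kind of jump in $E(t)$ that the detection step looks for.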

Figures (5)

  • Figure 1: Example for generating E(t)
  • Figure 2: Example for detection process
  • Figure 3: Performance metrics vs $\sigma$
  • Figure 4: Real data
  • Figure 5: QQplot

Theorems & Definitions (9)

  • Theorem 4.1
  • Theorem 4.2
  • Theorem 4.3
  • Theorem 4.4
  • Lemma 10.1
  • Lemma 10.2
  • Lemma 10.3
  • Lemma 10.4
  • Lemma 10.5