Higher-criticism for sparse multi-stream change-point detection

Tingnan Gong; Alon Kipnis; Yao Xie

TL;DR

This work introduces a higher-criticism (HC) based approach for sparse multi-stream sequential change-point detection, combining per-stream change tests (e.g., LR or GLR) via HC to achieve rapid detection when only a small subset of streams is affected. The authors formulate a sparse heteroscedastic normal model and derive an information-theoretic lower bound on detection delay; they prove that HC-based detection attains this bound, with the delay converging to $\Delta^*(r,\beta,\sigma)$, and provide a phase-transition view between undetectable and detectable regimes. The analysis accommodates unknown sparsity, mean, and variance changes and demonstrates robustness to heteroscedasticity; extensive simulations show HC often outperforms competing methods in sparse settings and provides practical stream localization. The work bridges offline sparse signal detection techniques with online sequential detection, offering a theoretically grounded, adaptable framework for real-time multi-stream anomaly detection with clear implications for high-dimensional monitoring and fault localization.

Abstract

We study a statistical procedure based on higher criticism (HC) to address the sparse multi-stream quickest change-point detection problem. Namely, we aim to detect a potential change in the distribution of multiple data streams at some unknown time. If a change occurs, only a few streams are affected, whereas the identity of the affected streams is unknown. The HC-based procedure involves testing for a change point in individual streams and combining multiple tests using higher criticism. Relying on HC thresholding, the procedure also indicates a set of streams suspected to be affected by the change. We provide a theoretical analysis under a sparse heteroscedastic normal change-point model. We establish an information-theoretic detection delay lower bound when individual tests are based on the likelihood ratio or the generalized likelihood ratio statistics and show that the delay of the HC-based method converges in distribution to this bound. In the special case of constant variance, our bound coincides with known results in (Chan, 2017). We demonstrate the effectiveness of the HC-based method compared to other methods in detecting sparse changes through extensive numerical evaluations.
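To make the combination step concrete, the following is a minimal sketch of how per-stream P-values at one time step could be combined with higher criticism and thresholded to flag suspected streams. It uses the standard Donoho-Jin form of the HC statistic, which may differ in detail from the paper's own definition; the function names `higher_criticism` and `suspected_streams` and the search fraction `gamma` are assumptions made for illustration.

```python
import numpy as np

def higher_criticism(pvals, gamma=0.3):
    """Return (HC*, HC threshold) for one vector of per-stream P-values.

    Standard Donoho-Jin form (assumed here, not taken from the paper):
    HC* = max over i <= gamma*N of sqrt(N) * (i/N - p_(i)) / sqrt(p_(i) * (1 - p_(i))).
    """
    p = np.sort(np.asarray(pvals, dtype=float))
    n = p.size
    p = np.clip(p, 1e-12, 1 - 1e-12)          # guard against division by zero
    k = max(1, int(np.floor(gamma * n)))      # only the k smallest P-values are searched
    i = np.arange(1, k + 1)
    z = np.sqrt(n) * (i / n - p[:k]) / np.sqrt(p[:k] * (1 - p[:k]))
    i_star = int(np.argmax(z))                # index attaining the maximum
    return float(z[i_star]), float(p[i_star])

def suspected_streams(pvals, gamma=0.3):
    """Indices of streams whose P-value is at or below the HC threshold."""
    _, threshold = higher_criticism(pvals, gamma)
    return np.flatnonzero(np.asarray(pvals, dtype=float) <= threshold)
```

In a sequential setting, one would recompute `higher_criticism` at each time step from the current per-stream P-values and stop the first time the statistic exceeds a threshold calibrated to the desired false-alarm rate.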

Paper Structure (31 sections, 16 theorems, 97 equations, 7 figures, 5 tables)

Introduction
Method
Setup
Change-point testing in individual streams
Detection using Higher Criticism
Asymptotic detection performance in normal data
Problem formulation
Testing individual streams
Asymptotic detection delay
Discussions
Information-theoretic delay and phase transition
Relation to previous results
Heteroscedastic change
Numerical experiments
Trajectory illustration
...and 16 more sections

Key Result

Theorem 3.1

Consider the change-point detection problem eq:normal_problem and P-values where $\square \in \{\mathsf{LR}, \mathsf{GLR}\}$ as defined by eq:YLR_def-eq:p-val_def_LR or eq:YGLR_def-eq:p-val_def_GLR. For a given test statistic $U_t$ based on $\pi_{1,t},\ldots,\pi_{N,t}$, consider a detection procedure that stops at time $T_U$ as soon as $U_t$ exceeds $b_t^{(N)}$. Consider a

Figures (7)

Figure 1: HC detection statistic $\mathrm{HC}^*_t$ of \ref{eq:HC_t_star} before and after the change that occurs at time $t=2000$. Each of $N=500$ streams is distributed as $\mathcal{N}(0,1)$ before the change; after the change, streams in $I$ are distributed as $\mathcal{N}(\mu_r(N),1)$. P-values of each stream at each time are computed using the CUSUM statistic \ref{eq:YLR_def} under the null distribution (before the change). (a) Trajectories of $\mathrm{HC}^*_t$ over $t$ when there is no change and when there is a change at $t=2000$, the post-change distribution being normal with $\mu_r(N) = 0.35$ and $|I|=4$. (b) Histograms of 1000 simulated values of $\mathrm{HC}_t^*$ at time $t=1000$ (before the change) and $t=2100$ (after the change), with $\mu_r(N) = 0.79$ and $|I|=22$.
Figure 2: Curves describing the asymptotic theoretical detection delay $\Delta^*(r,\beta,\sigma)$ of \ref{eq:delta_star_def} in multistream normal data with a heteroscedastic sparse change for detection based on LR or GLR statistics (Theorems \ref{thm:main_impossible} and \ref{thm:HC}). (a): $\Delta^*(r,\beta,\sigma)$ versus $\beta$ for several values of $\sigma$. (b): $\Delta^*(r,\beta,\sigma)$ versus $r$ for several values of $\beta$. The detection delay increases with larger $\beta$ (higher sparsity) and decreases with larger $r$ (stronger change magnitude) and larger $\sigma^2$ (larger post-change variance).
Figure 3: Empirical and fitted survival functions of $T_\mathrm{HC}$ under the null, illustrated for the threshold $b_t = 5$. Data is generated under the null hypothesis with $N=20,000$ across 500 repetitions.
Figure 4: Detection delays in sparse normal mean-shift using the HC procedure. Approximately $N^{1-\beta}$ streams undergo a change from $\mathcal{N}(0,1)$ to $\mathcal{N}(\sqrt{2r\log(N)},1)$, with $N = 5,000$ and a grid of $(r,\beta)$ for change magnitude and sparsity. The target ARL is set to be $5,000$. P-values for each stream are computed using the CUSUM statistic. Left: Average detection delay across 500 Monte Carlo trials for each $(r,\beta)$ configuration. Right: Histogram of detection delay for a single $(r,\beta)$ configuration, with the dashed vertical line indicating the corresponding EDD.
Figure 5: Convergence of detection delay $\Delta = T_{\mathrm{HC}}-\tau$ to its theoretical asymptotic value as the number of streams $N$ increases. Data is generated according to \ref{eq:data_model}, with P-values for each stream computed using the CUSUM statistic. The thresholds are tuned to meet a target ARL of $5000$, and results are based on 500 Monte Carlo repetitions. (a)--(c): Expected detection delay $\mathbb{E}\left[ \Delta \mid T_{\mathrm{HC}} \geq \tau\right]$ versus $N$ for $\beta = 0.7$ and $r\in\{0.05,0.1,0.2\}$. Dashed horizontal lines indicate the theoretical asymptotic delay $\Delta^*(r,\beta,\sigma)$. (d)--(f): Histograms of $\Delta-\Delta^\ast(r,\beta, 1)$ for fixed $(r,\beta) = (0.1, 0.7)$ and $N\in\{100,1000,16000\}$; dashed vertical lines represent the EDD.
...and 2 more figures
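
The simulation setup described in the captions above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: roughly $N^{1-\beta}$ streams shift from $\mathcal{N}(0,1)$ to $\mathcal{N}(\sqrt{2r\log N},\sigma^2)$ at time $\tau$, and each stream is tracked with a textbook CUSUM recursion for a known unit-variance mean shift. The names `simulate_streams` and `cusum` are hypothetical, and the paper's own statistic (eq:YLR_def) and P-value computation may differ.

```python
import numpy as np

def simulate_streams(n_streams, horizon, tau, r, beta, sigma=1.0, seed=0):
    """Simulate a sparse change at time index tau (assumed model, see lead-in)."""
    rng = np.random.default_rng(seed)
    n_affected = max(1, int(round(n_streams ** (1 - beta))))   # about N^(1-beta) streams
    affected = rng.choice(n_streams, size=n_affected, replace=False)
    mu = np.sqrt(2 * r * np.log(n_streams))                    # change magnitude
    x = rng.standard_normal((horizon, n_streams))              # pre-change: N(0, 1)
    x[tau:, affected] = mu + sigma * x[tau:, affected]         # post-change: N(mu, sigma^2)
    return x, np.sort(affected), mu

def cusum(x, mu):
    """Per-stream CUSUM for a mean shift 0 -> mu with unit variance.

    Recursion: W_t = max(0, W_{t-1} + mu * x_t - mu^2 / 2).
    """
    w = np.zeros(x.shape[1])
    out = np.empty_like(x)
    for t in range(x.shape[0]):
        w = np.maximum(0.0, w + mu * x[t] - 0.5 * mu ** 2)
        out[t] = w
    return out
```

Converting the CUSUM values into P-values requires the null distribution of the statistic, which is not reproduced here; in practice it could be estimated by Monte Carlo simulation under the pre-change model.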

Theorems & Definitions (29)

Theorem 3.1
Theorem 3.2
Corollary 3.3
Theorem 5.1
Corollary 5.2
Corollary 5.3
Theorem 5.4
Proof of Theorem \ref{thm:main_impossible}
Proof of Theorem \ref{thm:HC}
Lemma A.1
...and 19 more
