Multiple testing in multi-stream sequential change detection

Sanjit Dandapanthula, Aaditya Ramdas

TL;DR

This work studies how to detect changes across many data streams while balancing timely detection (a finite average run length, ARL) against control of Type I errors. Because any algorithm with finite ARL has trivial worst-case FDR, FWER, PFER, and GER, the authors introduce error over patience (EOP), a class of metrics that can still be controlled in this regime. They develop a family of e-detector-based procedures (e-d-BH, e-d-Bonferroni, e-d-Holm, and e-d-GNT) that provably bound the EOP for FDR, PFER, FWER, and GER under very general dependence structures, with guarantees that hold uniformly over all stopping times; if finiteness of the ARL is forfeited, the same methods control the worst-case Type I error. The paper also discusses dependence subtleties, presents illustrative simulations (Gaussian mean changes, nonparametric symmetry, and conformal testing), and highlights piggybacking effects and practical considerations. Overall, the framework extends classical multiple testing ideas to sequential, multi-stream change detection, with guarantees that apply in composite and dependent settings.

Abstract

Multi-stream sequential change detection involves simultaneously monitoring many streams of data and trying to detect when their distributions change, if at all. Here, we theoretically study multiple testing issues that arise from detecting changes in many streams. We point out that any algorithm with finite average run length (ARL) must have a trivial worst-case false detection rate (FDR), family-wise error rate (FWER), per-family error rate (PFER), and global error rate (GER); thus, any attempt to control these Type I error metrics is fundamentally in conflict with the desire for a finite ARL (which is typically necessary in order to have a small detection delay). One of our contributions is to define a new class of metrics which can be controlled, called error over patience (EOP). We propose algorithms that combine the recent e-detector framework (which generalizes the Shiryaev-Roberts and CUSUM methods) with the recent e-Benjamini-Hochberg and e-Bonferroni procedures. We prove that these algorithms control the EOP at any desired level under very general dependence structures on the data within and across the streams. In fact, we prove a more general error control that holds uniformly over all stopping times and provides a smooth trade-off between the conflicting metrics. Additionally, if finiteness of the ARL is forfeited, we show that our algorithms control the worst-case Type I error.
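
To make the two ingredients named in the abstract concrete, here is a minimal Python sketch, not the paper's algorithms. It assumes a known Gaussian mean shift from $N(0,1)$ to $N(\delta,1)$, uses the standard Shiryaev-Roberts and CUSUM recursions on the likelihood ratio, and then applies the textbook e-BH rule (reject the $k^*$ largest values, where $k^*$ is the largest $k$ whose $k$-th largest value is at least $K/(\alpha k)$) directly to the detector values. The function names (`sr_update`, `cusum_update`, `sr_paths`, `e_bh`) are hypothetical, and the plain e-BH thresholds are a placeholder: the calibration the paper uses to obtain EOP guarantees is not reproduced here.

```python
import math


def sr_update(m_prev, x, delta):
    """One Shiryaev-Roberts step: M_t = L_t * (M_{t-1} + 1).

    L_t is the likelihood ratio of N(delta, 1) against N(0, 1) at x
    (an assumed, fully specified pre/post-change pair).
    """
    lr = math.exp(delta * x - 0.5 * delta ** 2)
    return lr * (m_prev + 1.0)


def cusum_update(m_prev, x, delta):
    """One CUSUM step: M_t = L_t * max(M_{t-1}, 1)."""
    lr = math.exp(delta * x - 0.5 * delta ** 2)
    return lr * max(m_prev, 1.0)


def sr_paths(streams, delta):
    """Final SR statistic for each stream, starting from M_0 = 0."""
    finals = []
    for xs in streams:
        m = 0.0
        for x in xs:
            m = sr_update(m, x, delta)
        finals.append(m)
    return finals


def e_bh(values, alpha):
    """Textbook e-BH selection rule applied to nonnegative scores.

    Returns the sorted indices of the k* largest scores, where k* is the
    largest k such that the k-th largest score is >= K / (alpha * k).
    Applying it to raw detector values is an illustrative assumption.
    """
    K = len(values)
    order = sorted(range(K), key=lambda i: values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        if values[idx] >= K / (alpha * k):
            k_star = k
    return sorted(order[:k_star])


if __name__ == "__main__":
    # Two toy streams: the first drifts upward, the second stays near zero.
    streams = [[3.0, 3.0, 3.0], [0.0, 0.0, 0.0]]
    detectors = sr_paths(streams, delta=1.0)
    print(e_bh(detectors, alpha=0.1))
```

In a monitoring loop one would call `sr_update` (or `cusum_update`) once per stream per time step and re-run the selection rule on the current detector values; which thresholds give which error guarantee is exactly the question the paper addresses.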

Paper Structure (42 sections, 24 theorems, 128 equations, 15 figures, 4 algorithms)

Key Result

Theorem 4.1

Consider a multi-stream change monitoring algorithm $\varphi$ for which $\mathrm{ARL}_1(\varphi) = \mathbb{E}_\infty[\tau^*_1(\varphi)] < \infty$. If $\mathcal{T}$ denotes the set of all stopping times with respect to $\mathcal{F}$, then we have:

Figures (15)

  • Figure 1: SR e-detectors over time for detecting a Gaussian mean change at time 200 (y-axis is on a logarithmic scale).
  • Figure 2: FDR of the naive algorithm and e-d-BH for Gaussian mean change.
  • Figure 3: FWER of the naive algorithm and e-d-Holm for Gaussian mean change (note the different y-axis scales).
  • Figure 4: PFER of the naive algorithm and e-d-Bonferroni for Gaussian mean change ($\beta_t = \beta = 10$).
  • Figure 5: Mean detections of e-d-BH, e-d-Bonferroni, and the naive algorithm over time for a Gaussian mean change (with varying signal strength).
  • ...and 10 more figures

Theorems & Definitions (62)

  • Definition 2.1: Stopping times
  • Definition 2.2: Sequential change monitoring algorithm
  • Definition 2.3: Sequential change detection algorithm
  • Definition 2.4: Global null
  • Definition 2.5: Average run length ($\mathrm{ARL}_\eta(\varphi)$)
  • Definition 2.6: Probability of false alarm ($\mathrm{PFA}(\varphi)$)
  • Definition 2.7: False detection rate at time $t$ ($\mathrm{FDR}(\varphi, \xi, t)$)
  • Definition 2.8: Family-wise error rate at time $t$ ($\mathrm{FWER}(\varphi, \xi, t)$)
  • Definition 2.9: Per-family error rate at time $t$ ($\mathrm{PFER}(\varphi, \xi, t)$)
  • Definition 2.10: e-processes
  • ...and 52 more