The Cost of Learning under Multiple Change Points
Tomer Gafni, Garud Iyengar, Assaf Zeevi
TL;DR
This paper studies online learning in non-stationary environments with multiple abrupt change points. It identifies endogenous confounding as a key challenge: detection degrades when past data are not discarded. It proposes Anytime Tracking CUSUM (ATC), a horizon-free algorithm that restarts selectively, balancing rapid adaptation to large shifts against stability during stationary periods. The authors prove a non-asymptotic upper bound on dynamic regret of order $O(\sigma^2 (S+1) \log T)$ and a nearly matching information-theoretic lower bound of order $\Omega(\sigma^2 (S+1) \log(T/(S+1)))$, which shows near-minimax optimality. They also quantify the confounding effect via SNR degradation and validate the results on synthetic and NAB data. The work contributes a principled framework for learning under multiple change points, with implications for real-time demand tracking and adaptive control, and opens avenues for extensions to higher dimensions and robustness to variance misspecification.
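To gauge how close the two bounds are, one can compare their logarithmic factors directly. The following is a short worked calculation and is not taken from the paper. It reads $S$ as the number of change points and $T$ as the time horizon (an assumption, since the summary does not define them), and the exponent $\alpha$ is a symbol introduced here purely for illustration.

$$
\frac{\log T}{\log\bigl(T/(S+1)\bigr)} \;=\; \frac{\log T}{\log T - \log(S+1)} \;\le\; \frac{1}{1-\alpha}
\qquad \text{whenever } S+1 \le T^{\alpha},\; 0 \le \alpha < 1,\; T > 1 .
$$

For instance, with $T = 10^6$ and $S + 1 = 10^3$ (so $\alpha = 1/2$), the ratio equals $2$. Under this reading, the two rates differ by at most a constant factor as long as the number of change points grows polynomially slower than the horizon. The constants hidden in the $O(\cdot)$ and $\Omega(\cdot)$ notation are not accounted for here.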
Abstract
We consider an online learning problem in environments with multiple change points. In contrast to the single change point problem that is widely studied using classical "high confidence" detection schemes, the multiple change point environment presents new learning-theoretic and algorithmic challenges. Specifically, we show that classical methods may exhibit catastrophic failure (high regret) due to a phenomenon we refer to as endogenous confounding. To overcome this, we propose a new class of learning algorithms dubbed Anytime Tracking CUSUM (ATC). These are horizon-free online algorithms that implement a selective detection principle, balancing the need to ignore "small" (hard-to-detect) shifts, while reacting "quickly" to significant ones. We prove that the performance of a properly tuned ATC algorithm is nearly minimax-optimal; its regret is guaranteed to closely match a novel information-theoretic lower bound on the achievable performance of any learning algorithm in the multiple change point problem. Experiments on synthetic as well as real-world data validate the aforementioned theoretical findings.
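To make the idea of restarting on detection concrete, here is a minimal sketch of a textbook two-sided CUSUM detector wrapped around a running-mean estimator. It is not the authors' ATC algorithm: the function name `cusum_tracker` and the parameters `drift` and `threshold` are hypothetical, and the paper's selective-restart rule and threshold tuning are not reproduced. The sketch only illustrates why an estimator that never discards past data tracks poorly after a large shift.

```python
import math


def cusum_tracker(observations, drift=0.5, threshold=5.0):
    """Track a piecewise-constant mean; restart when a two-sided CUSUM fires.

    `drift` and `threshold` are illustrative tuning knobs (assumptions),
    not values taken from the paper.
    """
    estimates, restarts = [], []
    count, mean = 0, 0.0
    g_up = g_down = 0.0
    for t, x in enumerate(observations):
        estimates.append(mean)  # prediction is made before x is observed
        if count > 0:
            residual = x - mean
            # Accumulate evidence of an upward or a downward shift.
            g_up = max(0.0, g_up + residual - drift)
            g_down = max(0.0, g_down - residual - drift)
        if g_up > threshold or g_down > threshold:
            # Alarm: discard all past data and start a fresh segment.
            restarts.append(t)
            count, mean = 0, 0.0
            g_up = g_down = 0.0
        count += 1
        mean += (x - mean) / count  # incremental running mean
    return estimates, restarts


# Deterministic toy stream: mean 0 for 50 steps, then mean 5 for 50 steps,
# with a small alternating perturbation standing in for noise.
stream = [0.1 * (-1) ** t for t in range(50)]
stream += [5.0 + 0.1 * (-1) ** t for t in range(50)]

est_restart, restarts = cusum_tracker(stream)
# An infinite threshold disables restarts, so old data are never discarded.
est_no_restart, _ = cusum_tracker(stream, threshold=math.inf)

print("restart times:", restarts)
print("final estimate with restarts:", est_restart[-1])
print("final estimate without restarts:", est_no_restart[-1])
```

On this toy stream the restarting tracker ends close to the new mean of 5, while the variant that never restarts averages over both segments and stays near the midpoint. The abstract's point is subtler than this sketch: with multiple change points, stale data can also corrupt the detection statistic itself, which this example does not attempt to demonstrate.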
