Adversarial Attacks for Drift Detection
Fabian Hinder, Valerie Vaquet, Barbara Hammer
TL;DR
This work formalizes concept drift as time-varying data distributions and reveals drift adversarials: drifts that escape detection by common drift detectors. It distinguishes metric-based and window-based attacks, then develops a rigorous framework for two-window detectors, proving that drift can go undetected unless the adversarial set $\textnormal{Adv}(A)$ is empty. The authors provide both limiting-case and finite-sample constructions, using window representations $\mathbf{W}_n$ and sampling vectors to generate undetectable drift, and they validate the theory with synthetic experiments and a water-network case study. The results highlight a significant vulnerability in many drift detectors and suggest combining detectors or designing problem-tailored detectors as avenues for improved robustness in critical monitoring applications.
Abstract
Concept drift refers to the change of data distributions over time. While drift poses a challenge for learning models, requiring their continual adaptation, it is also relevant in system monitoring to detect malfunctions, system failures, and unexpected behavior. In the latter case, the robust and reliable detection of drifts is imperative. This work studies the shortcomings of commonly used drift detection schemes. We show how to construct data streams that are drifting without being detected. We refer to those as drift adversarials. In particular, we compute all possible adversarials for common detection schemes and underpin our theoretical findings with empirical evaluations.
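
To make the idea of a drift that goes undetected concrete, the following minimal sketch shows a blind spot of a very simple two-window detector. It is an illustration only and not the construction from the paper: the detector `mean_gap_detector`, its `threshold`, and the hand-built windows are all hypothetical choices made for this example. The detector compares only the means of a reference window and a current window, so a change in spread that leaves the mean untouched passes unnoticed.

```python
from statistics import fmean


def mean_gap_detector(reference, current, threshold=0.5):
    """Hypothetical two-window detector: flag drift when the window means
    differ by more than `threshold`. It looks at nothing but the means."""
    return abs(fmean(reference) - fmean(current)) > threshold


# Reference window: values symmetric around 0 with spread 1.
reference = [-1.0, 1.0] * 50

# Drifted window: same mean, three times the spread. The distribution has
# changed, but the statistic the detector relies on has not.
variance_drift = [-3.0, 3.0] * 50

# Control: a plain mean shift, which this detector is built to see.
mean_drift = [x + 2.0 for x in reference]

missed = mean_gap_detector(reference, variance_drift)  # drift, not flagged
caught = mean_gap_detector(reference, mean_drift)      # drift, flagged

print(missed, caught)
```

The windows are deterministic on purpose so that the outcome does not depend on sampling noise: the variance change is never flagged, while the mean shift is.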
