Unsupervised Concept Drift Detection based on Parallel Activations of Neural Network
Joanna Komorniczak, Paweł Ksieniewicz
TL;DR
The paper tackles concept drift in data streams under limited label availability by introducing the Parallel Activations Drift Detector (PADD), an unsupervised method that uses the activations of a fixed, randomly initialized neural network to detect distribution shifts. Drift signaling is based on $r$ replications of a two-sample $t$-test across the $e$ network outputs: drift is declared when the number of significant tests $a$ exceeds the threshold $\theta\cdot e\cdot r$, after which the historical activation buffer is cleared. The authors validate PADD on synthetic streams with varying drift dynamics and feature counts, showing performance competitive with both unsupervised and supervised detectors, and they provide replicable experiments and open-source code. The work contributes a practical, label-free drift detector and outlines future directions toward non-parametric testing and modeling dependencies among the network outputs. Overall, PADD extends the toolkit for drift detection in data streams where labels are scarce or delayed.
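The detection rule summarized above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' reference implementation: the class name `ParallelActivationsSketch`, the single hidden ReLU layer, the default values, the significance level `alpha`, and the choice to draw a random subsample of `sample_size` activations for each replication are all assumptions made for the example. Only the decision rule $a > \theta\cdot e\cdot r$ and the clearing of the buffer after a detection are taken from the summary.

```python
import numpy as np
from scipy.stats import ttest_ind


class ParallelActivationsSketch:
    """Illustrative drift detector; not the authors' reference implementation."""

    def __init__(self, n_features, e=12, r=12, theta=0.3, alpha=0.05,
                 sample_size=50, hidden=16, seed=0):
        self.rng = np.random.default_rng(seed)
        # Fixed random weights: the network is never trained.
        # (Assumption: one hidden ReLU layer followed by e linear outputs.)
        self.w1 = self.rng.normal(size=(n_features, hidden))
        self.w2 = self.rng.normal(size=(hidden, e))
        self.e, self.r, self.theta, self.alpha = e, r, theta, alpha
        self.sample_size = sample_size
        self.past = None  # historical activation buffer

    def _activations(self, X):
        # Forward pass through the untrained network.
        return np.maximum(X @ self.w1, 0.0) @ self.w2

    def update(self, X):
        """Process one data chunk; return True if drift is signalled."""
        current = self._activations(np.asarray(X, dtype=float))
        if self.past is None:
            # First chunk: nothing to compare against yet.
            self.past = current
            return False

        a = 0  # number of significant tests over all replications
        for _ in range(self.r):
            # Assumption: each replication compares random subsamples.
            past_idx = self.rng.choice(len(self.past), self.sample_size)
            curr_idx = self.rng.choice(len(current), self.sample_size)
            # One two-sample t-test per network output (column).
            _, p = ttest_ind(self.past[past_idx], current[curr_idx], axis=0)
            a += int(np.sum(p < self.alpha))

        drift = a > self.theta * self.e * self.r
        if drift:
            # Clear the history and restart it from the current chunk.
            self.past = current
        else:
            self.past = np.vstack([self.past, current])
        return bool(drift)
```

A detector built this way is fed one chunk at a time through `update`, for example `ParallelActivationsSketch(n_features=5).update(chunk)`, and needs no labels at any point. Restarting the buffer from the current chunk after a detection is one possible reading of "cleared"; leaving it empty until the next chunk arrives would be an equally plausible variant.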
Abstract
Practical applications of artificial intelligence increasingly have to deal with the streaming nature of real data, which changes over time through phenomena such as periodicity and more or less chaotic degeneration, resulting directly in concept drift. Modern concept drift detectors almost always assume immediate access to labels, an assumption that has been shown to be unrealistic given the cost, limited availability, and possible delay of labeling. This work proposes the unsupervised Parallel Activations Drift Detector, which utilizes the outputs of an untrained neural network. It presents the detector's key design elements, intuitions about its processing properties, and a pool of computer experiments demonstrating its competitiveness with state-of-the-art methods.
