AiGAS-dEVL: An Adaptive Incremental Neural Gas Model for Drifting Data Streams under Extreme Verification Latency
Maria Arostegi, Miren Nekane Bilbao, Jesus L. Lobo, Javier Del Ser
TL;DR
AiGAS-dEVL tackles drifting data streams with extreme verification latency by maintaining a Growing Neural Gas map of evolving concepts learned during an initial labeled phase and using projection-guided alignment to predict unlabeled batches. The method couples unsupervised prototype tracking with semi-supervised labeling of emergent nodes, employing a minimum-cost node matching and a rigid transformation to align consecutive concept maps for future predictions. Across a benchmark of synthetic and real EVL datasets, AiGAS-dEVL demonstrates competitive or superior performance in prequential error and macro $F_1$, while offering a simple, interpretable instance-based adaptation strategy. The work advances drift-aware streaming by providing a flexible template that can accommodate different drift dynamics and labeling constraints, with future work targeting non-rigid drift modeling and memory-aware label propagation.
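The alignment step mentioned above (minimum-cost node matching followed by a rigid transformation) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes 2-D prototypes, equally sized node sets, squared Euclidean distance as the matching cost, and a brute-force search standing in for a proper assignment solver such as the Hungarian algorithm. All function names (`match_nodes`, `fit_rigid_2d`, `apply_rigid`) and the toy data are hypothetical.

```python
import math
from itertools import permutations


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def match_nodes(old, new):
    """Minimum-cost one-to-one matching: old[i] is paired with new[perm[i]].

    Brute force over all permutations, so only usable for a handful of
    nodes; an assignment solver would replace this in practice.
    """
    n = len(old)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(sq_dist(old[i], new[perm[i]]) for i in range(n)),
    )
    return list(best)


def centroid(pts):
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)


def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src[i] to dst[i]."""
    cs, cd = centroid(src), centroid(dst)
    dot = cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - cs[0], sy - cs[1]  # centered source point
        bx, by = dx - cd[0], dy - cd[1]  # centered target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)  # optimal rotation angle in 2-D
    c, s = math.cos(theta), math.sin(theta)
    # Translation sends the rotated source centroid onto the target centroid.
    t = (cd[0] - (c * cs[0] - s * cs[1]), cd[1] - (s * cs[0] + c * cs[1]))
    return theta, t


def apply_rigid(pts, theta, t):
    """Rotate each point by theta and then translate it by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in pts]


# Toy data: prototypes of one concept at batch k, and the same prototypes
# at batch k+1 after a 10-degree rotation and a small shift, listed in a
# different order to mimic nodes that carry no identity across batches.
old_nodes = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
moved = apply_rigid(old_nodes, math.radians(10), (0.3, 0.1))
new_nodes = [moved[2], moved[0], moved[1]]

perm = match_nodes(old_nodes, new_nodes)
matched_new = [new_nodes[j] for j in perm]
theta, t = fit_rigid_2d(old_nodes, matched_new)

# Assumption: the drift continues unchanged, so the fitted transformation
# is applied once more to project the prototypes to the next batch.
projected = apply_rigid(matched_new, theta, t)
```

In this sketch the matching restores the correspondence between consecutive node sets, and the fitted rotation and translation summarize how the concept moved between batches. Reapplying the same transformation is only one possible projection rule and holds when the drift is approximately rigid and stationary.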
Abstract
The ever-growing speed at which data are generated, together with the substantial cost of labeling, forces Machine Learning models to face scenarios in which data are only partially labeled. The extreme case, in which supervision is indefinitely unavailable, is referred to as extreme verification latency. In streaming setups, moreover, data flows are affected by exogenous factors that induce non-stationarities in their patterns (concept drift), compelling models learned incrementally from the stream to adapt their knowledge to the concepts prevailing in it. In this work we address the case in which these two conditions occur together: the mechanisms that adapt the model to drifts are challenged by the lack of supervision, so further mechanisms are needed to track the evolution of concepts in the absence of verification. To this end we propose a novel approach, AiGAS-dEVL (Adaptive Incremental neural GAS model for drifting Streams under Extreme Verification Latency), which relies on growing neural gas to characterize the distributions of all concepts detected within the stream over time. Our approach shows that the online analysis of the behavior of the resulting prototypical points over time facilitates the definition of the evolution of concepts in the feature space, the detection of changes in their behavior, and the design of adaptation policies to mitigate the effect of such changes on the model. We assess the performance of AiGAS-dEVL over several synthetic datasets, comparing it to state-of-the-art approaches recently proposed for this stream learning setup. Our results reveal that AiGAS-dEVL performs competitively with the baselines, exhibiting superior adaptability on several datasets in the benchmark while retaining a simple and interpretable instance-based adaptation strategy.
