AiGAS-dEVL-RC: An Adaptive Growing Neural Gas Model for Recurrently Drifting Unsupervised Data Streams
Maria Arostegi, Miren Nekane Bilbao, Jesus L. Lobo, Javier Del Ser
TL;DR
The paper tackles online unsupervised learning under Extreme Verification Latency (EVL) and abrupt recurrent concept drift by proposing AiGAS-dEVL-RC, a memory-augmented Growing Neural Gas (GNG) model. It introduces a four-stage batch workflow (characterize, retrieve, predict, store) and a memory $\mathcal{M}$ of past concepts, using projection-based drift estimation and IoU-based recurrence detection via $\alpha$-shapes to retrieve relevant past distributions. The approach shows competitive performance on non-recurrent streams and superior resilience to abrupt recurrent drifts through memory retrieval. The authors acknowledge the computational and memory costs of GNG and the need to automate its configuration. Overall, AiGAS-dEVL-RC offers a robust online learning solution for unsupervised drifting data streams with recurrence, and lays groundwork for multi-modal extensions and adaptive memory management in real-time systems.
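As a rough illustration of the retrieve stage, the sketch below scores a new batch against stored concepts by Intersection over Union (IoU) and returns the best match above a threshold. It is a minimal sketch under stated assumptions, not the authors' implementation: the region occupied by a concept is approximated by the set of grid cells its 2-D points fall into, a deliberate stand-in for the $\alpha$-shapes used in the paper, and the names `footprint`, `iou`, `retrieve`, the cell size, and the 0.5 threshold are all hypothetical.

```python
import math
from typing import Iterable, List, Optional, Set, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def footprint(points: Iterable[Point], cell: float = 1.0) -> Set[Cell]:
    """Rasterize 2-D points into the grid cells they occupy.

    Assumption: this grid footprint replaces the alpha-shape of the paper.
    """
    return {(math.floor(x / cell), math.floor(y / cell)) for x, y in points}


def iou(a: Set[Cell], b: Set[Cell]) -> float:
    """Intersection over Union of two cell sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(memory: List[Set[Cell]], current: Set[Cell],
             threshold: float = 0.5) -> Optional[int]:
    """Return the index of the stored concept with the highest IoU,
    provided that IoU reaches the (hypothetical) threshold; else None."""
    best_idx: Optional[int] = None
    best_score = threshold
    for idx, stored in enumerate(memory):
        score = iou(stored, current)
        if score >= best_score:
            best_idx, best_score = idx, score
    return best_idx


# Two stored concepts occupying distant regions of the feature space.
concept_a = footprint([(0.2, 0.3), (0.8, 0.1), (1.5, 0.5), (1.2, 1.7)])
concept_b = footprint([(10.1, 10.2), (11.3, 10.4), (10.6, 11.8)])
memory = [concept_a, concept_b]

# A new batch that falls back into the region of the second concept.
recurring = footprint([(10.4, 10.9), (11.9, 10.1), (10.2, 11.1)])
# A batch in a region never seen before.
unseen = footprint([(50.0, 50.0)])

matched = retrieve(memory, recurring)
unmatched = retrieve(memory, unseen)
```

Here `matched` points to the second stored concept, while `unmatched` is `None`, which in a full workflow would signal that a new concept should be stored rather than retrieved.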
Abstract
Concept drift and extreme verification latency pose significant challenges in data stream learning, particularly when dealing with recurring concept changes in dynamic environments. This work introduces a novel method based on the Growing Neural Gas (GNG) algorithm, designed to effectively handle abrupt recurrent drifts while adapting to incrementally evolving data distributions (incremental drifts). Leveraging the self-organizing and topological adaptability of GNG, the proposed approach maintains a compact yet informative memory structure, allowing it to efficiently store and retrieve knowledge of past or recurring concepts, even under conditions of delayed or sparse stream supervision. Our experiments highlight the superiority of our approach over existing data stream learning methods designed to cope with incremental non-stationarities and verification latency, demonstrating its ability to quickly adapt to new drifts, robustly manage recurring patterns, and maintain high predictive accuracy with a minimal memory footprint. Unlike other techniques that fail to leverage recurring knowledge, our proposed approach proves to be a robust and efficient online learning solution for unsupervised drifting data streams.
