Online Boosting Adaptive Learning under Concept Drift for Multistream Classification
En Yu, Jie Lu, Bin Zhang, Guangquan Zhang
TL;DR
This work tackles multistream classification under concept drift by modeling the temporal correlations among multiple data streams and mitigating covariate shift between the source streams and the unlabeled target stream. It introduces Online Boosting Adaptive Learning (OBAL), a two-stage framework: an initialization stage in which AdaCOSA aligns covariate shift and learns dynamic inter-stream correlations, and an online stage that detects asynchronous drift with DDM (Drift Detection Method) and applies a Gaussian Mixture Model weighting scheme. The resulting ensemble reweights each source's contribution by its relevance to the target and is reinitialized when drift affects the target stream. Empirical results on synthetic and real-world datasets show that OBAL achieves state-of-the-art accuracy, remains robust as the number of sources varies, and runs in competitive time, which makes it practical for adaptive learning in dynamic multistream environments.
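To make the drift-detection step concrete, here is a minimal sketch of the standard DDM rule: track the running error rate and its standard deviation, remember the lowest level seen so far, and flag a warning or a drift when the current level rises two or three standard deviations above it. The class name, method names, the 30-sample warm-up, and the toy error stream are assumptions made for illustration; how OBAL wires the detector into its ensemble is not shown here.

```python
import math


class DDM:
    """Illustrative Drift Detection Method; names and defaults are assumptions."""

    def __init__(self, min_samples=30, warn_level=2.0, drift_level=3.0):
        self.min_samples = min_samples
        self.warn_level = warn_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        # Forget all statistics, e.g. after a drift has been signalled.
        self.n = 0
        self.p = 0.0            # running error rate
        self.p_min = math.inf   # error rate at the best point seen so far
        self.s_min = math.inf   # its standard deviation

    def update(self, error):
        """error is 1 for a misclassification, 0 otherwise."""
        self.n += 1
        self.p += (error - self.p) / self.n
        s = math.sqrt(self.p * (1.0 - self.p) / self.n)
        if self.n < self.min_samples:
            return "stable"
        # Record the lowest p + s observed so far.
        if self.p + s <= self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        level = self.p + s
        if level > self.p_min + self.drift_level * self.s_min:
            self.reset()
            return "drift"
        if level > self.p_min + self.warn_level * self.s_min:
            return "warning"
        return "stable"


# Toy stream: 10% errors for 100 steps, then every prediction is wrong.
errors = [1 if i % 10 == 9 else 0 for i in range(100)] + [1] * 50
detector = DDM()
drift_points = [i for i, e in enumerate(errors) if detector.update(e) == "drift"]
```

In a multistream setting one such detector could be kept per stream, so that drifts occurring at different times are caught independently; that arrangement is an assumption, not a detail stated in this summary.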
Abstract
Multistream classification is challenging because models must adapt rapidly to dynamic streaming processes subject to concept drift. Despite growing research in this area, the temporally dynamic relationships between streams have been largely overlooked, which leads to negative transfer from irrelevant data. In this paper, we propose a novel Online Boosting Adaptive Learning (OBAL) method that addresses this limitation by adaptively learning the dynamic correlation among different streams. Specifically, OBAL operates in two phases. In the first, we design an Adaptive COvariate Shift Adaptation (AdaCOSA) algorithm that constructs an initial ensemble model from archived data of the various source streams, mitigating covariate shift while learning the dynamic correlations through an adaptive re-weighting strategy. In the second, online phase, we employ a Gaussian Mixture Model-based weighting mechanism, integrated with the correlations learned by AdaCOSA, to handle asynchronous drift. This approach significantly improves predictive performance and stability on the target stream. We conduct comprehensive experiments on several synthetic and real-world data streams, covering various drifting scenarios and types. The results demonstrate that OBAL achieves substantial improvements in multistream classification by effectively leveraging positive knowledge from multiple sources.
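The abstract does not spell out how the Gaussian Mixture Model weights are computed. One common reading of such a scheme is density-based weighting: a mixture describing recent target data scores each source sample, and samples that look more like the target receive larger weights. The sketch below illustrates only that idea in one dimension, with hand-picked mixture parameters instead of a fitted model. The function names, the parameters, and the sample values are all hypothetical, and this should not be taken as OBAL's actual formulation.

```python
import math


def gmm_density(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture evaluated at x."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights, means, stds)
    )


def sample_weights(source_xs, weights, means, stds):
    """Score source samples under the mixture and normalise to sum to 1."""
    dens = [gmm_density(x, weights, means, stds) for x in source_xs]
    if not dens:
        return []
    total = sum(dens)
    if total == 0.0:
        # Every sample is far from the mixture: fall back to uniform weights.
        return [1.0 / len(dens)] * len(dens)
    return [d / total for d in dens]


# Hypothetical target mixture with two equally weighted components.
target_gmm = dict(weights=[0.5, 0.5], means=[0.0, 4.0], stds=[1.0, 1.0])
source_xs = [0.0, 2.0, 4.0, 10.0]
w = sample_weights(source_xs, **target_gmm)
```

Samples near either component mean (0.0 and 4.0) receive the largest weights, the sample between them receives less, and the outlier at 10.0 is almost ignored. In practice the mixture would be fitted to multivariate target data with a library implementation, not specified by hand.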
