Online Multi-Source Domain Adaptation through Gaussian Mixtures and Dataset Dictionary Learning
Eduardo Fernandes Montesuma, Stevan Le Stanc, Fred Ngolè Mboula
TL;DR
The paper tackles online multi-source domain adaptation, where multiple heterogeneous source domains must be adapted to a target domain that arrives as a data stream. It introduces an online Gaussian Mixture Model (GMM) fitting approach grounded in the Wasserstein geometry of Gaussian measures, and extends it with online dataset dictionary learning (DaDiL), in which the online GMM serves as a memory of the target stream. Key contributions include (i) an online GMM fitting and compression routine based on the $W_2$ distance and Wasserstein barycenters, and (ii) a memory-enabled online dictionary learning framework that expresses the target and source domains as barycenters over a shared dictionary of learned atoms, enabling further optimization after the stream ends. Experiments on the Tennessee Eastman Process benchmark show on-the-fly adaptation and memory-driven improvement after the data stream ends, suggesting practical relevance for real-time fault diagnosis and similar online transfer tasks.
Abstract
This paper addresses the challenge of online multi-source domain adaptation (MSDA) in transfer learning, a scenario where one needs to adapt multiple, heterogeneous source domains towards a target domain that arrives as a stream. We introduce a novel approach for the online fit of a Gaussian Mixture Model (GMM), based on the Wasserstein geometry of Gaussian measures. We build upon this method and recent developments in dataset dictionary learning to propose a novel strategy for online MSDA. Experiments on the challenging Tennessee Eastman Process benchmark demonstrate that our approach is able to adapt on the fly to the stream of target domain data. Furthermore, our online GMM serves as a memory, representing the whole stream of data.
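
To make the "Wasserstein geometry of Gaussian measures" concrete, the sketch below computes the standard closed-form squared $W_2$ distance between two Gaussians, and the Wasserstein barycenter of Gaussians in the special case of diagonal covariances. It is a minimal illustration, not the paper's online fitting or compression routine: the function names (`sqrtm_psd`, `gaussian_w2_squared`, `diagonal_gaussian_barycenter`) are hypothetical, and the restriction to diagonal covariances in the barycenter is an assumption made here to keep the formula in closed form.

```python
import numpy as np


def sqrtm_psd(a):
    """Square root of a symmetric positive semi-definite matrix."""
    a = 0.5 * (a + a.T)  # symmetrize to absorb round-off error
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_w2_squared(m1, s1, m2, s2):
    """Closed-form squared 2-Wasserstein distance between N(m1, s1) and N(m2, s2).

    W_2^2 = ||m1 - m2||^2 + Tr(s1 + s2 - 2 (s1^{1/2} s2 s1^{1/2})^{1/2})
    """
    s1_half = sqrtm_psd(s1)
    cross = sqrtm_psd(s1_half @ s2 @ s1_half)
    mean_term = np.sum((m1 - m2) ** 2)
    cov_term = np.trace(s1 + s2 - 2.0 * cross)
    return float(mean_term + cov_term)


def diagonal_gaussian_barycenter(means, stds, weights):
    """W_2 barycenter of Gaussians with diagonal covariances (assumption).

    means, stds: arrays of shape (K, d); weights: length-K non-negative weights.
    For diagonal (commuting) covariances, the barycenter is Gaussian, with mean
    and standard deviation given by the weighted averages of the inputs.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    mean = np.einsum("k,kd->d", weights, np.asarray(means, dtype=float))
    std = np.einsum("k,kd->d", weights, np.asarray(stds, dtype=float))
    return mean, std


if __name__ == "__main__":
    m1, s1 = np.zeros(2), np.eye(2)
    m2, s2 = np.array([3.0, 4.0]), 4.0 * np.eye(2)
    print(gaussian_w2_squared(m1, s1, m2, s2))  # 25 (means) + 2 (covariances) = 27.0

    mean, std = diagonal_gaussian_barycenter(
        means=[[0.0, 0.0], [3.0, 4.0]],
        stds=[[1.0, 1.0], [2.0, 2.0]],
        weights=[0.5, 0.5],
    )
    print(mean, std)  # [1.5 2. ] [1.5 1.5]
```

Because both quantities are available in closed form, comparing or averaging mixture components does not require access to the raw samples, which is what makes a GMM usable as a compact summary of a stream.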
