Collaborative Multi-source Domain Adaptation Through Optimal Transport
Omar Ghannou, Younès Bennani
TL;DR
This paper tackles unsupervised multi-source domain adaptation under strict data privacy constraints by combining optimal transport-based data alignment with privacy-preserving collaborative learning. It proposes CMDA-OT, a two-phase framework: each source is first transported independently to the target space using Sinkhorn-regularized optimal transport (OT), and a centralized federated learning stage (FedAvg) then aggregates the $N$ source models without sharing raw data. A small target-domain validation subset provides pseudo-label guidance and dynamic weighting to refine both the representations and the aggregation. Privacy is preserved because the server never accesses the source data. Empirical results on Office-Caltech-10 and VLSC show significant and stable improvements over strong baselines, with statistical tests confirming the robustness of the gains across diverse target domains.
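To make the first phase concrete, the following is a minimal sketch of Sinkhorn-regularized OT between one source and the target, followed by a barycentric projection of the source points into the target space. It is an illustration under stated assumptions, not the authors' implementation: the uniform sample weights, the squared Euclidean cost, the cost normalization, the barycentric mapping, and all function names (`sinkhorn_plan`, `transport_source_to_target`) are choices made here for the example.

```python
import math


def sinkhorn_plan(cost, a, b, reg=0.1, n_iters=500):
    """Entropy-regularized OT plan between histograms a and b (Sinkhorn iterations)."""
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / reg).
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns so the plan matches both marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def squared_euclidean(x, y):
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def transport_source_to_target(source, target, reg=0.1, n_iters=500):
    """Map each source point to the plan-weighted average of the target points."""
    n, m = len(source), len(target)
    a = [1.0 / n] * n  # assumption: uniform weights on source samples
    b = [1.0 / m] * m  # assumption: uniform weights on target samples
    cost = [[squared_euclidean(s, t) for t in target] for s in source]
    # Assumption: scale the cost to [0, 1] so exp(-C / reg) does not underflow.
    c_max = max(max(row) for row in cost) or 1.0
    cost = [[c / c_max for c in row] for row in cost]
    plan = sinkhorn_plan(cost, a, b, reg, n_iters)
    dim = len(target[0])
    mapped = []
    for i in range(n):
        row_mass = sum(plan[i])
        mapped.append(
            [sum(plan[i][j] * target[j][d] for j in range(m)) / row_mass
             for d in range(dim)]
        )
    return mapped, plan


# Toy data: three 2-D source points and a shifted copy acting as the target.
source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
target = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
mapped, plan = transport_source_to_target(source, target)
```

In this sketch each source would run the transport locally against the target samples and then train its model on the mapped points; how target samples are made available to each source is outside the scope of the example.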
Abstract
Multi-source Domain Adaptation (MDA) seeks to adapt models trained on multiple labeled source domains so that they perform effectively on an unlabeled target domain, assuming access to the source data. To address the challenges of model adaptation and data privacy, we introduce Collaborative MDA Through Optimal Transport (CMDA-OT), a novel framework consisting of two key phases. In the first phase, each source domain is independently adapted to the target domain using optimal transport methods. In the second phase, a centralized collaborative learning architecture aggregates the $N$ models from the $N$ sources without accessing their data, thereby safeguarding privacy. During this process, the server uses a small set of pseudo-labeled samples from the target domain, known as the target validation subset, to refine and guide the adaptation. This two-phase approach not only improves model performance on the target domain but also addresses the privacy challenges inherent in domain adaptation.
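The second phase can be sketched as a FedAvg-style weighted average of the source models' parameters, with weights derived from the target validation subset. The abstract does not specify the weighting rule, so the choice below (weights proportional to each model's accuracy on the pseudo-labeled validation samples) is an assumption made for illustration, as are the toy linear classifier and the names `validation_accuracy`, `aggregate`, and `predict`.

```python
def validation_accuracy(predict, params, val_x, val_y):
    """Fraction of target validation samples on which a model matches the pseudo-labels."""
    correct = sum(1 for x, y in zip(val_x, val_y) if predict(params, x) == y)
    return correct / len(val_y)


def aggregate(client_params, weights):
    """Weighted average of flat parameter lists (FedAvg-style aggregation)."""
    total = sum(weights)
    if total <= 0:
        # Fall back to a plain average if every model scores zero.
        weights = [1.0] * len(client_params)
        total = float(len(client_params))
    norm = [w / total for w in weights]
    n_params = len(client_params[0])
    return [
        sum(w * params[k] for w, params in zip(norm, client_params))
        for k in range(n_params)
    ]


def predict(params, x):
    """Hypothetical 1-D linear classifier: label 1 if w * x + b > 0, else 0."""
    w, b = params
    return 1 if w * x + b > 0 else 0


# Parameters [w, b] received from three sources; only these reach the server.
client_params = [[1.0, -0.5], [1.0, -2.0], [-1.0, 0.0]]

# Small pseudo-labeled target validation subset held by the server.
val_x = [0.0, 1.0, 2.0, 3.0]
val_y = [0, 1, 1, 1]

# Assumption: aggregation weight = validation accuracy of each source model.
accuracies = [
    validation_accuracy(predict, params, val_x, val_y) for params in client_params
]
global_params = aggregate(client_params, accuracies)
```

The server only ever handles parameter lists and its own validation subset, which mirrors the privacy argument above: source data stay on the source side, and a source whose model transfers poorly to the target contributes less to the aggregated model.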
