Collaborative Multi-source Domain Adaptation Through Optimal Transport

Omar Ghannou; Younès Bennani

TL;DR

This paper tackles unsupervised multi-source domain adaptation under strict data-privacy constraints by combining optimal transport (OT)-based data alignment with privacy-preserving collaborative learning. It proposes CMDA-OT, a two-phase framework: first, each source is independently transported to the target space using Sinkhorn-regularized OT; then a centralized Federated Learning (FedAvg) stage aggregates the $N$ source models without sharing raw data. A small target-domain validation subset provides pseudo-label guidance and dynamic weighting to refine both representations and aggregation. Privacy is preserved because no source data is exposed. Empirical results on Office-Caltech-10 and VLSC show significant and stable improvements over strong baselines, with statistical tests confirming the robustness of the gains across diverse target domains.
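To make the first phase concrete, the following is a minimal NumPy sketch of entropy-regularized OT solved with Sinkhorn iterations, followed by a barycentric mapping of source samples into the target space. It is an illustration under stated assumptions, not the authors' implementation: the function names `sinkhorn_plan` and `transport_source_to_target`, the squared Euclidean cost, the uniform sample weights, the barycentric mapping step, and the values of `reg` and `n_iters` are all hypothetical choices made for this example.

```python
import numpy as np

def sinkhorn_plan(a, b, cost, reg=0.1, n_iters=500):
    """Entropy-regularized OT plan between histograms a and b.

    Alternately rescales the rows and columns of the Gibbs kernel
    (Sinkhorn iterations). Illustrative sketch only.
    """
    K = np.exp(-cost / reg)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # match column marginals
        u = a / (K @ v)                  # match row marginals
    return u[:, None] * K * v[None, :]

def transport_source_to_target(Xs, Xt, reg=0.1, n_iters=500):
    """Map each source sample to a weighted average of target samples."""
    ns, nt = len(Xs), len(Xt)
    a = np.full(ns, 1.0 / ns)            # assumption: uniform source weights
    b = np.full(nt, 1.0 / nt)            # assumption: uniform target weights
    # Assumption: squared Euclidean ground cost, scaled to [0, 1].
    cost = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(axis=2)
    cost = cost / cost.max()
    plan = sinkhorn_plan(a, b, cost, reg, n_iters)
    # Barycentric mapping: row-normalize the plan and average target points.
    return (plan @ Xt) / plan.sum(axis=1, keepdims=True)
```

In a setting like the one described, each source client would run such a step locally against unlabeled target samples, so that only the model trained on the transported data, and not the source data itself, leaves the client.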

Abstract

Multi-source Domain Adaptation (MDA) seeks to adapt models trained on multiple labeled source domains so that they perform effectively on unlabeled target-domain data, and it typically assumes access to the source data. To address the challenges of model adaptation and data privacy, we introduce Collaborative MDA Through Optimal Transport (CMDA-OT), a novel framework consisting of two key phases. In the first phase, each source domain is independently adapted to the target domain using optimal transport methods. In the second phase, a centralized collaborative learning architecture aggregates the $N$ models from the $N$ sources without accessing their data, thereby safeguarding privacy. During this process, the server leverages a small set of pseudo-labeled samples from the target domain, known as the target validation subset, to refine and guide the adaptation. This dual-phase approach not only improves model performance on the target domain but also addresses the privacy challenges inherent in domain adaptation.
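The second phase can be sketched as a weighted variant of federated averaging in which the server scores each source model on the pseudo-labeled target validation subset and uses those scores as aggregation coefficients. This is a hedged illustration only: the abstract does not specify the weighting rule, so the accuracy-proportional coefficients, the uniform fallback, the dictionary-of-arrays parameter format, and the function names `validation_accuracy` and `aggregate_weights` are assumptions made for this example.

```python
import numpy as np

def validation_accuracy(predict, X_val, y_val_pseudo):
    """Fraction of validation samples whose prediction matches the pseudo-label."""
    return float(np.mean(predict(X_val) == y_val_pseudo))

def aggregate_weights(client_params, client_scores):
    """Average client parameters, weighted by validation scores.

    client_params: list of dicts mapping parameter name -> np.ndarray
    client_scores: one non-negative score per client
    Assumption: coefficients are proportional to the scores; if every
    score is zero, fall back to a plain (uniform) FedAvg-style average.
    """
    scores = np.asarray(client_scores, dtype=float)
    if scores.sum() <= 0:
        coeffs = np.full(len(scores), 1.0 / len(scores))
    else:
        coeffs = scores / scores.sum()
    names = client_params[0].keys()
    return {
        name: sum(c * params[name] for c, params in zip(coeffs, client_params))
        for name in names
    }
```

For example, two clients with parameters `{"w": [1, 1]}` and `{"w": [3, 3]}` and scores 0.25 and 0.75 would yield an aggregated `w` of `[2.5, 2.5]`. Only model parameters and scalar scores are exchanged in this sketch, which is consistent with the stated goal of never sharing source data.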

Paper Structure (14 sections, 7 equations, 7 figures, 5 tables)

Introduction
Fundamental background of the proposed approach
Unsupervised Multi-Source Domain Adaptation
Optimal Transport
Collaborative Learning with Federated Learning as Instance
Our Approach
Overall Framework of CMDA-OT
Pseudo-Labeling the Target Domain Validation Data
On-Client Testing and Server Weights Aggregation
Collaborative Learning Framework: Adaptability and Scalability
Experiments
Datasets
Hyperparameters Tuning
Results and Discussion

Figures (7)

Figure 1: Centralized Federated Learning
Figure 2: Decentralized Federated Learning
Figure 3: Overall Framework of CMDA-OT
Figure 4: HOT correspondence Matrix
Figure 5: Friedman and Nemenyi test for comparing multiple approaches over Office-Caltech10 data sets: Approaches are ordered from right (the best) to left (the worst)
...and 2 more figures
