Collaborative Multi-source Domain Adaptation Through Optimal Transport

Omar Ghannou, Younès Bennani

TL;DR

This paper tackles unsupervised multi-source domain adaptation under strict data privacy constraints by combining optimal transport-based data alignment with privacy-preserving collaborative learning. It proposes CMDA-OT, a two-phase framework: each source is first transported independently to the target space using Sinkhorn-regularized OT, and a centralized Federated Learning (FedAvg) stage then aggregates the $N$ source models without sharing raw data. A small target-domain validation subset provides pseudo-label guidance and dynamic weighting that refine both the representations and the aggregation. Privacy is preserved because no source data is exposed. Empirical results on Office-Caltech-10 and VLCS show significant and stable improvements over strong baselines, and statistical tests confirm the robustness of the gains across diverse target domains.
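To make the first phase concrete, the following is a minimal NumPy sketch of entropy-regularized (Sinkhorn) optimal transport followed by a barycentric mapping of source samples onto the target space. It is an illustration under stated assumptions, not the authors' implementation: the squared Euclidean cost, the uniform sample weights, the regularization value `reg`, the barycentric mapping step, and the function names `sinkhorn_plan` and `transport_to_target` are all choices made here for the example.

```python
import numpy as np

def sinkhorn_plan(a, b, cost, reg=0.1, n_iters=500):
    """Entropy-regularized OT plan between histograms a (source) and b (target)."""
    K = np.exp(-cost / reg)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # scale columns toward marginal b
        u = a / (K @ v)                  # scale rows toward marginal a
    return u[:, None] * K * v[None, :]

def transport_to_target(Xs, Xt, reg=0.1, n_iters=500):
    """Map each source sample to a weighted average of target samples."""
    n_s, n_t = len(Xs), len(Xt)
    a = np.full(n_s, 1.0 / n_s)          # assumption: uniform source weights
    b = np.full(n_t, 1.0 / n_t)          # assumption: uniform target weights
    # Assumption: squared Euclidean cost, normalized so exp() stays well scaled.
    cost = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(axis=2)
    cost = cost / cost.max()
    plan = sinkhorn_plan(a, b, cost, reg, n_iters)
    Xs_mapped = (plan @ Xt) / plan.sum(axis=1, keepdims=True)
    return Xs_mapped, plan

# Toy data: one hypothetical source domain shifted away from the target.
rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(40, 2))
Xt = rng.normal(3.0, 1.0, size=(50, 2))
Xs_mapped, plan = transport_to_target(Xs, Xt)
```

In this sketch each source would run the mapping locally against the target samples, so only the resulting model, not the source data, would need to leave the source.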

Abstract

Multi-source Domain Adaptation (MDA) seeks to adapt models trained on multiple labeled source domains so that they perform effectively on an unlabeled target domain, assuming access to the source data. To address the challenges of model adaptation and data privacy, we introduce Collaborative MDA Through Optimal Transport (CMDA-OT), a novel framework consisting of two key phases. In the first phase, each source domain is independently adapted to the target domain using optimal transport methods. In the second phase, a centralized collaborative learning architecture aggregates the N models from the N sources without accessing their data, thereby safeguarding privacy. During this process, the server leverages a small set of pseudo-labeled samples from the target domain, known as the target validation subset, to refine and guide the adaptation. This dual-phase approach not only improves model performance on the target domain but also addresses vital privacy challenges inherent in domain adaptation.
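The second phase can be pictured as a server that averages the N source models with weights derived from how well each one performs on the target validation subset. The sketch below is a hedged illustration only: the abstract does not give the weighting rule, so the softmax over validation accuracies, the `temperature` parameter, and the helper names `validation_weights` and `aggregate` are assumptions introduced here. Standard FedAvg would instead weight each client by its sample count.

```python
import numpy as np

def validation_weights(accuracies, temperature=1.0):
    """Assumed rule: softmax over per-source accuracies on the target validation subset."""
    scores = np.asarray(accuracies, dtype=float) / temperature
    scores = scores - scores.max()       # subtract max for numerical stability
    w = np.exp(scores)
    return w / w.sum()

def aggregate(client_params, weights):
    """FedAvg-style weighted average of per-layer parameter arrays."""
    n_layers = len(client_params[0])
    return [
        sum(w * params[k] for w, params in zip(weights, client_params))
        for k in range(n_layers)
    ]

# Three hypothetical source models, each represented as [weight_matrix, bias].
clients = [[np.full((2, 2), float(i)), np.full(2, float(i))] for i in (1, 2, 3)]

# Hypothetical accuracies of each source model on the target validation subset.
weights = validation_weights([0.60, 0.75, 0.90], temperature=0.1)
global_params = aggregate(clients, weights)

# Uniform weights recover a plain unweighted average for comparison.
uniform_params = aggregate(clients, [1.0 / 3.0] * 3)
```

Only parameter arrays and scalar accuracies reach the server in this sketch, which matches the stated constraint that source data is never shared.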

Paper Structure (14 sections, 7 equations, 7 figures, 5 tables)

Figures (7)

  • Figure 1: Centralized Federated Learning
  • Figure 2: Decentralized Federated Learning
  • Figure 3: Overall Framework of CMDA-OT
  • Figure 4: HOT correspondence Matrix
  • Figure 5: Friedman and Nemenyi test for comparing multiple approaches over Office-Caltech10 data sets: Approaches are ordered from right (the best) to left (the worst)
  • ...and 2 more figures