More is Better: Deep Domain Adaptation with Multiple Sources
Sicheng Zhao, Hui Chen, Hu Huang, Pengfei Xu, Guiguang Ding
TL;DR
This survey formalizes multi-source domain adaptation (MDA) as learning from multiple labeled sources with distributions $p_i(\mathbf{x},\mathbf{y})$ and a target with distribution $p_T(\mathbf{x},\mathbf{y})$, and organizes deep MDA methods into latent space transformation, intermediate domain generation, and task classifier refinement. It discusses matching strategies and special settings such as federated and source-free MDA, and provides datasets, benchmarks, and practical guidelines. Empirically, domain alignment improves target accuracy over source-only baselines across multiple datasets, though substantial gaps to oracle performance remain. The work offers a roadmap for designing robust MDA systems and outlines promising directions such as multi-modal MDA, test-time adaptation, and theory-informed analysis.
Abstract
In many practical applications, it is often difficult and expensive to obtain large-scale labeled data to train state-of-the-art deep neural networks. Therefore, transferring the learned knowledge from a separate, labeled source domain to an unlabeled or sparsely labeled target domain becomes an appealing alternative. However, direct transfer often results in significant performance decay due to domain shift. Domain adaptation (DA) aims to address this problem by aligning the distributions between the source and target domains. Multi-source domain adaptation (MDA) is a powerful and practical extension in which the labeled data may be collected from multiple sources with different distributions. In this survey, we first define various MDA strategies. Then we systematically summarize and compare modern MDA methods in the deep learning era from different perspectives, followed by commonly used datasets and a brief benchmark. Finally, we discuss future research directions for MDA that are worth investigating.
