UDA-Bench: Revisiting Common Assumptions in Unsupervised Domain Adaptation Using a Standardized Framework

Tarun Kalluri; Sreyas Ravichandran; Manmohan Chandraker

TL;DR

The paper introduces UDA-Bench, a standardized PyTorch framework for fair, cross-method evaluation of unsupervised domain adaptation (UDA). Through a large-scale empirical study across diverse datasets, it examines three factors: backbone architecture, unlabeled data volume, and pre-training data. The study shows that adaptation gains shrink with stronger backbones, that additional unlabeled target data yields diminishing returns, and that pre-training data significantly shapes downstream adaptation in both supervised and self-supervised settings. Newer vision-transformer backbones improve cross-domain robustness but often reduce the relative benefit of UDA methods, while in-task pre-training yields substantial improvements. The work challenges some conventional beliefs about unlabeled data efficiency, underscores the need for standardized benchmarks, offers practical guidance to researchers and practitioners, and contributes open-source resources for future UDA research.

Abstract

In this work, we take a deeper look into the diverse factors that influence the efficacy of modern unsupervised domain adaptation (UDA) methods using a large-scale, controlled empirical study. To facilitate our analysis, we first develop UDA-Bench, a novel PyTorch framework that standardizes training and evaluation for domain adaptation, enabling fair comparisons across several UDA methods. Using UDA-Bench, our comprehensive empirical study of the impact of backbone architectures, unlabeled data quantity, and pre-training datasets reveals that: (i) the benefits of adaptation methods diminish with advanced backbones, (ii) current methods underutilize unlabeled data, and (iii) pre-training data significantly affects downstream adaptation in both supervised and self-supervised settings. In the context of unsupervised adaptation, these observations uncover several novel and surprising properties, while scientifically validating several others that were often considered empirical heuristics or practitioner intuitions in the absence of a standardized training and evaluation framework. The UDA-Bench framework and trained models are publicly available at https://github.com/ViLab-UCSD/UDABench_ECCV2024.

Paper Structure (15 sections, 38 figures, 5 tables)

Introduction
Related Works
Analysis Setup
Methodology and Evaluation
Which backbone architectures suit UDA best?
How much unlabeled data can UDA methods use?
Does pre-training data matter in UDA?
Conclusion
UDABench: Code Overview
Additional Results on DomainNet and OfficeHome
Results Using Additional UDA Methods
Source Labeled vs. Target Unlabeled Data
Results using TinyImageNet
Training Details
Unsupervised Pre-Training Network Details

Figures (38)

Figure 1: Backbone
Figure 2: Sample Efficiency of Target Data
Figure 3: Type of Pre-Training Data
Figure 5: Need for UDA-Bench. We illustrate the disparity between the codebases released for prior UDA methods by highlighting the different accuracies they report for a plain source-only model. Because this model is trained without any adaptation, its accuracy should ideally match across implementations, which is clearly not the case. To enable fair comparisons across UDA methods, we propose UDA-Bench, a new PyTorch framework that standardizes training and evaluation across methods.
Figure 6: DomainNet (Real$\rightarrow$Clipart)
...and 33 more figures
