3rd Place Solution for VisDA 2021 Challenge -- Universally Domain Adaptive Image Recognition

Haojin Liao; Xiaolin Song; Sicheng Zhao; Shanghang Zhang; Xiangyu Yue; Xingxu Yao; Yueming Zhang; Tengfei Xing; Pengfei Xu; Qiang Wang

TL;DR

This work tackles universal domain adaptation (UniDA) in the VisDA 2021 Challenge by combining a Transformer-based VOLO backbone with OVANet-inspired open-set handling and an adversarial domain discriminator. The key contributions are: using VOLO-D3 as the feature extractor, adopting Token Labeling with VOLO-compatible augmentations, expanding the near-negative open-set classifiers, and introducing a gradient-reversal domain discriminator to align the source and target distributions of known classes. With a two-stage training regime and 5-crop inference, the approach places 3rd on the VisDA 2021 leaderboard with $48.49\%$ ACC and $70.8\%$ AUROC. The results suggest that Transformer-based feature representations and explicit distribution alignment are effective for open-world domain adaptation in large-scale, multi-class settings.

Abstract

The Visual Domain Adaptation (VisDA) 2021 Challenge calls for unsupervised domain adaptation (UDA) methods that can handle both input distribution shift and label set variance between the source and target domains. In this report, we introduce a universal domain adaptation (UniDA) method that aggregates several popular feature extraction and domain adaptation schemes. First, we use VOLO, a Transformer-based architecture with state-of-the-art performance on several visual tasks, as the backbone to extract effective feature representations. Second, we modify the open-set classifier of OVANet to recognize the unknown class with competitive accuracy and robustness. Our proposed UniDA method ranks 3rd on the VisDA 2021 Challenge leaderboard, with 48.49% ACC and 70.8% AUROC.
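
To make the role of the open-set classifier concrete, the following is a minimal sketch of OVANet-style inference in plain Python, not the authors' implementation. It assumes a closed-set head that yields one score per known class and a one-vs-all head that yields a pair of scores per known class. The function name `predict_with_rejection`, the `(unknown, known)` ordering of each pair, the `UNKNOWN` sentinel, and the 0.5 threshold are all illustrative assumptions.

```python
import math

UNKNOWN = -1  # hypothetical sentinel label for the "unknown" class


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_rejection(closed_logits, open_logits, threshold=0.5):
    """OVANet-style decision for one sample (illustrative sketch).

    closed_logits: K scores from the closed-set classifier.
    open_logits:   K pairs (unknown_score, known_score), one pair per
                   known class; this ordering is an assumption.
    Returns (label, unknown_score), where label is a class index or UNKNOWN.
    """
    if len(closed_logits) != len(open_logits):
        raise ValueError("both heads must cover the same K known classes")

    # Step 1: the closed-set head nominates the most likely known class.
    probs = softmax(closed_logits)
    k = max(range(len(probs)), key=probs.__getitem__)

    # Step 2: the one-vs-all head for that class decides known vs. unknown.
    p_known = softmax(list(open_logits[k]))[1]

    label = k if p_known >= threshold else UNKNOWN
    return label, 1.0 - p_known
```

The returned label is the kind of prediction an accuracy metric consumes, while the continuous unknown score is the kind of quantity that can be ranked to compute AUROC for known-versus-unknown separation.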
