Test-Time Domain Adaptation by Learning Domain-Aware Batch Normalization

Yanan Wu; Zhixiang Chi; Yang Wang; Konstantinos N. Plataniotis; Songhe Feng

TL;DR

This work tackles test-time domain adaptation (TT-DA) by decoupling label and domain knowledge. It proposes Meta-Adaptive BN (MABN), which updates only the affine parameters of batch normalization (BN) while keeping the source statistics fixed. This BN-centric adaptation is combined with a label-independent self-supervised auxiliary branch and a bi-level meta-learning framework that aligns the auxiliary SSL objective with the main task, enabling robust domain adaptation from a few unlabeled target samples. Empirical results on five WILDS benchmarks and DomainNet show that MABN outperforms prior TT-DA methods such as ARM and Meta-DMoE, and ablations confirm the necessity of affine-only BN updates and meta-auxiliary training. The approach keeps the same inference cost as the base model and can be combined with entropy-based TTA methods to further improve performance, offering a practical and scalable solution for real-world domain shifts.

Abstract

Test-time domain adaptation aims to adapt a model trained on source domains to unseen target domains using a few unlabeled images. Emerging research has shown that label and domain information are embedded separately in the weight matrices and batch normalization (BN) layers. Previous works typically update the whole network naively, without explicitly decoupling label knowledge from domain knowledge, which leads to knowledge interference and defective distribution adaptation. In this work, we propose to reduce such learning interference and improve domain knowledge learning by manipulating only the BN layers. However, the normalization step in BN is intrinsically unstable when the statistics are re-estimated from a few samples. We find that ambiguities can be greatly reduced when only the two affine parameters in BN are updated while the source domain statistics are kept. To further enhance domain knowledge extraction from unlabeled data, we construct an auxiliary branch with label-independent self-supervised learning (SSL) to provide supervision. Moreover, we propose a bi-level optimization based on meta-learning to enforce alignment between the learning objectives of the auxiliary and main branches. The goal is to use the auxiliary branch to adapt to the domain and benefit the main task for subsequent inference. Our method keeps the same computational cost at inference, as the auxiliary branch can be discarded entirely after adaptation. Extensive experiments show that our method outperforms prior works on five WILDS real-world domain shift datasets. Our method can also be integrated with methods that use label-dependent optimization to further push the performance boundary. Our code is available at https://github.com/ynanwu/MABN.
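
To make the affine-only update concrete, the following is a minimal single-channel sketch in plain Python, not the released MABN code. It keeps the source mean and variance frozen and runs gradient descent on the two affine parameters only. The function names (`bn_forward`, `adapt_affine`), the hyperparameters, and the label-free loss are all assumptions made for illustration; in particular, the loss is a simple stand-in that pulls the BN output toward zero mean and unit variance, and it is not the paper's SSL objective.

```python
def bn_forward(x, mean, var, gamma, beta, eps=1e-5):
    """Normalize with frozen source statistics, then apply the affine transform."""
    return [gamma * (v - mean) / (var + eps) ** 0.5 + beta for v in x]


def adapt_affine(x, mean, var, gamma, beta, lr=0.002, steps=10000, eps=1e-5):
    """Gradient descent on gamma and beta only; mean and var are never updated.

    Stand-in label-free loss (an assumption, not the paper's SSL loss):
        L = m**2 + (v - 1)**2
    where m and v are the mean and variance of the BN output on the batch.
    """
    # Normalized inputs under the frozen source statistics.
    x_hat = [(v - mean) / (var + eps) ** 0.5 for v in x]
    n = len(x_hat)
    mu_h = sum(x_hat) / n
    var_h = sum((h - mu_h) ** 2 for h in x_hat) / n
    for _ in range(steps):
        m = gamma * mu_h + beta      # mean of gamma * x_hat + beta
        v = gamma ** 2 * var_h       # variance of gamma * x_hat + beta
        grad_gamma = 2 * m * mu_h + 2 * (v - 1) * 2 * gamma * var_h
        grad_beta = 2 * m
        gamma -= lr * grad_gamma
        beta -= lr * grad_beta
    return gamma, beta
```

For example, with source statistics `mean=0.0, var=1.0` and a shifted target batch `[1.0, 5.0, 1.0, 5.0]`, calling `adapt_affine` from `gamma=1.0, beta=0.0` moves the affine parameters so that the BN output is re-centered and re-scaled, while the stored statistics stay exactly as they were. The small step size is chosen for stability of this toy loss and is not a recommended setting.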

Paper Structure (27 sections, 8 equations, 6 figures, 7 tables, 1 algorithm)

Introduction
Related Work
Domain shift.
Batch normalization.
Meta-learning.
The Proposed Method
Problem setting.
Motivations
Learning label-dependent representation
Learning to adapt to unseen domain knowledge
Meta-auxiliary training.
Meta-auxiliary testing.
Experiments
Dataset and evaluation metrics.
Model architectures.
...and 12 more sections

Figures (6)

Figure 1: Top: Illustration of TT-DA setting. Given an unseen target domain at test-time, a few unlabeled data are used for adaptation and the adapted model is then used for inference. Bottom: To maximize the extraction of domain knowledge from unlabeled data, we only update the two affine parameters in BN using the label-independent self-supervised loss. The goal is to enforce the adapted affine parameters to correct the feature distribution towards target domain.
Figure 2: Overview of the proposed MABN. In the joint training stage (a), we train the entire network to learn both label knowledge and normalization statistics by mixing all the source data and performing joint training. During the meta-auxiliary training stage (b), we first obtain the adapted parameters based on the auxiliary loss in the inner loop. Then, the meta-model is updated at the outer loop based on the main task loss computed on adapted parameters. At test-time (c), we simply apply the adaptation step to update the model specifically to an unseen target domain.
Figure 3: t-SNE visualization of features before and after adaptation. Each data sample is represented as a point, and each color corresponds to a class randomly selected from the target domain of the iWildCam dataset.
Figure 4: Illustration of the intermediate features before and after BN for each adaptation method. We randomly select a layer and a channel for illustration. The first column shows the distribution of the feature before BN layer and the second column shows the feature after BN.
Figure 5: Adaptive evaluation on various numbers of unlabeled samples without meta-auxiliary training.
...and 1 more figure
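
The inner and outer loops described in the Figure 2 caption can be illustrated with a scalar toy problem. This is a sketch under strong simplifying assumptions, not the paper's training procedure: `b0` is a hypothetical meta-learned initialization of one adaptable parameter, `a` is a hypothetical auxiliary-head scale, `s` is an unlabeled summary of a domain, and both losses are quadratics chosen so that the gradients can be written by hand. The inner loop takes one step on the auxiliary loss; the outer loop differentiates the main loss through that step.

```python
def inner_adapt(b0, a, s, alpha=0.5):
    """One inner-loop step on the auxiliary loss L_aux(b) = 0.5 * (b - a*s)**2."""
    grad_b = b0 - a * s
    return b0 - alpha * grad_b


def meta_train(domains, c=2.0, alpha=0.5, eta=0.1, epochs=500):
    """Outer loop: update (b0, a) using the main loss at the adapted parameter.

    Main loss per domain (toy assumption): L_main(b') = 0.5 * (b' - c*s)**2,
    where c*s is the value of b that is best for the main task in that domain.
    """
    b0, a = 1.0, 0.0
    for _ in range(epochs):
        for s in domains:
            b_adapted = inner_adapt(b0, a, s, alpha)
            residual = b_adapted - c * s      # dL_main / db'
            # Chain rule through b' = (1 - alpha) * b0 + alpha * a * s
            grad_b0 = residual * (1.0 - alpha)
            grad_a = residual * alpha * s
            b0 -= eta * grad_b0
            a -= eta * grad_a
    return b0, a
```

After `meta_train` on a handful of source domains, a single call to `inner_adapt` on a new value of `s` uses only the auxiliary loss, yet lands near the parameter that the main loss prefers. That is the alignment between auxiliary and main objectives that the bi-level scheme is meant to enforce, shown here only in this simplified setting.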
