Distribution Alignment for Fully Test-Time Adaptation with Dynamic Online Data Streams

Ziqiang Wang, Zhixiang Chi, Yanan Wu, Li Gu, Zhi Liu, Konstantinos Plataniotis, Yang Wang

TL;DR

This work tackles fully Test-Time Adaptation (TTA) under dynamic, non-i.i.d. data streams by reversing the conventional adaptation objective: instead of adapting the model to incoming test distributions, it aligns test-time feature distributions with the source distribution using a Distribution Alignment (DA) loss applied to the affine parameters of batch normalization (BN) layers. For continual TTA, a domain-shift detection mechanism monitors the DA loss and resets the affine parameters and BN statistics, enabling robust adaptation across sequentially changing domains. The authors report that DA-TTA surpasses existing methods on non-i.i.d. streams across six benchmark datasets while remaining competitive under i.i.d. conditions. The approach is practical for real-world systems facing continual domain shifts because it keeps test-time features compatible with the well-trained source model and avoids conflicting optimization objectives between batches during online adaptation.

Abstract

Given a model trained on source data, Test-Time Adaptation (TTA) enables adaptation and inference on test data streams with domain shifts from the source. Current methods predominantly optimize the model for each incoming test data batch using a self-training loss. While these methods yield commendable results on ideal test data streams, where batches are independently and identically sampled from the target distribution, they falter under more practical test data streams that are not independent and identically distributed (non-i.i.d.). The data batches in a non-i.i.d. stream display prominent label shifts relative to each other, which leads to conflicting optimization objectives among batches during the TTA process. Given the inherent risks of adapting the source model to unpredictable test-time distributions, we reverse the adaptation process and propose a novel Distribution Alignment loss for TTA. This loss guides the distributions of test-time features back towards the source distributions, which ensures compatibility with the well-trained source model and eliminates the pitfalls associated with conflicting optimization objectives. Moreover, we devise a domain shift detection mechanism to extend our TTA method to continual domain shift scenarios. Our extensive experiments validate the logic and efficacy of our method. On six benchmark datasets, we surpass existing methods in non-i.i.d. scenarios and maintain competitive performance under the ideal i.i.d. assumption.
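
As a rough illustration of the idea of pulling test-time feature statistics back toward the source, the following minimal sketch compares per-channel statistics of a test batch against pre-computed source statistics. The exact form of the loss is not given in this section, so the L1 distance, the use of standard deviation rather than variance, and the names `per_sample_channel_stats` and `distribution_alignment_loss` are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import fmean, pstdev


def per_sample_channel_stats(batch):
    """Compute the mean and standard deviation of every channel of every sample.

    batch[n][c] is a flat list of activations for sample n, channel c
    (hypothetical layout chosen for this sketch).
    """
    means = [[fmean(channel) for channel in sample] for sample in batch]
    stds = [[pstdev(channel) for channel in sample] for sample in batch]
    return means, stds


def distribution_alignment_loss(batch, source_mean, source_std):
    """Assumed L1 distance between batch-averaged per-sample statistics
    and pre-computed per-channel source statistics."""
    means, stds = per_sample_channel_stats(batch)
    n_channels = len(source_mean)
    loss = 0.0
    for c in range(n_channels):
        avg_mean = fmean([m[c] for m in means])  # average over samples
        avg_std = fmean([d[c] for d in stds])
        loss += abs(avg_mean - source_mean[c]) + abs(avg_std - source_std[c])
    return loss / n_channels


# Two samples, two channels, two activations per channel.
batch = [
    [[0.0, 2.0], [1.0, 1.0]],
    [[2.0, 4.0], [3.0, 3.0]],
]
aligned = distribution_alignment_loss(batch, [2.0, 2.0], [1.0, 0.0])
shifted = distribution_alignment_loss(batch, [0.0, 0.0], [1.0, 1.0])
print(aligned, shifted)
```

In a real model, a quantity of this kind would be differentiated with respect to the BN affine parameters only; the sketch shows just the statistic comparison, which is zero when the test-time statistics already match the source.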

Paper Structure (13 sections, 6 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: (a) Overview of our method for fully TTA. At each time step, the test data exhibits distribution shifts (label shifts). Our proposed method aligns the distributions of test-time features with those of the source, which not only mitigates the domain shift but also ensures robust TTA on non-i.i.d. data streams. (b) Average classification errors of TTA methods from CIFAR100 (source) to CIFAR100-C (target) under fully TTA with i.i.d. and non-i.i.d. data stream settings. Lower is better. The red "Source" line indicates the source model's result without adaptation. Label shifts in the non-i.i.d. test data stream degrade TTBN and self-training based methods (SAR and RoTTA are designed for non-i.i.d. data streams).
  • Figure 2: (a) Non-i.i.d. data streams. (b) Continual TTA with non-i.i.d. data streams.
  • Figure 3: (a) For clarity, we use $\bm{\mu}, \bm{\sigma}$ to denote the batch statistics, and $\bm{m}, \bm{d}$ to denote the mean and standard deviation of the feature distribution for each sample. (b) Visualization of the average statistics ($\overline{\bm{d}^{2}}(\mathbf{\hat{X}}^{(i)})$) of feature distributions in a BN layer of the source model under the TTBN method. We feed both source and target data streams (both i.i.d. and non-i.i.d.) and record the average variance for each channel in the corresponding BN layer. For clarity, the curves are downsampled by a factor of 5 and smoothed. The figure shows that the distribution statistics for the i.i.d. stream are close to those of the source data, associated with a $20\%$ error. In contrast, the statistics for the non-i.i.d. stream deviate notably from the source, associated with a $79\%$ error.
  • Figure 4: Illustration of the distribution alignment loss applied to features post-BN layers. This loss optimizes the affine parameters of BN layers to transform the feature distributions towards the pre-computed source distributions. Each circle represents the distribution of a channel in the feature following an affine layer, characterized by a mean (center) and a standard deviation (radius).
  • Figure 5: Robustness under various conditions of the test data stream.
  • ...and 1 more figure
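
For the continual setting in Figure 2(b), the abstract states that a domain shift detection mechanism monitors the alignment loss and triggers a reset. The sketch below shows one way such monitoring could look. The exponential moving average, the ratio threshold, and the class name `ShiftDetector` are hypothetical choices for illustration; this section does not specify the actual detection rule.

```python
class ShiftDetector:
    """Flag a possible domain shift when the monitored loss jumps well above
    its running average (assumed rule, not the paper's exact criterion)."""

    def __init__(self, momentum=0.9, ratio=2.0):
        self.momentum = momentum  # smoothing factor of the running average
        self.ratio = ratio        # jump factor that counts as a shift
        self.ema = None           # running average of the loss

    def update(self, loss):
        """Feed the loss of the current batch; return True if a shift is flagged."""
        if self.ema is None:
            self.ema = loss
            return False
        shifted = loss > self.ratio * self.ema
        if shifted:
            # A caller would restore the source affine parameters and BN
            # statistics here; tracking restarts from the current loss.
            self.ema = loss
        else:
            self.ema = self.momentum * self.ema + (1 - self.momentum) * loss
        return shifted


detector = ShiftDetector()
losses = [1.0, 0.9, 0.8, 2.5, 2.3, 2.0]  # made-up loss values
flags = [detector.update(value) for value in losses]
print(flags)
```

With these made-up values, only the jump from 0.8 to 2.5 exceeds twice the running average, so a single reset would be triggered at that step.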