Domain-Specific Block Selection and Paired-View Pseudo-Labeling for Online Test-Time Adaptation

Yeonguk Yu, Sungho Shin, Seunghyeok Back, Minhwan Ko, Sangjun Noh, Kyoobin Lee

TL;DR

This work addresses online test-time adaptation under continual domain shift with DPLOT, a framework that combines domain-specific block selection with paired-view pseudo-labeling. Before deployment, DPLOT identifies the blocks responsible for domain-specific feature extraction: it measures a prototype-based similarity for each block after entropy minimization and selects the blocks whose similarity exceeds a threshold, so that entropy minimization updates only those blocks during adaptation. After deployment, the selected blocks are trained by entropy minimization, and a mean teacher supplies the pseudo-labels for a paired-view consistency loss by averaging its predictions for the test image and its horizontally flipped counterpart, which supports stable long-term adaptation. Empirically, DPLOT achieves state-of-the-art performance on CIFAR10-C, CIFAR100-C, and ImageNet-C in both the continual and gradual settings, with error reductions of up to 5.4%, 9.1%, and 2.9%, respectively, and ablations support the contribution of each component. The method preserves domain-invariant features while adapting domain-specific components, offering a practical, source-free approach for non-stationary environments; code is released for reproducibility.
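The threshold-based selection described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the use of cosine similarity, the function names (`cosine_similarity`, `block_similarity`, `select_blocks`), the toy features, and the threshold value of 0.9 are all assumptions made for the example.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def block_similarity(features, labels, prototypes):
    """Mean similarity between each feature and the prototype of its class."""
    sims = [cosine_similarity(f, prototypes[y]) for f, y in zip(features, labels)]
    return sum(sims) / len(sims)

def select_blocks(similarity_per_block, threshold):
    """Return the names of blocks whose similarity exceeds the threshold."""
    return [name for name, sim in similarity_per_block.items() if sim > threshold]

# Hypothetical data: two class prototypes, and the features obtained after
# adapting each candidate block on its own with entropy minimization.
prototypes = {0: [1.0, 0.0], 1: [0.0, 1.0]}
labels = [0, 1]
features_after_adapting = {
    "block1": [[0.9, 0.1], [0.1, 0.9]],  # features stay close to the prototypes
    "block2": [[0.5, 0.5], [0.6, 0.4]],  # features drift away from the prototypes
}

similarity_per_block = {
    name: block_similarity(feats, labels, prototypes)
    for name, feats in features_after_adapting.items()
}
selected = select_blocks(similarity_per_block, threshold=0.9)  # assumed threshold
print(selected)
```

In this toy setting only `block1` keeps its features close to the class prototypes, so it is the only block marked for updating at test time.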

Abstract

Test-time adaptation (TTA) aims to adapt a pre-trained model to a new test domain without access to source data after deployment. Existing approaches typically rely on self-training with pseudo-labels, since ground-truth labels cannot be obtained from test data. Although the quality of pseudo-labels is important for stable and accurate long-term adaptation, this issue has not been addressed previously. In this work, we propose DPLOT, a simple yet effective TTA framework that consists of two components: (1) domain-specific block selection and (2) pseudo-label generation using paired-view images. Specifically, we select the blocks that are involved in domain-specific feature extraction and train these blocks by entropy minimization. After the blocks are adjusted for the current test domain, we generate pseudo-labels by averaging the predictions for the given test images and their flipped counterparts. By using only flip augmentation, we prevent the decrease in pseudo-label quality that can be caused by the domain gap resulting from strong augmentation. Our experimental results demonstrate that DPLOT outperforms previous TTA methods on the CIFAR10-C, CIFAR100-C, and ImageNet-C benchmarks, reducing error by up to 5.4%, 9.1%, and 2.9%, respectively. We also provide an extensive analysis to demonstrate the effectiveness of our framework. Code is available at https://github.com/gist-ailab/domain-specific-block-selection-and-paired-view-pseudo-labeling-for-online-TTA.
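The paired-view pseudo-labeling step can be illustrated with a short sketch. It assumes the pseudo-label is the mean of the softmax outputs for the two views; `toy_logits` is a hypothetical stand-in for a teacher network, and the image is a plain list of pixel rows so that the example runs without any dependencies.

```python
import math

def hflip(image):
    """Horizontally flip an image stored as a list of pixel rows."""
    return [row[::-1] for row in image]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def paired_view_pseudo_label(predict_logits, image):
    """Average the class probabilities of an image and its flipped view."""
    p_orig = softmax(predict_logits(image))
    p_flip = softmax(predict_logits(hflip(image)))
    return [(a + b) / 2.0 for a, b in zip(p_orig, p_flip)]

def toy_logits(image):
    """Hypothetical model: class 0 scores total brightness, class 1 scores
    left-half brightness, so its output changes when the image is flipped."""
    half = len(image[0]) // 2
    total = sum(sum(row) for row in image)
    left = sum(sum(row[:half]) for row in image)
    return [total, left]

image = [[1.0, 0.0],
         [1.0, 0.0]]
pseudo_label = paired_view_pseudo_label(toy_logits, image)
print(pseudo_label)
```

The toy model gives different probabilities for the two views, and the pseudo-label is their element-wise mean, so it remains a valid probability vector.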

Paper Structure

This paper contains 22 sections, 7 equations, 4 figures, 6 tables, and 2 algorithms.

Figures (4)

  • Figure 1: Results of the proposed framework for online test-time adaptation (orange). We report the average error rates of the WideResNet40 and ResNeXt-29 architectures on the CIFAR100-C gradual-setting benchmark for competitive test-time adaptation methods. In the gradual setting, the networks must adapt to continually changing corruption domains (135 changes in total).
  • Figure 2: Illustration of our proposed test-time adaptation using entropy minimization and paired-view consistency. At test time, the current test images and their flipped counterparts are given to the student model and the EMA teacher model. Entropy minimization updates the parameters of the selected blocks (yellow arrow), while all parameters are updated to minimize the difference between the student output and the averaged prediction of the EMA model (blue arrow).
  • Figure 3: Illustrations of our proposed block selection results (a, b) and classification error rates (@level 1-5) on the gradual-setting benchmark for other entropy minimization-based methods with and without block selection (c, d). Additionally, in (a) and (b), the source accuracy after long-term adaptation (i.e., the gradual-setting benchmark) with the selected blocks is shown as a bar graph.
  • Figure 4: Performance of our framework with various pseudo-label generation settings, including our paired-view setting (orange) and others. We use ResNeXt-29A for CIFAR100-C.
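The update scheme in the Figure 2 caption involves three ingredients that can be sketched independently of any deep learning library: an entropy loss, a consistency loss against the teacher's averaged prediction, and an exponential moving average (EMA) update of the teacher. The sketch below is illustrative only; using cross-entropy as the consistency loss, the momentum value, and the parameter name `block1.weight` are assumptions, not details taken from the paper.

```python
import math

def entropy(probs):
    """Shannon entropy of a probability vector (the entropy-minimization loss)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def cross_entropy(target, pred):
    """Cross-entropy between a soft pseudo-label and the student prediction.
    Assumes every predicted probability paired with a nonzero target is > 0."""
    return -sum(t * math.log(q) for t, q in zip(target, pred) if t > 0.0)

def ema_update(teacher, student, momentum=0.999):
    """Move each teacher parameter toward the matching student parameter."""
    return {
        name: [momentum * t + (1.0 - momentum) * s
               for t, s in zip(teacher[name], student[name])]
        for name in teacher
    }

# Hypothetical values for one test sample.
student_probs = [0.7, 0.3]   # student prediction
pseudo_label = [0.6, 0.4]    # teacher prediction averaged over the two views

entropy_loss = entropy(student_probs)
consistency_loss = cross_entropy(pseudo_label, student_probs)

# Hypothetical parameters; a small momentum is used so the change is visible.
teacher = {"block1.weight": [1.0, 2.0]}
student = {"block1.weight": [0.0, 4.0]}
new_teacher = ema_update(teacher, student, momentum=0.9)
print(entropy_loss, consistency_loss, new_teacher)
```

In a full implementation, the gradient of the entropy loss would be applied only to the selected blocks, while the gradient of the consistency loss would reach all student parameters, as the caption describes.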