POEM: Explore Unexplored Reliable Samples to Enhance Test-Time Adaptation

Chang'an Yi, Xiaohui Deng, Shuaicheng Niu, Yan Zhou

TL;DR

POEM addresses the limitation of entropy-threshold based test-time adaptation by mining potentially reliable samples that can become reliable after model updates, providing stable supervisory signals and well-behaved gradients. It introduces an Adapt Branch network that cooperates with a frozen Source Branch to balance learning target-specific information with preserving domain-agnostic knowledge, updating only shallow normalization layers and the Adapt Branch. Empirical results on ImageNet-C, CIFAR100-C, and real-world domain shifts show POEM consistently outperforms state-of-the-art entropy-based TTA methods and can augment existing approaches with negligible overhead, while remaining robust to threshold choices and suitable for real-time deployment. These findings suggest POEM as a versatile augmentation to enhance TTA across diverse, challenging domain shifts.

Abstract

Test-time adaptation (TTA) aims to transfer knowledge from a source model to unknown test data with potential distribution shifts in an online manner. Many existing TTA methods rely on entropy as a confidence metric to optimize the model. However, these approaches are sensitive to the predefined entropy threshold, influencing which samples are chosen for model adaptation. Consequently, potentially reliable target samples are often overlooked and underutilized. For instance, a sample's entropy might slightly exceed the threshold initially, but fall below it after the model is updated. Such samples can provide stable supervised information and offer a normal range of gradients to guide model adaptation. In this paper, we propose a general approach, POEM, to promote TTA via exploring the previously unexplored reliable samples. Additionally, we introduce an extra Adapt Branch network to strike a balance between extracting domain-agnostic representations and achieving high performance on target data. Comprehensive experiments across multiple architectures demonstrate that POEM consistently outperforms existing TTA methods in both challenging scenarios and real-world domain shifts, while remaining computationally efficient. The effectiveness of POEM is evaluated through extensive analyses and thorough ablation studies. Moreover, the core idea behind POEM can be employed as an augmentation strategy to boost the performance of existing TTA approaches. The source code is publicly available at https://github.com/ycarobot/POEM.
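The notion of a "potentially reliable" sample described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the toy logits, and the threshold of 0.4 × ln(C) are assumptions chosen for illustration, and the "after update" logits are written by hand rather than produced by a real adaptation step. The sketch only shows the selection rule as stated: a sample whose prediction entropy exceeds the threshold before an update but falls to or below it afterwards.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def entropy(logits):
    # Shannon entropy (in nats) of each row's predicted distribution.
    p = softmax(logits)
    return -(p * np.log(p + 1e-12)).sum(axis=1)

def potentially_reliable(logits_before, logits_after, threshold):
    # Hypothetical helper: True where a sample was rejected by the
    # entropy threshold before the update but passes it afterwards.
    h_before = entropy(logits_before)
    h_after = entropy(logits_after)
    return (h_before > threshold) & (h_after <= threshold)

num_classes = 3
# Assumed threshold for illustration only.
threshold = 0.4 * np.log(num_classes)

# Toy logits for three samples before and after a (simulated) model update.
logits_before = np.array([[4.0, 0.0, 0.0],   # already confident
                          [1.0, 0.5, 0.0],   # uncertain at first
                          [0.0, 0.0, 0.0]])  # maximally uncertain
logits_after = np.array([[4.0, 0.0, 0.0],    # still confident
                         [4.0, 0.5, 0.0],    # becomes confident
                         [0.0, 0.0, 0.0]])   # stays uncertain

mask = potentially_reliable(logits_before, logits_after, threshold)
print(mask)
```

In this toy setup only the second sample is flagged: the first is already below the threshold before the update, and the third never drops below it. A fixed, one-shot threshold check would have discarded the second sample entirely.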

Paper Structure

This paper contains 27 sections, 6 equations, 6 figures, 12 tables, 1 algorithm.

Figures (6)

  • Figure 1: An illustration showing the motivation of this work. (a) Two samples (No. 4, No. 7) are initially predicted as high-entropy but are confirmed as reliable (low-entropy) after TTA. (b) While the number of potentially reliable samples is small, their pseudo-labels exhibit the highest confidence. (c) POEM outperforms existing approaches, and its general idea can also be integrated into these approaches to enhance their performance. The experiments are conducted on the ImageNet-C dataset, which contains 50,000 samples.
  • Figure 2: The overall framework of the proposed POEM approach. The selected reliable samples are utilized to update the normalization parameters of both the Shallow Layers and the Adapt Branch network. The selection and update process is repeated for a number of rounds, which is a hyper-parameter set beforehand.
  • Figure 3: Samples in the green zone provide a normal range of gradients and high-quality supervised information for TTA on the dataset ImageNet-C. Therefore, it is necessary to utilize potentially reliable samples.
  • Figure 4: An illustration depicting the relationship between entropy range and the identification of potentially reliable samples. It is evident that such samples cannot be determined solely based on the entropy range.
  • Figure 5: Sensitivity to the predefined entropy threshold (ImageNet-C). POEM demonstrates robustness to variations in the threshold and consistently outperforms SOTA approaches.
  • ...and 1 more figure