Ranked Entropy Minimization for Continual Test-Time Adaptation

Jisu Han, Jaemin Na, Wonjun Hwang

TL;DR

This work addresses instability in continual test-time adaptation (CTTA). It identifies model collapse under entropy minimization (EM) as a key problem and proposes Ranked Entropy Minimization (REM). REM uses explicit mask chaining, guided by vision-transformer attention, to progressively increase input difficulty. It pairs a masked-consistency loss $\mathcal{L}_{MCL}$ with an entropy-ranking loss $\mathcal{L}_{ERL}$, combined as $\mathcal{L}_{REM}=\mathcal{L}_{MCL}+\lambda\mathcal{L}_{ERL}$, to preserve prediction diversity while adapting online. On the ImageNetC, CIFAR10C, and CIFAR100C CTTA benchmarks, REM reports state-of-the-art or competitive accuracy at the efficiency of entropy-based methods, without requiring a student-teacher ensemble, and with gains over the source model and previous CTTA methods. The approach also shows improved calibration and robust performance under various domain shifts, which supports its use in real-time, resource-constrained deployment. Overall, REM offers an efficient route to stable continual test-time adaptation by jointly regulating prediction difficulty and entropy within a single masked transformer model.

Abstract

Test-time adaptation aims to adapt to realistic environments in an online manner by learning during test time. Entropy minimization has emerged as a principal strategy for test-time adaptation due to its efficiency and adaptability. Nevertheless, it remains underexplored in continual test-time adaptation, where stability is more important. We observe that the entropy minimization method often suffers from model collapse, where the model converges to predicting a single class for all images due to a trivial solution. We propose ranked entropy minimization to mitigate the stability problem of the entropy minimization method and extend its applicability to continual scenarios. Our approach explicitly structures the prediction difficulty through a progressive masking strategy. Specifically, it gradually aligns the model's probability distributions across different levels of prediction difficulty while preserving the rank order of entropy. The proposed method is extensively evaluated across various benchmarks, demonstrating its effectiveness through empirical results. Our code is available at https://github.com/pilsHan/rem
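The two ingredients named above, aligning predictions across difficulty levels and preserving the rank order of entropy, can be illustrated with a small pure-Python sketch. The concrete forms used here are assumptions for illustration, not taken from the paper: the consistency term is written as a cross-entropy from the less-masked prediction (treated as a fixed target) to the more-masked one, and the ranking term as a hinge that is active only when the less-masked prediction has higher entropy than the more-masked one. The names `rem_loss_sketch`, `lam`, and `margin` are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p + eps) for p in probs)

def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy of pred against a fixed target distribution."""
    return -sum(t * math.log(q + eps) for t, q in zip(target, pred))

def rem_loss_sketch(logits_chain, lam=1.0, margin=0.0):
    """Illustrative combined loss over a chain of predictions.

    logits_chain holds logits for the same image, ordered from the
    unmasked input to the most heavily masked input. The loss forms
    below are assumptions, not the paper's definitions.
    """
    probs = [softmax(logits) for logits in logits_chain]
    mcl = 0.0  # consistency between neighbouring difficulty levels
    erl = 0.0  # penalty when the entropy rank order is violated
    for easy, hard in zip(probs, probs[1:]):
        mcl += cross_entropy(easy, hard)
        erl += max(0.0, entropy(easy) - entropy(hard) + margin)
    return mcl + lam * erl, mcl, erl

# Entropy rises along this chain, so the ranking term is inactive.
ordered = [[4.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
total_ok, mcl_ok, erl_ok = rem_loss_sketch(ordered)

# Reversing the chain violates the rank order and activates the hinge.
total_bad, mcl_bad, erl_bad = rem_loss_sketch(list(reversed(ordered)))
```

In this sketch the ranking term contributes nothing while harder inputs remain more uncertain than easier ones, so it acts only as a guard against the ordering being inverted. A real implementation would compute these terms on framework tensors so that gradients flow only through the intended branch.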

Paper Structure

This paper contains 28 sections, 5 equations, 13 figures, 18 tables.

Figures (13)

  • Figure 1: Our Intuition. We explicitly raise the prediction difficulty of the input images through the masking strategy. Based on the intuition that increased difficulty decreases prediction accuracy and increases entropy, we attempt to maintain a rank ordering of entropy while improving consistency from original to masked predictions. Our approach addresses the problem of model collapse in entropy minimization methods in a simple yet efficient way.
  • Figure 2: Observation on model collapse in the entropy minimization approach. (a) Under the CTTA scenario, the EM approach (Tent) undergoes significant performance degradation at a critical point (adaptation order $T_3$, Impulse noise). (b) This phenomenon occurs because the model learns constant representations that do not depend on input images, leading to a collapse in prediction diversity. This is evidenced by class probabilities converging to a single point when visualized in a polar coordinate system. Our proposed method (REM) mitigates model collapse and maintains prediction diversity.
  • Figure 3: Empirical study according to masking ratio. We report the changes in error and entropy as the masking ratio increases. Both entropy and error increase monotonically with the masking ratio, and the trend is closest to linear at lower masking ratios.
  • Figure 4: Backward transfer analysis on ImageNetC. We compare the performance of CTTA approaches on previous domains.
  • Figure 5: Adaptability analysis on ImageNetC under Gaussian noise corruption.
  • ...and 8 more figures
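The collapse described in the Figure 2 caption can be made concrete with a toy example. The sketch below is an illustration under stated assumptions, not the paper's analysis: it compares a batch where every image receives the same confident prediction with a batch of equally confident but varied predictions. The helper names `mean_entropy` and `marginal_entropy` are hypothetical, and the entropy of the batch-averaged prediction is used here only as a simple diversity diagnostic (the paper visualizes class probabilities in a polar coordinate system instead).

```python
import math

def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p + eps) for p in probs)

def mean_entropy(batch):
    """Average per-sample entropy: the quantity entropy minimization reduces."""
    return sum(entropy(p) for p in batch) / len(batch)

def marginal_entropy(batch):
    """Entropy of the batch-averaged prediction: low when predictions collapse."""
    n_classes = len(batch[0])
    avg = [sum(p[c] for p in batch) / len(batch) for c in range(n_classes)]
    return entropy(avg)

# Every image is assigned class 0: the trivial, input-independent solution.
collapsed = [[0.98, 0.01, 0.01]] * 3

# Equally confident predictions spread over three different classes.
diverse = [
    [0.98, 0.01, 0.01],
    [0.01, 0.98, 0.01],
    [0.01, 0.01, 0.98],
]

print("mean entropy     :", mean_entropy(collapsed), mean_entropy(diverse))
print("marginal entropy :", marginal_entropy(collapsed), marginal_entropy(diverse))
```

Both batches have the same mean entropy, so an objective that only minimizes per-sample entropy cannot tell them apart. Only the diversity measure separates the collapsed batch from the diverse one.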