COME: Test-time adaption by Conservatively Minimizing Entropy

Qingyang Zhang, Yatao Bian, Xinke Kong, Peilin Zhao, Changqing Zhang

TL;DR

This paper proposes Conservatively Minimizing the Entropy (COME), a simple drop-in replacement for traditional entropy minimization (EM) that addresses the overconfidence limitation of test-time adaption methods.

Abstract

Machine learning models must continuously adapt to novel data distributions in the open world. As the predominant principle, entropy minimization (EM) has proven to be a simple yet effective cornerstone of existing test-time adaption (TTA) methods. Unfortunately, its fatal limitation (i.e., overconfidence) tends to result in model collapse. To address this issue, we propose to Conservatively Minimize the Entropy (COME), a simple drop-in replacement for traditional EM. In essence, COME explicitly models uncertainty by characterizing a Dirichlet prior distribution over model predictions during TTA. By doing so, COME naturally regularizes the model to favor conservative confidence on unreliable samples. Theoretically, we provide a preliminary analysis showing that COME enhances optimization stability by introducing a data-adaptive lower bound on the entropy. Empirically, our method achieves state-of-the-art performance on commonly used benchmarks, with significant improvements in classification accuracy and uncertainty estimation under various settings including standard, life-long, and open-world TTA: up to $34.5\%$ improvement in accuracy and $15.1\%$ in false positive rate.
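To make the two ingredients named in the abstract concrete, here is a minimal sketch in plain Python. `softmax_entropy` is the standard quantity that EM minimizes. `dirichlet_opinion` shows one common way to attach a Dirichlet to logits, an evidential parametrization with evidence $e_k = \exp(f_k(x))$ and $\alpha_k = e_k + 1$. That parametrization and all function names are assumptions made for illustration; this page does not spell out the exact COME objective, so the code should not be read as the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_entropy(logits):
    """Shannon entropy of the softmax prediction (the quantity EM minimizes)."""
    return -sum(p * math.log(p) for p in softmax(logits) if p > 0.0)

def dirichlet_opinion(logits):
    """Assumed evidential parametrization (not taken from this page):
    evidence e_k = exp(f_k), alpha_k = e_k + 1, S = sum_k alpha_k.
    Returns per-class belief masses e_k / S and an uncertainty mass K / S,
    which together sum to 1. Very large logits would overflow math.exp."""
    num_classes = len(logits)
    evidence = [math.exp(z) for z in logits]
    strength = sum(evidence) + num_classes
    beliefs = [e / strength for e in evidence]
    uncertainty = num_classes / strength
    return beliefs, uncertainty
```

Scaling the logits up, for example from `[2, 0, 0]` to `[4, 0, 0]`, lowers `softmax_entropy` without any new information about the input, which is one way to see how plain EM can reward overconfidence. Under the assumed parametrization, the opinion keeps an explicit uncertainty mass alongside the class beliefs, so "I do not know" remains a representable outcome.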

Paper Structure

This paper contains 27 sections, 3 theorems, 26 equations, 17 figures, 12 tables, 1 algorithm.

Key Result

Lemma 1

For any $x\in\mathcal{X}$, we have where $f(x)$ is the model output logits, $K$ is the total class number and $||\cdot||_p$ denotes the $p$-norm.

Figures (17)

  • Figure 1: Empirical observations of entropy minimization (EM) as used in Tent [wang2020tent]. During TTA, the uncertainty of models tuned with EM drops quickly, and the false positive rate decreases only briefly before rising rapidly. Along the same adaption trajectory, accuracy also improves over the initial model for a short time and then quickly decreases, after which the model collapses to a trivial solution. We also manually tune an entropy threshold to filter out the (100-m)% of samples with the highest entropy, treated as unreliable, and conduct entropy minimization only on the remaining m% low-entropy samples. However, the resulting methods still suffer from the aforementioned issues. We therefore believe that the entropy minimization learning principle is inherently problematic in TTA, which necessitates a more principled solution.
  • Figure 2: Comparison on two representative TTA methods, i.e., the seminal Tent [wang2020tent] and the recent SOTA SAR [niu2023towards]. In contrast to EM, COME establishes a stable TTA process with consistently improved classification accuracy and false positive rate. In addition, the model confidence under COME is much more conservative, which implies a lower risk of overconfidence and more accurate uncertainty awareness.
  • Figure 3: Distribution of model confidence.
  • Figure 4: Comparison on two representative TTA methods on ImageNet-C under Gaussian Noise corruption of severity level 5. In contrast to EM, COME establishes a stable TTA process with consistently improved classification accuracy and false positive rate.
  • Figure 5: Comparison on two representative TTA methods on ImageNet-C under Shot Noise corruption of severity level 5. In contrast to EM, COME establishes a stable TTA process with consistently improved classification accuracy and false positive rate.
  • ...and 12 more figures
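
The entropy-based filtering described in the Figure 1 caption can be sketched as follows. This is a hypothetical illustration, not the authors' code: the function names are invented here, and the sketch keeps the m% lowest-entropy samples by rank, which corresponds to placing the entropy threshold at the m-th percentile of the batch.

```python
import math

def softmax_entropy(logits):
    """Shannon entropy of the softmax prediction for one sample."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return -sum((e / total) * math.log(e / total) for e in exps)

def select_low_entropy(batch_logits, m_percent):
    """Return the sorted indices of the m% of samples with the lowest entropy.

    The remaining (100 - m)% highest-entropy samples are treated as
    unreliable and would be excluded from the entropy minimization update.
    """
    entropies = [softmax_entropy(logits) for logits in batch_logits]
    keep = int(len(batch_logits) * m_percent / 100)
    order = sorted(range(len(batch_logits)), key=lambda i: entropies[i])
    return sorted(order[:keep])

# Usage: keep the more confident half of a toy batch of 3-class logits.
batch = [[5.0, 0.0, 0.0], [0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
kept = select_low_entropy(batch, 50)
```

According to the caption, filtering of this kind does not prevent the collapse observed with EM, which is the motivation given for changing the objective itself rather than only the sample selection.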

Theorems & Definitions (6)

  • Lemma 1
  • Theorem 1: Model confidence upper bound
  • Lemma 2
  • proof
  • proof
  • proof