
Energy-based Preference Optimization for Test-time Adaptation

Yewon Han, Seoyun Yang, Taesup Kim

TL;DR

This paper tackles test-time adaptation (TTA) under distribution shift when only unlabeled target data are available. It introduces EpoTTA, a sampling-free energy-based framework that reformulates the target marginal as $p_{\theta}(x)=(1/Z)q_{\phi}(x)\exp(-\tilde{E}_{\theta}(x)/\beta)$, where $q_{\phi}$ is the pretrained model and $\tilde{E}_{\theta}$ is a residual energy function, and derives an objective in the spirit of Direct Preference Optimization (DPO) that avoids both normalization constants and SGLD sampling. Empirically, EpoTTA achieves strong accuracy and calibration on CIFAR10-C, CIFAR100-C, and TinyImageNet-C while substantially reducing computational cost compared with SGLD-based energy methods; it is also robust under non-IID test streams and with small replay buffers. The approach offers a practical, scalable alternative for real-world TTA: it adapts reliably without relying on uncertain pseudo-labels or expensive sampling, which narrows the gap between performance and deployment constraints in dynamic environments.
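To see why this parameterization can sidestep the normalization constant, it helps to rearrange it for the residual energy. The following is a sketch derived only from the formula above; the pairing of a target sample $x_t$ with a source sample $x_s$ and the Bradley-Terry form of the loss are assumptions made for illustration, not the paper's stated objective.

$$
\tilde{E}_{\theta}(x) = -\beta \log \frac{p_{\theta}(x)}{q_{\phi}(x)} - \beta \log Z
$$

$$
\tilde{E}_{\theta}(x_s) - \tilde{E}_{\theta}(x_t) = \beta \left[ \log \frac{p_{\theta}(x_t)}{q_{\phi}(x_t)} - \log \frac{p_{\theta}(x_s)}{q_{\phi}(x_s)} \right]
$$

$$
\mathcal{L}_{\text{sketch}}(\theta) = -\log \sigma\left( \tilde{E}_{\theta}(x_s) - \tilde{E}_{\theta}(x_t) \right)
$$

The $\beta \log Z$ term is the same for every $x$, so it cancels in the pairwise difference. If $p_{\theta}$ and $q_{\phi}$ are themselves energy-based models, their own normalizers cancel in the same way, leaving a loss that depends only on unnormalized energies.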

Abstract

Test-Time Adaptation (TTA) enhances model robustness by enabling adaptation to target distributions that differ from training distributions, improving real-world generalizability. Existing TTA approaches focus on adjusting the conditional distribution; however, these methods often depend on uncertain predictions in the absence of label information, leading to unreliable performance. Energy-based frameworks offer a promising alternative that addresses distribution shifts without relying on uncertain predictions, instead computing the marginal distribution of target data. However, they require extensive Stochastic Gradient Langevin Dynamics (SGLD) sampling, which is impractical for test-time scenarios requiring immediate adaptation. In this work, we propose Energy-based Preference Optimization for Test-time Adaptation (EPOTTA), which is based on a sampling-free strategy. We first parameterize the target model using a pretrained model and a residual energy function, enabling marginal likelihood maximization of target data without sampling. Building on the observation that the parameterization is mathematically equivalent to the DPO objective, we then directly adapt the model to a target distribution without explicitly training the residual. Our experiments verify that EPOTTA is well-calibrated and performant while achieving computational efficiency.
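A minimal sketch of what a sampling-free, DPO-style pairwise loss could look like is given below. It is an illustration under stated assumptions, not the authors' implementation: it assumes the classifier is read as an energy-based model with energy equal to the negative log-sum-exp of its logits, that each target sample is paired with one source sample, and that `beta` scales the log-ratio margin. The function names `free_energy` and `preference_loss` are hypothetical.

```python
import math


def free_energy(logits):
    """Energy of one input, assuming E(x) = -logsumexp(logits) (an assumption)."""
    m = max(logits)  # subtract the max for numerical stability
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))


def preference_loss(theta_t, phi_t, theta_s, phi_s, beta=1.0):
    """Hypothetical pairwise loss that prefers a target sample over a source sample.

    theta_* are logits from the adapting model, phi_* are logits from the
    frozen pretrained model; *_t is the target sample, *_s the source sample.
    """
    # log p_theta(x) - log q_phi(x), up to constants that cancel in the difference
    log_ratio_t = free_energy(phi_t) - free_energy(theta_t)
    log_ratio_s = free_energy(phi_s) - free_energy(theta_s)
    margin = beta * (log_ratio_t - log_ratio_s)
    # -log(sigmoid(margin)) computed as a numerically stable softplus(-margin)
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    source_logits = [2.0, 0.5, -1.0]
    pretrained_target_logits = [0.3, 0.2, 0.1]
    adapted_target_logits = [3.0, 0.2, 0.1]  # lower energy on the target sample
    print(preference_loss(adapted_target_logits, pretrained_target_logits,
                          source_logits, source_logits))
```

No sampling loop and no partition function appear anywhere in the computation, because only energy differences enter the loss. When the adapting model equals the pretrained model the margin is zero and the loss equals log 2.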

Paper Structure

This paper contains 39 sections, 16 equations, 12 figures, 11 tables.

Figures (12)

  • Figure 1: The overconfidence problem of entropy minimization in test-time adaptation on CIFAR10-C. TENT tends to increase a model's confidence in uncertain predictions as adaptation progresses, often leading to worse calibration due to overconfidence. In contrast, EpoTTA (Ours) stabilizes the adaptation process by gradually reducing the expected calibration error.
  • Figure 2: Overview of our proposed method. Test-time adaptation is implemented using a direct optimization framework, which prefers target samples over the source samples used for pretraining the source model.
  • Figure 3: Comparison of GFLOPs, ECE, and accuracy (Acc) against competitive baselines on TinyImageNet-C, averaged across severity levels 1-5.
  • Figure 4: Sample-wise relationship between energy and the gradient coefficient $w(x_t,x_s)$ in Eq. \ref{eqn:objgrad} on CIFAR10-C at severity 5.
  • Figure 5: Comparison of Expected Calibration Error (ECE$\downarrow$) on the CIFAR10-C dataset across different corruption severity levels.
  • ...and 7 more figures
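Several of the captions above report Expected Calibration Error (ECE). As a reminder of how the metric is commonly computed, here is a minimal sketch using equal-width confidence bins. The bin count of 10, the half-open binning convention, and the function name `expected_calibration_error` are assumptions; the paper's exact evaluation protocol is not given in this section.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins: weighted mean of |accuracy - confidence| per bin.

    confidences: max predicted probability per sample, each in [0, 1].
    correct: booleans, True where the prediction matched the label.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # bins are [lo, hi); a confidence of exactly 1.0 goes in the last bin
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    # four predictions at 90% confidence, three of them correct
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                     [True, True, True, False]))
```

An overconfident model, such as the behavior the Figure 1 caption attributes to entropy minimization, shows up here as bins whose average confidence exceeds their accuracy.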