GDA: Generalized Diffusion for Robust Test-time Adaptation
Yun-Yun Tsai, Fu-Chen Chen, Albert Y. C. Chen, Junfeng Yang, Che-Chun Su, Min Sun, Cheng-Hao Kuo
TL;DR
This work tackles robustness to unseen distribution shifts by proposing Generalized Diffusion Adaptation (GDA), a diffusion-based test-time approach that does not modify model weights. GDA guides the reverse diffusion process with three structural losses (marginal entropy, style transfer via CLIP, and content preservation via patch-wise contrastive signals) to pull OOD samples back toward the source domain. Using an efficient DDIM-style sampling strategy, GDA improves accuracy on ImageNet-C, Rendition, Sketch, and Stylized-ImageNet for multiple backbones, with gains of 4.4 to 5.02 percentage points on ImageNet-C and 2.5 to 7.4 points on the Rendition, Sketch, and Stylized benchmarks, and at reduced adaptation cost. These results indicate that diffusion-based test-time adaptation with targeted guidance can generalize across diverse OOD types, offering a practical route to robust deployment without retraining models. The findings suggest possible extensions to other vision tasks and to broader guidance mechanisms for diffusion in OOD settings.
Abstract
Machine learning models generalize poorly when they encounter out-of-distribution (OOD) samples with unexpected distribution shifts. For vision tasks, recent studies have shown that test-time adaptation with diffusion models can achieve state-of-the-art accuracy improvements on OOD samples by generating new samples that align with the model's domain, without modifying the model's weights. However, those studies have focused primarily on pixel-level corruptions and therefore do not generalize to a broader range of OOD types. We introduce Generalized Diffusion Adaptation (GDA), a novel diffusion-based test-time adaptation method that is robust against diverse OOD types. Specifically, GDA iteratively guides the diffusion by applying a marginal entropy loss derived from the model, together with style and content preservation losses, during the reverse sampling process. In other words, GDA considers the model's output behavior together with the semantic information of the samples, which can reduce ambiguity in downstream tasks during the generation process. Evaluation across various popular model architectures and OOD benchmarks shows that GDA consistently outperforms prior work on diffusion-driven adaptation. Notably, it achieves the highest classification accuracy improvements, ranging from 4.4% to 5.02% on ImageNet-C and 2.5% to 7.4% on Rendition, Sketch, and Stylized benchmarks. This performance highlights GDA's generalization to a broader range of OOD benchmarks.
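
To make the marginal entropy term more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the marginal distribution is obtained by averaging the classifier's softmax outputs over several augmented views of one sample, a common choice in test-time adaptation that the abstract does not spell out. The names `softmax`, `marginal_entropy`, and the toy logits are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one logit vector."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def marginal_entropy(logits_per_view: Sequence[Sequence[float]]) -> float:
    """Entropy (in nats) of the class distribution averaged over views.

    Assumption: each row holds the classifier's logits for one augmented
    view of the same sample; the source does not specify this detail.
    """
    probs = [softmax(row) for row in logits_per_view]
    num_views = len(probs)
    num_classes = len(probs[0])
    # Average the per-view distributions to get the marginal distribution.
    marginal = [sum(p[c] for p in probs) / num_views for c in range(num_classes)]
    # Skip zero entries, since p * log(p) tends to 0 as p tends to 0.
    return -sum(p * math.log(p) for p in marginal if p > 0.0)


# Toy logits (hypothetical): two views that agree, and two that conflict.
agreeing = [[8.0, 0.0, 0.0], [7.5, 0.2, 0.1]]
conflicting = [[8.0, 0.0, 0.0], [0.0, 8.0, 0.0]]
low = marginal_entropy(agreeing)
high = marginal_entropy(conflicting)
```

Views that agree on a class yield a marginal entropy near zero, while conflicting views yield a value near log 2. Under the stated assumption, a guidance signal that lowers this quantity would push the generated sample toward inputs on which the classifier is confident and consistent.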
