Beyond Model Adaptation at Test Time: A Survey

Zehao Xiao, Cees G. M. Snoek

TL;DR

This survey provides a comprehensive and systematic review of test-time adaptation, covering more than 400 recent papers, and groups existing methods into five categories based on which component is adjusted for test-time adaptation: the model, the inference, the normalization, the sample, or the prompt.

Abstract

Machine learning algorithms have achieved remarkable success across various disciplines, use cases and applications, under the prevailing assumption that training and test samples are drawn from the same distribution. Consequently, these algorithms struggle and become brittle as soon as samples in the test distribution start to deviate from the ones observed during training. Domain adaptation and domain generalization have been studied extensively as approaches to address distribution shifts between training and test domains, but each has its limitations. Test-time adaptation, a recently emerging learning paradigm, combines the benefits of domain adaptation and domain generalization by training models only on source data and adapting them to target data during test-time inference. In this survey, we provide a comprehensive and systematic review of test-time adaptation, covering more than 400 recent papers. We structure our review by categorizing existing methods into five distinct categories based on what component of the method is adjusted for test-time adaptation: the model, the inference, the normalization, the sample, or the prompt, providing detailed analysis of each. We further discuss the various preparation and adaptation settings for methods within these categories, offering deeper insights into their effective deployment, their evaluation under distribution shifts, and their real-world application in understanding images, video and 3D, as well as modalities beyond vision. We close the survey with an outlook on emerging research opportunities for test-time adaptation.

Paper Structure

This paper contains 24 sections, 12 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: Five-year summary of test-time adaptation research works. Statistics collected from the seven highest-ranked AI conferences in Google Scholar: CVPR, NeurIPS, ICLR, ICML, ICCV, ECCV, and AAAI. This paper provides a survey on the steadily increasing number of research works.
  • Figure 2: Learning frameworks that address distribution shifts. (a) Domain adaptation addresses domain shifts by accessing both source and target data during training. (b) Domain generalization avoids the need for target samples during training, but lacks information on the target distribution during inference. (c) Source-free adaptation considers both by introducing an intermediate adaptation stage after source training and before inference. (d) Test-time adaptation achieves adaptation on target data along with inference. Test-time adaptation is the focus of this survey. It aims to adjust a source-trained model to target data without any prior knowledge of the target data before the testing phase.
  • Figure 3: Model adaptation. These methods gradually update their source-trained model by backpropagating a self-trained loss on target test data.
  • Figure 4: Inference adaptation. These methods generate model parameters with an auxiliary model-inference module in a single forward pass, without any backpropagation.
  • Figure 5: Normalization adaptation. These methods adjust their normalization statistics while fixing source parameters.
  • ...and 3 more figures
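
The normalization adaptation summarized in Figure 5 can be illustrated with a minimal sketch. It assumes a single scalar feature and a batch-norm style transform, and it is not taken from the survey: the function names `normalize` and `adapt_statistics`, the `momentum` blending parameter, and the toy numbers are all hypothetical choices made for illustration. The learned scale and shift (`gamma`, `beta`) stay fixed, and only the normalization statistics are re-estimated from the test batch.

```python
from statistics import fmean, pvariance


def normalize(batch, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Apply a batch-norm style transform using the given statistics.

    gamma and beta stand in for source-trained parameters and are never updated.
    """
    return [gamma * (x - mean) / (var + eps) ** 0.5 + beta for x in batch]


def adapt_statistics(source_mean, source_var, batch, momentum=1.0):
    """Blend source statistics with statistics estimated on a test batch.

    momentum=1.0 uses the test-batch statistics only;
    momentum=0.0 keeps the source statistics unchanged.
    (The blending rule is an illustrative assumption.)
    """
    test_mean = fmean(batch)
    test_var = pvariance(batch, mu=test_mean)
    mean = (1 - momentum) * source_mean + momentum * test_mean
    var = (1 - momentum) * source_var + momentum * test_var
    return mean, var


# Hypothetical setup: the source statistics are mean 0 and variance 1,
# while the test batch comes from a shifted distribution.
source_mean, source_var = 0.0, 1.0
test_batch = [4.0, 5.0, 6.0, 7.0, 8.0]

# Without adaptation, the shifted inputs stay far from zero after normalization.
frozen = normalize(test_batch, source_mean, source_var)

# With adaptation, the statistics are re-estimated on the test batch,
# so the normalized features are re-centered and re-scaled.
mean, var = adapt_statistics(source_mean, source_var, test_batch)
adapted = normalize(test_batch, mean, var)
```

In this toy case the frozen statistics leave the features centered near 6, while the adapted statistics bring them back to roughly zero mean and unit variance. An intermediate `momentum` would trade off the two, which may matter when test batches are small.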