Historical Test-time Prompt Tuning for Vision Foundation Models

Jingyi Zhang, Jiaxing Huang, Xiaoqin Zhang, Ling Shao, Shijian Lu

TL;DR

This paper proposes HisTPT, a Historical Test-time Prompt Tuning technique that memorizes useful knowledge from previously learnt test samples and uses the memorized knowledge for robust test-time prompt tuning.

Abstract

Test-time prompt tuning, which learns prompts online with unlabelled test samples during the inference stage, has demonstrated great potential by learning effective prompts on-the-fly without requiring any task-specific annotations. However, its performance often degrades noticeably as tuning proceeds when the prompts are continuously updated with the test data flow, and the degradation becomes more severe when the domain of test samples changes continuously. We propose HisTPT, a Historical Test-time Prompt Tuning technique that memorizes the useful knowledge of the learnt test samples and enables robust test-time prompt tuning with the memorized knowledge. HisTPT introduces three types of knowledge banks, namely, a local knowledge bank, a hard-sample knowledge bank, and a global knowledge bank, each of which uses a different mechanism for effective knowledge memorization and test-time prompt optimization. In addition, HisTPT features an adaptive knowledge retrieval mechanism that regularizes the prediction of each test sample by adaptively retrieving the memorized knowledge. Extensive experiments show that HisTPT consistently achieves superior prompt tuning performance across different visual recognition tasks (e.g., image classification, semantic segmentation, and object detection) and on test samples from continuously changing domains.
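
As a rough illustration of the memorize-then-retrieve idea described in the abstract, the following Python sketch keeps three banks of per-class text features and combines their predictions for a new sample. It is not the authors' implementation: the class name `HistoricalBanks`, the bank capacities, the entropy threshold used to flag hard samples, the moving-average update of the global bank, and the entropy-based weighting of bank predictions are all assumptions made for this example.

```python
import math
from collections import deque


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def average_entries(entries):
    """Average a list of [num_classes][dim] feature sets class by class."""
    n = len(entries)
    num_classes = len(entries[0])
    return [
        [sum(col) / n for col in zip(*(e[k] for e in entries))]
        for k in range(num_classes)
    ]


class HistoricalBanks:
    """Hypothetical container for local, hard-sample, and global knowledge."""

    def __init__(self, local_size=4, hard_size=4, hard_entropy=0.5, momentum=0.9):
        self.local = deque(maxlen=local_size)  # most recent knowledge (FIFO)
        self.hard = deque(maxlen=hard_size)    # knowledge from uncertain samples
        self.global_feats = None               # slowly updated summary
        self.hard_entropy = hard_entropy       # assumed threshold, not from the paper
        self.momentum = momentum               # assumed update rate, not from the paper

    def memorize(self, text_feats, probs):
        """Store the per-class text features learnt for one test sample."""
        self.local.append(text_feats)
        if entropy(probs) > self.hard_entropy:
            self.hard.append(text_feats)
        if self.global_feats is None:
            self.global_feats = [list(v) for v in text_feats]
        else:
            m = self.momentum
            self.global_feats = [
                [m * g + (1 - m) * t for g, t in zip(gv, tv)]
                for gv, tv in zip(self.global_feats, text_feats)
            ]

    def retrieve(self, image_feat):
        """Blend bank predictions, giving more weight to confident banks."""
        banks = [average_entries(list(b)) for b in (self.local, self.hard) if b]
        if self.global_feats is not None:
            banks.append(self.global_feats)
        if not banks:
            return None  # nothing memorized yet
        preds = [
            softmax([sum(x * c for x, c in zip(image_feat, cls)) for cls in feats])
            for feats in banks
        ]
        weights = softmax([-entropy(p) for p in preds])
        num_classes = len(preds[0])
        return [sum(w * p[k] for w, p in zip(weights, preds)) for k in range(num_classes)]
```

In a setup like this, the vector returned by `retrieve` could serve as the regularized prediction that a self-supervised loss pulls the current prompt's prediction towards; how the paper actually forms that target is not specified in this summary.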

Paper Structure

This paper contains 12 sections, 8 equations, 3 figures, 7 tables.

Figures (3)

  • Figure 1: (a) Test-time prompt tuning learns and optimizes prompts from a continuous flow of unlabelled test samples during the inference stage. (b) Most existing test-time prompt tuning methods, such as TPT [shutest] and DiffTPT [feng2023diverse], tend to 'forget' historical knowledge learnt from previous test samples when the prompts are continuously updated with the test data flow. They learn effective prompts at the early tuning stage, but the learnt prompts degrade gradually along the tuning process. This phenomenon becomes more apparent when the domain of test samples changes continuously. The curves are derived from 100 runs over 3 different domains [cordts2016cityscapes, sakaridis2021acdc]. In each run, the order of the 3 domains as well as the order of the samples within each domain is randomly shuffled to simulate continuously changing test domains.
  • Figure 2: Overview of the proposed HisTPT. HisTPT features three types of knowledge banks, namely, local knowledge bank, hard-sample knowledge bank, and global knowledge bank, which learn and memorize up-to-date, difficult and representative knowledge, respectively, from previous test samples (e.g., $x_{n-2}$ and $x_{n-1}$) and their learnt text tokens (e.g., $\mathbf{t}_{n-2}$ and $\mathbf{t}_{n-1}$) along the test-time prompt tuning process. For the current test sample $x_n$, HisTPT regularizes its prediction by retrieving the memorized knowledge via an adaptive knowledge retrieval mechanism, enabling prompt optimization for $x_n$ with the self-supervised loss $\mathcal{L}_{self}$.
  • Figure 3: HisTPT with multiple optimization steps.
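
The evaluation protocol mentioned in the Figure 1 caption (100 runs, with the order of the 3 domains and of the samples within each domain shuffled per run) can be sketched in a few lines of Python. The function name `build_test_stream`, the per-run seeding, and the placeholder domain names and sample ids are assumptions for illustration only; the caption does not say how the shuffling was implemented.

```python
import random


def build_test_stream(domains, seed):
    """Return one shuffled test stream as a list of (domain, sample) pairs.

    domains: dict mapping a domain name to a list of sample identifiers.
    Domains arrive one after another in random order, and the samples
    inside each domain are shuffled independently.
    """
    rng = random.Random(seed)  # separate generator so each run is reproducible
    order = list(domains)
    rng.shuffle(order)
    stream = []
    for name in order:
        samples = list(domains[name])  # copy so the caller's list is untouched
        rng.shuffle(samples)
        stream.extend((name, sample) for sample in samples)
    return stream


# Placeholder data: three hypothetical domains with five samples each.
example_domains = {
    "domain_a": list(range(5)),
    "domain_b": list(range(5)),
    "domain_c": list(range(5)),
}

# One stream per run, 100 runs in total as in the caption.
streams = [build_test_stream(example_domains, seed=run) for run in range(100)]
```

Each stream keeps the samples of a domain contiguous, so a method tuned on it sees exactly two domain switches per run.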