DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection

Zhi Zhou, Ming Yang, Jiang-Xin Shi, Lan-Zhe Guo, Yu-Feng Li

TL;DR

This work introduces Open-world Prompt Tuning (OPT), a practical evaluation setting for prompt tuning in vision-language models in which prompts are tuned on base classes but tested on a mix of base and new classes. It theoretically analyzes the base-to-new and new-class discriminability gaps and proposes DePT, a framework that integrates out-of-distribution detection with prompt tuning to preserve discriminability across both class spaces. Building on DePT, DeCoOp adds new-class detectors and specialized sub-classifiers to further improve base-class and new-class discriminability. Across 11 datasets, it achieves about a 2% average accuracy gain and better base-to-new discriminability (AUROC) than baselines. The approach offers a principled, modular path to robust prompt tuning in open-world settings, with practical runtime considerations and room for future integration of richer OOD signals.

Abstract

Vision-language models (VLMs), such as CLIP, have demonstrated impressive zero-shot capabilities for various downstream tasks. Their performance can be further enhanced through few-shot prompt tuning methods. However, current studies evaluate the performance of learned prompts separately on base and new classes. This evaluation lacks practicality for real-world applications, since downstream tasks cannot determine in advance whether the data belongs to base or new classes. In this paper, we explore a problem setting called Open-world Prompt Tuning (OPT), which involves tuning prompts on base classes and evaluating on a combination of base and new classes. By introducing the Decomposed Prompt Tuning framework (DePT), we theoretically demonstrate that OPT can be solved by incorporating out-of-distribution detection into prompt tuning, thereby enhancing the base-to-new discriminability. Based on DePT, we present a novel prompt tuning approach, namely Decomposed Context Optimization (DeCoOp), which introduces new-class detectors and sub-classifiers to further enhance the base-class and new-class discriminability. Experimental results on 11 benchmark datasets validate the effectiveness of DePT and demonstrate that DeCoOp outperforms current state-of-the-art methods, providing a significant 2% average accuracy improvement.
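The idea of combining out-of-distribution detection with prompt tuning can be illustrated with a small routing sketch. This is not the authors' implementation: the function name `decomposed_predict`, the choice of OOD score (zero-shot probability mass on the base classes), and the fixed threshold are all assumptions made for illustration. The sketch only shows the general pattern of sending likely base-class samples to a prompt-tuned classifier and the rest to a zero-shot classifier over new classes.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def decomposed_predict(zs_logits, pt_base_logits, n_base, threshold=0.5):
    """Route each sample to a base-class or a new-class classifier.

    zs_logits:      (N, n_base + n_new) zero-shot logits over all classes.
    pt_base_logits: (N, n_base) prompt-tuned logits over base classes only.
    Returns predicted labels in the joint class space and the base scores.
    """
    zs_probs = softmax(zs_logits)
    # Assumed OOD score: zero-shot probability mass on the base classes.
    base_score = zs_probs[:, :n_base].sum(axis=1)
    is_base = base_score >= threshold

    # Likely base-class samples use the prompt-tuned classifier.
    base_pred = pt_base_logits.argmax(axis=1)
    # Likely new-class samples fall back to zero-shot over new classes only.
    new_pred = n_base + zs_logits[:, n_base:].argmax(axis=1)

    return np.where(is_base, base_pred, new_pred), base_score


# Toy usage: two base classes (0, 1) and two new classes (2, 3).
zs = np.array([[5.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 5.0]])
pt = np.array([[0.0, 3.0], [1.0, 0.0]])
preds, scores = decomposed_predict(zs, pt, n_base=2)
print(preds, scores)
```

The first sample is judged in-distribution and labeled by the prompt-tuned classifier; the second is judged out-of-distribution and labeled by the zero-shot classifier restricted to new classes. In practice the threshold would need to be chosen or learned rather than fixed at 0.5.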

Paper Structure

This paper contains 34 sections, 1 theorem, 15 equations, 9 figures, 10 tables.

Key Result

Theorem 2.1

If $\mathbb{E}_{\boldsymbol{x}} \left [H^{\textsc{Cls}}_{\textsc{Zs}}(\boldsymbol{x}) \right ] \leq \delta$ for $\boldsymbol{x}$ belonging to both base and new classes, $\mathbb{E}_{\boldsymbol{x}} \left [H^{\textsc{Cls}}_{\textsc{Pt}}(\boldsymbol{x}) \right ] \leq \delta - \Delta$ for $\boldsymbol{

Figures (9)

  • Figure 1: An illustration of the OPT evaluation paradigm. During training, we fine-tune the model with data from base classes. During testing, we evaluate the model on a mix of base and new classes.
  • Figure 2: Performance change (delta) of the CoOp and Ship methods relative to the zero-shot Clip baseline. Subfigures (a) and (b) show that changes in the H metric are not necessarily indicators of accuracy improvement or degradation, highlighting the significance of addressing the OPT problem.
  • Figure 3: Performance of Zs and Pt methods to distinguish data from base classes and new classes (base-to-new discriminability).
  • Figure 4: Performance of Zs and Pt methods to distinguish data within new classes (new-class discriminability).
  • Figure 5: An overall illustration of the DeCoOp approach.
  • ...and 4 more figures
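
The captions above contrast the H metric with accuracy and refer to base-to-new discriminability. The following sketch shows how these quantities can differ on toy data. It assumes that H denotes the harmonic mean of base-class and new-class accuracy (each measured within its own class space), as is common in base-to-new evaluation, and that discriminability is measured by AUROC as mentioned in the TL;DR. All numbers and helper names are hypothetical.

```python
import numpy as np


def harmonic_mean(base_acc, new_acc):
    """H metric, assumed to be the harmonic mean of base and new accuracy."""
    if base_acc + new_acc == 0:
        return 0.0
    return 2 * base_acc * new_acc / (base_acc + new_acc)


def accuracy(preds, labels):
    """Plain accuracy over whatever class space the labels live in."""
    return float(np.mean(np.asarray(preds) == np.asarray(labels)))


def auroc(scores, is_positive):
    """Probability that a random positive outscores a random negative.

    Ties count as 0.5. Uses an O(P * N) pairwise comparison, which is
    fine for small illustrative inputs.
    """
    scores = np.asarray(scores, dtype=float)
    mask = np.asarray(is_positive, dtype=bool)
    diff = scores[mask][:, None] - scores[~mask][None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


# Toy data: classes 0 and 1 are base classes, 2 and 3 are new classes.
labels = [0, 1, 2, 3]

# Evaluated separately, each sample is classified within its own space,
# so both accuracies (and hence H) can be perfect.
h = harmonic_mean(base_acc=1.0, new_acc=1.0)

# Evaluated jointly, a model biased toward base classes mislabels new data.
open_world_preds = [0, 1, 0, 1]
acc = accuracy(open_world_preds, labels)

# Base-to-new discriminability: does a "base-ness" score separate the two?
base_scores = [0.9, 0.8, 0.3, 0.85]
is_base = [True, True, False, False]
print(h, acc, auroc(base_scores, is_base))
```

In this toy case H is perfect while accuracy over the joint class space is only one half, which is the kind of mismatch the OPT setting is meant to expose.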

Theorems & Definitions (3)

  • Theorem 2.1
  • Remark 2.2
  • proof