ProTeCt: Prompt Tuning for Taxonomic Open Set Classification

Tz-Ying Wu, Chih-Hui Ho, Nuno Vasconcelos

TL;DR

ProTeCt addresses the challenge of taxonomic open set (TOS) classification, where predictions must be consistent across multiple hierarchical levels. The authors introduce two metrics, hierarchical consistent accuracy (HCA) and mean treecut accuracy (MTA), to evaluate hierarchical reliability beyond leaf-level accuracy. They propose ProTeCt, a plug-in prompt-tuning framework that jointly optimizes a node-centric loss and a dynamic treecut loss to enforce consistency across a taxonomy during training, while preserving leaf accuracy. Experiments show substantial gains in HCA and MTA across CIFAR-100, SUN, and ImageNet, with successful domain generalization to unseen image domains and compatibility with various CLIP architectures, prompt methods, and adapters. The work provides a practical path to reliable, multi-granularity classification in real-world applications that require predictions at different taxonomic levels.

Abstract

Visual-language foundation models, like CLIP, learn generalized representations that enable zero-shot open-set classification. Few-shot adaptation methods, based on prompt tuning, have been shown to further improve performance on downstream datasets. However, these methods do not fare well in the taxonomic open set (TOS) setting, where the classifier is asked to make predictions from label sets across different levels of semantic granularity. Frequently, they infer incorrect labels at coarser taxonomic class levels, even when the inference at the leaf level (original class labels) is correct. To address this problem, we propose a prompt tuning technique that calibrates the hierarchical consistency of model predictions. A set of metrics of hierarchical consistency, the Hierarchical Consistent Accuracy (HCA) and the Mean Treecut Accuracy (MTA), are first proposed to evaluate TOS model performance. A new Prompt Tuning for Hierarchical Consistency (ProTeCt) technique is then proposed to calibrate classification across label set granularities. Results show that ProTeCt can be combined with existing prompt tuning methods to significantly improve TOS classification without degrading the leaf level classification performance.
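
The failure mode described above (a correct leaf label paired with a wrong coarser label) can be illustrated with a small sketch. It assumes that a hierarchically consistent metric counts a sample as correct only when the prediction matches the ground truth at every level of the path from the coarsest class to the leaf; the exact HCA definition is not reproduced in this summary, so this is an illustrative reading rather than the authors' formula. The function names and the toy labels are hypothetical.

```python
from typing import Sequence


def leaf_accuracy(preds: Sequence[Sequence[str]],
                  targets: Sequence[Sequence[str]]) -> float:
    """Fraction of samples whose last (leaf-level) label is correct."""
    hits = sum(p[-1] == t[-1] for p, t in zip(preds, targets))
    return hits / len(targets)


def hierarchical_consistent_accuracy(preds: Sequence[Sequence[str]],
                                     targets: Sequence[Sequence[str]]) -> float:
    """Fraction of samples that are correct at every hierarchy level.

    Assumption: each entry lists one label per level, coarsest first,
    leaf last, and a sample only counts if the whole path matches.
    """
    hits = sum(list(p) == list(t) for p, t in zip(preds, targets))
    return hits / len(targets)


# Hypothetical toy data: the first sample has the right leaf ("tiger")
# but a wrong coarse label ("person"), as in the Figure 1 example.
targets = [["animal", "feline", "tiger"], ["animal", "canine", "dog"]]
preds = [["person", "feline", "tiger"], ["animal", "canine", "dog"]]

print(leaf_accuracy(preds, targets))                     # 1.0
print(hierarchical_consistent_accuracy(preds, targets))  # 0.5
```

On this toy data the leaf accuracy is perfect while the consistency-based score drops to one half, which is the gap that leaf-level evaluation alone would hide.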

Paper Structure (33 sections, 1 theorem, 11 equations, 11 figures, 26 tables, 1 algorithm)

This paper contains 33 sections, 1 theorem, 11 equations, 11 figures, 26 tables, 1 algorithm.

Key Result

Lemma 4.1

For a balanced $M$-ary tree with depth $L$ (the root node is excluded and sits at depth 0), the number of all valid treecuts is $L + \sum_{l=2}^{L} \sum_{k=1}^{N-1} \left. \frac{N!}{k!(N-k)!} \right|_{N=M^{l-1}}$.
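
As a quick numerical check, the expression in Lemma 4.1 can be evaluated directly. The sketch below computes it exactly as stated and also through the simplification $\sum_{k=1}^{N-1} \frac{N!}{k!(N-k)!} = 2^N - 2$, which follows from the binomial theorem. The function names are hypothetical, and the code only evaluates the stated expression; it does not enumerate treecuts.

```python
from math import comb


def lemma_treecut_count(M: int, L: int) -> int:
    """Evaluate the Lemma 4.1 expression term by term."""
    total = L
    for l in range(2, L + 1):
        N = M ** (l - 1)  # N = M^(l-1), as in the lemma
        total += sum(comb(N, k) for k in range(1, N))
    return total


def lemma_treecut_count_closed(M: int, L: int) -> int:
    """Same expression, using sum_{k=1}^{N-1} C(N, k) = 2^N - 2."""
    return L + sum(2 ** (M ** (l - 1)) - 2 for l in range(2, L + 1))


print(lemma_treecut_count(2, 2))         # 4
print(lemma_treecut_count_closed(2, 2))  # 4
print(lemma_treecut_count(3, 2))         # 8
```

For a binary tree of depth 2, the value 4 matches a hand enumeration: the two depth-1 nodes, either depth-1 node replaced by its two children, or all four leaves. The double-exponential term $2^{M^{l-1}}$ shows how quickly the expression grows with depth.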

Figures (11)

  • Figure 1: (Top) An example of class hierarchy, where CLIP predicts the tiger image as "person" at the internal hierarchy level. (Bottom) Correct/incorrect model predictions (green/red) of CoOp w/ and w/o ProTeCt on ImageNet variants. $L$ denotes the tree level.
  • Figure 2: (Left) Multiple possible label sets are available in a class hierarchy. A label set can cover nodes at the same level or across different hierarchy levels. (Right) Predefined matrices for efficient treecut sampling used in Algorithm 1.
  • Figure 3: Relative gain/loss after adding ProTeCt to CoOp and MaPLe, respectively. (Top) HCA; (Bottom) $Acc_{leaf}$.
  • Figure 4: Ablation of (a) tree dropout rate $\beta$, (b) NCL strength $\lambda$ and (c) CLIP ViT B32 architecture.
  • Figure 5: ProTeCt correctly predicts examples from ImageNet (a,b) and its variants (c,d) at all levels. [GT, Prediction] shows the ground truth and the incorrect prediction made by vanilla prompt tuning.
  • ...and 6 more figures

Theorems & Definitions (2)

  • Lemma 4.1
  • proof