Learning to Obstruct Few-Shot Image Classification over Restricted Classes

Amber Yijia Zheng, Chiao-An Yang, Raymond A. Yeh

TL;DR

This paper tackles the security risk of openly released pre-trained backbones by asking whether a model can be meta-learned to obstruct fine-tuning for a subset of downstream tasks. It introduces Learning to Obstruct (LTO), a gradient-based, MAML-like framework that learns a poor initialization for a backbone with respect to restricted classes in few-shot classification, while preserving performance on non-restricted classes. The authors demonstrate LTO’s effectiveness across classical FSC methods (ProtoNet, MetaOptNet), CLIP-based FSC (CoOp, Tip-Adapter, CE), and CLIP-based attribute learning on ImageNet, CIFAR100, and CelebA, using a consistent obstruction metric DropRatio@β and careful dataset splits. The results show that LTO can significantly degrade restricted-class performance with minimal collateral damage to other classes, supporting a safety-oriented approach to releasing open-source models and motivating further exploration of obstruction-aware training. Overall, LTO represents a promising step toward safer open backbones by preemptively hindering unwanted fine-tuning while maintaining broad utility for legitimate tasks.

Abstract

Advancements in open-source pre-trained backbones make it relatively easy to fine-tune a model for new tasks. However, this lowered entry barrier poses potential risks, e.g., bad actors developing models for harmful applications. A question arises: Is it possible to develop a pre-trained model that is difficult to fine-tune for certain downstream tasks? To begin studying this, we focus on few-shot classification (FSC). Specifically, we investigate methods to make FSC more challenging for a set of restricted classes while maintaining the performance of other classes. We propose to meta-learn over the pre-trained backbone in a manner that renders it a "poor initialization". Our proposed Learning to Obstruct (LTO) algorithm successfully obstructs four FSC methods across three datasets, including ImageNet and CIFAR100 for image classification, as well as CelebA for attribute classification.
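
The idea of meta-learning a "poor initialization" can be illustrated with a toy sketch. This is not the authors' LTO algorithm: it assumes a hypothetical two-parameter "backbone" that only rescales input features, a ProtoNet-style nearest-prototype learner, hand-made data, and finite-difference gradients in place of backpropagation. All names (`proto_loss`, `num_grad`, `obstruct`, `lam`) are illustrative. The outer loop differentiates through the few-shot learner and updates the backbone so that the post-adaptation loss rises on a restricted task while it is kept low on another task.

```python
import numpy as np

def proto_loss(scale, support_x, support_y, query_x, query_y):
    """Cross-entropy of a nearest-prototype learner on top of a toy backbone.

    The "backbone" is an assumption for illustration: it only rescales features.
    """
    zs, zq = support_x * scale, query_x * scale
    classes = np.unique(support_y)
    # Few-shot adaptation step: one prototype (mean embedding) per class.
    protos = np.stack([zs[support_y == c].mean(axis=0) for c in classes])
    # Logits are negative squared distances to each prototype.
    logits = -((zq[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    labels = np.searchsorted(classes, query_y)
    return -log_probs[np.arange(len(labels)), labels].mean()

def num_grad(f, x, eps=1e-5):
    """Central finite-difference gradient (stand-in for autodiff)."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

def obstruct(scale, restricted_task, other_task, steps=50, lr=0.1, lam=1.0):
    """Outer loop: raise the restricted-task loss, lower the other-task loss."""
    scale = scale.copy()
    for _ in range(steps):
        objective = lambda s: (-proto_loss(s, *restricted_task)
                               + lam * proto_loss(s, *other_task))
        scale -= lr * num_grad(objective, scale)
    return scale

# Hand-made tasks: (support_x, support_y, query_x, query_y).
# The restricted task is separable along feature 0, the other along feature 1.
restricted_task = (np.array([[-1.0, 0.0], [1.0, 0.0]]), np.array([0, 1]),
                   np.array([[-0.8, 0.1], [0.8, -0.1]]), np.array([0, 1]))
other_task = (np.array([[0.0, -1.0], [0.0, 1.0]]), np.array([0, 1]),
              np.array([[0.1, -0.8], [-0.1, 0.8]]), np.array([0, 1]))

theta_p = np.array([1.0, 1.0])                      # "pre-trained" backbone
theta_lto = obstruct(theta_p, restricted_task, other_task)

loss_r_before = proto_loss(theta_p, *restricted_task)
loss_r_after = proto_loss(theta_lto, *restricted_task)
loss_o_before = proto_loss(theta_p, *other_task)
loss_o_after = proto_loss(theta_lto, *other_task)
print("restricted loss:", loss_r_before, "->", loss_r_after)
print("other loss:     ", loss_o_before, "->", loss_o_after)
```

In this toy setup the restricted task depends only on the first feature, so the outer loop shrinks `theta_lto[0]` and the restricted loss climbs toward log 2 (chance level for two classes), while the loss on the other task does not increase. A real backbone has no such clean separation between restricted and non-restricted features, which is why the trade-off weight (here the assumed `lam`) matters.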

Paper Structure (21 sections, 8 equations, 11 figures, 16 tables, 1 algorithm)

This paper contains 21 sections, 8 equations, 11 figures, 16 tables, 1 algorithm.

Figures (11)

  • Figure 1: Learning to Obstruct (LTO) few-shot learning paradigm. Without LTO: after the adaptation of few-shot learner $F$, the model can classify classes from ${\mathcal{R}}$ and ${\mathcal{R}}'$ correctly. With LTO: by modifying the pre-trained model parameters $\theta^p$ via our proposed method ${\bm{\mathsfit{A}}}$ before the adaptation of $F$, the model fails to generalize to the restricted class set ${\mathcal{R}}$ while maintaining its performance on the other class set ${\mathcal{R}}'$.
  • Figure 1: DropRatio of LTO on classical few-shot learning. We report $\Delta@2$ on ImageNet over 9 selected superclasses. All experiments are 5-way classification.
  • Figure 2: Selection of ${\mathcal{R}}$ on image classification. The objective of LTO on image classification is to minimize the top-1 Acc. on ${\mathcal{R}}$ while maintaining the top-1 Acc. on ${\mathcal{R}}'$. In this example, ${\mathcal{R}} = {\mathcal{Y}}_{\text{device}}$ while ${\mathcal{R}}' = {\mathcal{Y}}_{\text{bird}} \cup {\mathcal{Y}}_{\text{dog}}$. LTO decreases the top-1 Acc. on ${\mathcal{R}}$ from 92% to 66%, while the drops on ${\mathcal{R}}'$ are no more than 3%.
  • Figure 2: DropRatio of CLIP-based LTO on CIFAR100 and ImageNet. We report $\Delta@2$ on CIFAR100 and ImageNet over 9 selected superclasses. Note that the superclasses are not the same across datasets.
  • Figure 3: Selection of ${\mathcal{R}}$ on attribute learning. The objective of LTO on attribute learning is to minimize the AUROC on ${\mathcal{R}}$ while maintaining the performance on ${\mathcal{R}}'$. In this example, ${\mathcal{R}} = \{ \text{bald} \}$ while ${\mathcal{R}}'=\{ \text{hat}, \text{glasses} \}$. LTO decreases the AUROC on ${\mathcal{R}}$ from 77% to 63%, while the drops on ${\mathcal{R}}'$ are no more than 2%.
  • ...and 6 more figures