Table of Contents
Fetching ...

Black-Box Forgetting

Yusuke Kuwana, Yuta Goto, Takashi Shibata, Go Irie

TL;DR

A novel problem of selective forgetting for black-box models, named Black-Box Forgetting, is addressed and an approach to the problem is proposed, called Latent Context Sharing, which introduces common low-dimensional latent components among multiple tokens for the prompt.

Abstract

Large-scale pre-trained models (PTMs) provide remarkable zero-shot classification capability covering a wide variety of object classes. However, practical applications do not always require the classification of all kinds of objects, and leaving the model capable of recognizing unnecessary classes not only degrades overall accuracy but also leads to operational disadvantages. To mitigate this issue, we explore the selective forgetting problem for PTMs, where the task is to make the model unable to recognize only the specified classes while maintaining accuracy for the rest. All the existing methods assume "white-box" settings, where model information such as architectures, parameters, and gradients is available for training. However, PTMs are often "black-box," where information on such models is unavailable for commercial reasons or social responsibilities. In this paper, we address a novel problem of selective forgetting for black-box models, named Black-Box Forgetting, and propose an approach to the problem. Given that information on the model is unavailable, we optimize the input prompt to decrease the accuracy of specified classes through derivative-free optimization. To avoid difficult high-dimensional optimization while ensuring high forgetting performance, we propose Latent Context Sharing, which introduces common low-dimensional latent components among multiple tokens for the prompt. Experiments on four standard benchmark datasets demonstrate the superiority of our method with reasonable baselines. The code is available at https://github.com/yusukekwn/Black-Box-Forgetting.

Black-Box Forgetting

TL;DR

A novel problem of selective forgetting for black-box models, named Black-Box Forgetting, is addressed and an approach to the problem is proposed, called Latent Context Sharing, which introduces common low-dimensional latent components among multiple tokens for the prompt.

Abstract

Large-scale pre-trained models (PTMs) provide remarkable zero-shot classification capability covering a wide variety of object classes. However, practical applications do not always require the classification of all kinds of objects, and leaving the model capable of recognizing unnecessary classes not only degrades overall accuracy but also leads to operational disadvantages. To mitigate this issue, we explore the selective forgetting problem for PTMs, where the task is to make the model unable to recognize only the specified classes while maintaining accuracy for the rest. All the existing methods assume "white-box" settings, where model information such as architectures, parameters, and gradients is available for training. However, PTMs are often "black-box," where information on such models is unavailable for commercial reasons or social responsibilities. In this paper, we address a novel problem of selective forgetting for black-box models, named Black-Box Forgetting, and propose an approach to the problem. Given that information on the model is unavailable, we optimize the input prompt to decrease the accuracy of specified classes through derivative-free optimization. To avoid difficult high-dimensional optimization while ensuring high forgetting performance, we propose Latent Context Sharing, which introduces common low-dimensional latent components among multiple tokens for the prompt. Experiments on four standard benchmark datasets demonstrate the superiority of our method with reasonable baselines. The code is available at https://github.com/yusukekwn/Black-Box-Forgetting.

Paper Structure

This paper contains 36 sections, 3 equations, 9 figures, 12 tables.

Figures (9)

  • Figure 1: Overview of our black-box forgetting framework. The confidence of each class is computed as the similarity with the image and class (text) embeddings from the black-box pre-trained vision-language model (e.g., CLIP). The obtained confidence is used to compute the respective loss functions for the classes to be forgotten and the classes to be memorized. (a) For the classes to be forgotten, maximize the entropy of the confidence so that the accuracy is reduced. (b) For the classes to be memorized, minimize the cross-entropy loss to retain the accuracy. These two objective are jointly optimized to tune the learnable text prompt. The gradients of the objective are not available when the model is black-box. We therefore use CMA-ES hansen2003reducing, a derivative-free optimizer, to learn the text prompt. Instead of directly optimizing the original high-dimensional context (token) embeddings for the prompt, our method learns lower-dimensional latent contexts for mitigating the difficulty of high-dimensional optimization.
  • Figure 2: Vanilla
  • Figure 3: BBT
  • Figure 4: LCS
  • Figure 6: Sensitivity to the number of latent contexts. Results of BBT sun2022black and Ours for varying number of the latent contexts. We can see that our method shows stable performance within a wide range of the number of latent contexts in contrast BBT.
  • ...and 4 more figures