Editable Concept Bottleneck Models

Lijie Hu, Chenyang Ren, Zhengyu Hu, Hongbin Lin, Cheng-Long Wang, Hui Xiong, Jingfeng Zhang, Di Wang

TL;DR

Editable Concept Bottleneck Models (ECBMs) address the need to delete or insert data and concepts in Concept Bottleneck Models without costly retraining. Using influence functions and EK-FAC, ECBMs provide closed-form approximations, with theoretical error bounds, for three editing levels: concept-label-level, concept-level, and data-level. On OAI, CUB, and CelebA, ECBMs reach accuracy close to that of full retraining while substantially reducing runtime, and they make concept importance easier to interpret. This work enables privacy-preserving edits, rapid concept updates, and robust unlearning in large-scale CBMs, with practical implications for interactive, trustworthy AI systems in critical domains.

Abstract

Concept Bottleneck Models (CBMs) have garnered much attention for their ability to elucidate the prediction process through a human-understandable concept layer. However, most previous studies focused on cases where the data, including concepts, are clean. In many scenarios, we need to remove training data or concepts from a trained CBM, or insert new ones, for reasons such as privacy concerns, data mislabelling, spurious concepts, and concept annotation errors. Thus, deriving efficient editable CBMs without retraining from scratch remains a challenge, particularly in large-scale applications. To address these challenges, we propose Editable Concept Bottleneck Models (ECBMs). Specifically, ECBMs support three different levels of data removal: concept-label-level, concept-level, and data-level. ECBMs enjoy mathematically rigorous closed-form approximations derived from influence functions that obviate the need for retraining. Experimental results demonstrate the efficiency and adaptability of our ECBMs, affirming their practical value in CBMs.

Paper Structure

This paper contains 40 sections, 21 theorems, 210 equations, 24 figures, and 2 tables.

Key Result

Theorem 4.2

The retrained concept predictor $\hat{g}_{e}$ defined by (concept-label:g) can be approximated by $\bar{g}_{e}$, defined by: [equation omitted in source] where $H_{\hat{g}} = \nabla^2_{\hat{g}} \sum_{i,j} G^j_C(x_i, c_i; \hat{g})$ is the Hessian matrix of the loss function with respect to $\hat{g}$.
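
As a generic illustration of the kind of influence-function approximation this theorem describes, the sketch below corrects a few concept labels in a toy setting and compares a single Hessian-based update against full retraining. It is not the paper's ECBM procedure: it assumes one binary concept predicted by L2-regularized logistic regression, uses an exact Hessian rather than EK-FAC, and all names (`fit`, `influence_edit`, `lam`, the synthetic data) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(theta, X, c, lam):
    """Gradient of the summed logistic loss plus (lam/2)*||theta||^2."""
    return X.T @ (sigmoid(X @ theta) - c) + lam * theta

def hessian(theta, X, lam):
    """Hessian of the same objective; it does not depend on the labels."""
    p = sigmoid(X @ theta)
    return (X * (p * (1.0 - p))[:, None]).T @ X + lam * np.eye(X.shape[1])

def fit(X, c, lam=1.0, iters=50):
    """Train from scratch with plain Newton iterations (toy-sized data only)."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        theta = theta - np.linalg.solve(hessian(theta, X, lam),
                                        grad(theta, X, c, lam))
    return theta

def influence_edit(theta, X, c_old, c_new, lam=1.0):
    """Approximate retraining on corrected labels with one Hessian solve.

    For the logistic loss, the per-sample gradient difference between the
    corrected and the original label is x * (c_old - c_new).
    """
    edited = c_old != c_new
    g_diff = X[edited].T @ (c_old[edited] - c_new[edited])
    return theta - np.linalg.solve(hessian(theta, X, lam), g_diff)

# Synthetic data (an assumption for illustration): 200 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
c = (X @ w_true + rng.normal(size=200) > 0).astype(float)

# Flip 5 concept labels to emulate correcting annotation errors.
c_new = c.copy()
idx = rng.choice(200, size=5, replace=False)
c_new[idx] = 1.0 - c_new[idx]

theta_hat = fit(X, c)                                # original predictor
theta_retrain = fit(X, c_new)                        # retrained from scratch
theta_approx = influence_edit(theta_hat, X, c, c_new)  # edited without retraining

print("distance original -> retrained:", np.linalg.norm(theta_hat - theta_retrain))
print("distance edited   -> retrained:", np.linalg.norm(theta_approx - theta_retrain))
```

Because the logistic Hessian does not depend on the labels, this update coincides with one Newton step on the corrected objective, which is why the edited parameters are expected to land much closer to the retrained ones than the original parameters do. For deep concept predictors the exact Hessian solve is impractical, which is where an approximation such as EK-FAC would come in.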

Figures (24)

  • Figure 1: An illustration of Editable Concept Bottleneck Models with three settings.
  • Figure 2: Impact of edition ratio on three settings on CUB dataset.
  • Figure 3: F1 score difference after removing most and least influential concepts given by ECBM.
  • Figure 4: RMIA scores of data before and after removal.
  • Figure 5: Concept-label-level ECBM
  • ...and 19 more figures

Theorems & Definitions (39)

  • Definition 4.1
  • Theorem 4.2
  • Theorem 4.3
  • Theorem 4.4
  • Lemma 4.5
  • Theorem 4.6
  • Theorem 4.7
  • Theorem 4.8
  • Theorem 4.1
  • proof
  • ...and 29 more