Editable Concept Bottleneck Models

Lijie Hu; Chenyang Ren; Zhengyu Hu; Hongbin Lin; Cheng-Long Wang; Hui Xiong; Jingfeng Zhang; Di Wang

TL;DR

Editable Concept Bottleneck Models (ECBMs) address the need to delete or insert data and concepts in Concept Bottleneck Models without costly retraining. By leveraging influence functions and EK-FAC, ECBMs provide closed-form approximations, with theoretical error bounds, for three editing levels: concept-label-level, concept-level, and data-level. Across OAI, CUB, and CelebA, ECBMs achieve accuracy close to that of full retraining while substantially reducing runtime and improving the interpretability of concept importance. This work enables privacy-preserving edits, rapid concept updates, and robust unlearning in large-scale CBMs, with practical implications for interactive, trustworthy AI systems in critical domains.

Abstract

Concept Bottleneck Models (CBMs) have garnered much attention for their ability to elucidate the prediction process through a human-understandable concept layer. However, most previous studies focused on cases where the data, including concepts, are clean. In many scenarios, we need to remove or insert training data or concepts in trained CBMs for reasons such as privacy concerns, data mislabelling, spurious concepts, and concept annotation errors. Thus, deriving efficient editable CBMs without retraining from scratch remains a challenge, particularly in large-scale applications. To address these challenges, we propose Editable Concept Bottleneck Models (ECBMs). Specifically, ECBMs support three different levels of data removal: concept-label-level, concept-level, and data-level. ECBMs enjoy mathematically rigorous closed-form approximations derived from influence functions that obviate the need for retraining. Experimental results demonstrate the efficiency and adaptability of our ECBMs, affirming their practical value in CBMs.

Paper Structure

This paper contains 40 sections, 21 theorems, 210 equations, 24 figures, 2 tables.

Introduction
Related Work
Preliminaries
Editable Concept Bottleneck Models
Concept Label-level Editable CBM
Concept-level Editable CBM
Data-level Editable CBM
Experiments
Experimental Settings
Evaluation of Utility and Editing Efficiency
Results on Interpretability
Conclusion
Impact Statement
Notation Table
Influence Function
...and 25 more sections

Key Result

Theorem 4.2

The retrained concept predictor $\hat{g}_{e}$ defined by (concept-label:g) can be approximated by a closed-form estimate $\bar{g}_{e}$ expressed in terms of $H_{\hat{g}}$, where $H_{\hat{g}} = \nabla^2_{\hat{g}} \sum_{i,j} G^j_C(x_i, c_i; \hat{g})$ is the Hessian matrix of the loss function with respect to $\hat{g}$.
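
The idea of approximating a retrained predictor with a Hessian-based update can be illustrated on a toy problem. The following is a minimal sketch, not the paper's estimator: it assumes a linear concept predictor trained with a ridge-regularised squared loss and applies a generic one-step influence-function (Newton) correction after a few concept labels are edited. All names (`fit_ridge`, `edit_concept_labels`, `lam`, the synthetic data) are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_ridge(X, C, lam):
    """Minimise 0.5*||X @ Theta - C||^2 + 0.5*lam*||Theta||^2 in closed form."""
    d = X.shape[1]
    # Hessian of the loss w.r.t. each column of Theta (same for every concept).
    H = X.T @ X + lam * np.eye(d)
    return np.linalg.solve(H, X.T @ C), H

def data_grad(X, C, Theta):
    """Gradient of the squared-error term, summed over all samples."""
    return X.T @ (X @ Theta - C)

def edit_concept_labels(Theta, H, X, C_old, C_new):
    """One Newton-style step: Theta - H^{-1} (grad with new labels - grad with old labels)."""
    delta = data_grad(X, C_new, Theta) - data_grad(X, C_old, Theta)
    return Theta - np.linalg.solve(H, delta)

# Synthetic setup (assumed): n samples, d input features, k concepts.
rng = np.random.default_rng(0)
n, d, k = 200, 5, 3
X = rng.normal(size=(n, d))
C = rng.normal(size=(n, k))
lam = 1e-2

Theta_hat, H = fit_ridge(X, C, lam)

# Edit a handful of (sample index, concept index) labels.
C_edit = C.copy()
for w, r in [(3, 0), (17, 2), (42, 1)]:
    C_edit[w, r] += 1.0

Theta_bar = edit_concept_labels(Theta_hat, H, X, C, C_edit)  # edited without retraining
Theta_retrained, _ = fit_ridge(X, C_edit, lam)                # reference: full retraining

max_gap = np.max(np.abs(Theta_bar - Theta_retrained))
print("max |edited - retrained| =", max_gap)
```

For this quadratic loss the Hessian does not depend on the labels, so the single step coincides with retraining up to floating-point error. For non-quadratic losses such as cross-entropy, the same step would only be an approximation, which is the regime where error bounds become relevant.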

Figures (24)

Figure 1: An illustration of Editable Concept Bottleneck Models with three settings.
Figure 2: Impact of edition ratio on three settings on CUB dataset.
Figure 3: F1 score difference after removing most and least influential concepts given by ECBM.
Figure 4: RMIA scores of data before and after removal.
Figure 5: Concept-label-level ECBM
...and 19 more figures

Theorems & Definitions (39)

Definition 4.1
Theorem 4.2
Theorem 4.3
Theorem 4.4
Lemma 4.5
Theorem 4.6
Theorem 4.7
Theorem 4.8
Theorem 4.1
proof
...and 29 more
