Editable Concept Bottleneck Models
Lijie Hu, Chenyang Ren, Zhengyu Hu, Hongbin Lin, Cheng-Long Wang, Hui Xiong, Jingfeng Zhang, Di Wang
TL;DR
Editable Concept Bottleneck Models (ECBMs) address the need to delete or insert data and concepts in Concept Bottleneck Models without costly retraining. By leveraging influence functions and EK-FAC, ECBMs provide closed-form approximations, with theoretical error bounds, for three editing levels: concept-label-level, concept-level, and data-level. Across OAI, CUB, and CelebA, ECBMs achieve accuracy close to that of full retraining while substantially reducing runtime and making concept importance easier to interpret. This work enables privacy-preserving edits, rapid concept updates, and robust unlearning in large-scale CBMs, with practical implications for interactive, trustworthy AI systems in critical domains.
Abstract
Concept Bottleneck Models (CBMs) have garnered much attention for their ability to elucidate the prediction process through a human-understandable concept layer. However, most previous studies have focused on cases where the data, including concepts, are clean. In many scenarios, we need to remove training data or concepts from trained CBMs, or insert new ones, for reasons such as privacy concerns, data mislabelling, spurious concepts, and concept annotation errors. Thus, deriving efficient editable CBMs without retraining from scratch remains a challenge, particularly in large-scale applications. To address these challenges, we propose Editable Concept Bottleneck Models (ECBMs). Specifically, ECBMs support three different levels of data removal: concept-label-level, concept-level, and data-level. ECBMs enjoy mathematically rigorous closed-form approximations derived from influence functions that obviate the need for retraining. Experimental results demonstrate the efficiency and adaptability of our ECBMs, affirming their practical value in CBMs.
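
To make the idea of an influence-function edit concrete, the sketch below applies the generic data-level removal update to a ridge regression model rather than to a CBM. It is an illustrative assumption, not the ECBM procedure: the model, the function names (fit_ridge, influence_remove), and the synthetic data are all hypothetical, and the Hessian is inverted exactly instead of being approximated with EK-FAC. The update adds the inverse Hessian of the full objective times the gradient of the removed samples' loss to the trained parameters, and the result is compared against retraining from scratch.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimise 0.5*||X @ theta - y||^2 + 0.5*lam*||theta||^2 in closed form."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def influence_remove(X, y, theta, lam, idx):
    """Approximate the parameters after deleting rows `idx`, without retraining.

    Generic influence-function update (an illustration, not the ECBM algorithm):
        theta_edited = theta + H^{-1} @ g
    where H is the Hessian of the full objective at theta and g is the
    gradient of the removed samples' loss at theta.
    """
    d = X.shape[1]
    H = X.T @ X + lam * np.eye(d)      # Hessian of the full ridge objective
    Xr, yr = X[idx], y[idx]
    g = Xr.T @ (Xr @ theta - yr)       # gradient of the removed samples' loss
    return theta + np.linalg.solve(H, g)

# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=500)
lam = 1.0

theta = fit_ridge(X, y, lam)
remove = np.array([3, 17, 42])
keep = np.setdiff1d(np.arange(X.shape[0]), remove)

theta_retrained = fit_ridge(X[keep], y[keep], lam)          # ground truth
theta_edited = influence_remove(X, y, theta, lam, remove)   # no retraining

# Distance to the retrained model, with and without the edit.
err_edit = np.linalg.norm(theta_edited - theta_retrained)
err_noedit = np.linalg.norm(theta - theta_retrained)
print(f"without edit: {err_noedit:.2e}, with influence edit: {err_edit:.2e}")
```

In this toy setting the edited parameters should land much closer to the retrained ones than the unedited parameters do. The remaining gap comes from using the Hessian of the full objective rather than that of the reduced dataset, which is the kind of approximation error that the paper's theoretical bounds are meant to control.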
