Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts

Andong Tan, Fengtao Zhou, Hao Chen

TL;DR

OpenCBM tackles the fixed-concept limitation of conventional concept bottleneck models by enabling open vocabulary concepts through prototype-based alignment with CLIP and reconstruction of the prediction head from user-provided text concepts. The method jointly trains a CNN-based feature extractor and a classifier, aligning its features to CLIP prototypes, and supports iterative discovery of missing concepts to recover performance, as well as post-training concept removal/addition/replacement. It achieves a notable 9% accuracy improvement on CUB-200-2011 over prior CBMs and demonstrates unique post-deployment interpretability and knowledge-discovery capabilities, while remaining agnostic to the underlying vision-language model. The approach broadens the practical utility of CBMs by enabling flexible, human-guided explanations and edits without re-training, with potential applicability to other vision-language pre-trained models.

Abstract

The concept bottleneck model (CBM) is an interpretable-by-design framework that makes decisions by first predicting a set of interpretable concepts and then predicting the class label from those concepts. Existing CBMs are trained with a fixed set of concepts (either annotated in the dataset or queried from language models). However, this closed-world assumption is unrealistic in practice, as users may wonder about the role of any desired concept in decision-making after the model is deployed. Inspired by the success of recent vision-language pre-trained models such as CLIP in zero-shot classification, we propose "OpenCBM" to equip the CBM with open vocabulary concepts via: (1) aligning the feature space of a trainable image feature extractor with that of CLIP's image encoder via prototype-based feature alignment; (2) simultaneously training an image classifier on the downstream dataset; (3) reconstructing the trained classification head from any set of user-desired textual concepts encoded by CLIP's text encoder. To reveal concepts potentially missed by users, we further propose to iteratively find the concept embedding closest to the residual parameters during the reconstruction until the residual is small enough. To the best of our knowledge, "OpenCBM" is the first CBM with open vocabulary concepts, giving users unique benefits such as removing, adding, or replacing any desired concept to explain the model's prediction even after the model is trained. Moreover, our model significantly outperforms the previous state-of-the-art CBM by 9% in classification accuracy on the benchmark dataset CUB-200-2011.
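
Step (3) and the iterative residual search can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not state the reconstruction objective, so ordinary least squares is assumed here, "closest" is assumed to mean highest cosine similarity, and the names `reconstruct_head`, `discover_missing_concepts`, `candidate_bank`, `tol`, and `max_new` are hypothetical. In practice the concept rows would be CLIP text embeddings; toy vectors stand in for them below.

```python
import numpy as np

def reconstruct_head(v_c, concepts):
    """Fit one class weight vector v_c (d,) as a linear combination of
    concept embeddings (k, d). Least squares is an assumed objective.
    Returns the concept weights (k,) and the residual (d,)."""
    weights, *_ = np.linalg.lstsq(concepts.T, v_c, rcond=None)
    residual = v_c - concepts.T @ weights
    return weights, residual

def discover_missing_concepts(v_c, concepts, candidate_bank, tol=1e-3, max_new=10):
    """Greedily add the candidate embedding with the highest cosine
    similarity to the current residual until the residual norm is
    below tol or max_new candidates have been added."""
    chosen = []
    current = concepts.copy()
    weights, residual = reconstruct_head(v_c, current)
    bank_norms = np.linalg.norm(candidate_bank, axis=1)
    while np.linalg.norm(residual) > tol and len(chosen) < max_new:
        sims = candidate_bank @ residual / (bank_norms * np.linalg.norm(residual))
        if chosen:
            sims[chosen] = -np.inf  # never pick the same candidate twice
        best = int(np.argmax(sims))
        chosen.append(best)
        current = np.vstack([current, candidate_bank[best]])
        weights, residual = reconstruct_head(v_c, current)
    return chosen, weights, residual

# Toy example: orthonormal vectors stand in for CLIP text embeddings.
basis = np.eye(8)
user_concepts = basis[:3]          # concepts the user supplied
candidate_bank = basis[3:]         # pool of other candidate concepts
v_c = 0.5 * basis[0] - 1.0 * basis[1] + 2.0 * basis[2] + 3.0 * basis[5]

chosen, weights, residual = discover_missing_concepts(v_c, user_concepts, candidate_bank)
print("added candidate indices:", chosen)
print("concept weights:", np.round(weights, 3))
print("residual norm:", float(np.linalg.norm(residual)))
```

In the toy example the head contains a component along one candidate that the user did not supply, so the first fit leaves a residual pointing at that candidate and the search adds it. Removing or replacing a concept under the same assumptions amounts to dropping or swapping a row of the concept matrix and refitting, with no retraining of the feature extractor.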

Paper Structure (18 sections, 10 equations, 6 figures, 2 tables, 1 algorithm)

Figures (6)

  • Figure 1: Existing concept bottleneck models are trained with a fixed set of concepts, so users can understand the model's reasoning only through that static set. Our OpenCBM offers open vocabulary concepts, allowing users to flexibly choose any set of concepts for the model reasoning without re-training a new model.
  • Figure 2: Our method works by training a standard network (e.g., ResNet) while aligning its feature space with CLIP's feature space via a class-prototype-based feature space alignment. At inference time, we propose to leverage any user-desired concept set (e.g., $k$ concepts) encoded by CLIP to reconstruct the trained classification head $\mathbf{v}_c$. The residual parameters after the reconstruction encode the missing concepts not queried by the user.
  • Figure 3: Accuracy change of 20 classes after removing the class name "Sooty Albatross" using the technique of section \ref{sec:remove_from_unknown}. The third pair of bars shows a large accuracy drop for "Sooty Albatross". The changes in other classes demonstrate the inference correlation between classes, which are positively or negatively correlated with the given class.
  • Figure 4: Visualizations of the learned concept importance for different classes. The original concept set has on average 3 concepts per class. After adding concepts, each class has 5 concepts. Concepts in red are newly added.
  • Figure 5: Illustration of an inference process after adding more concepts. Concepts in bold are generated by asking an LLM for concepts relevant to the class "yellow-breasted chat". Red indicates newly added concepts.
  • ...and 1 more figures