Semi-supervised Concept Bottleneck Models

Lijie Hu, Tianhao Huang, Huanyi Xie, Xilin Gong, Chenyang Ren, Zhengyu Hu, Lu Yu, Ping Ma, Di Wang

TL;DR

This work addresses the dependency of Concept Bottleneck Models on costly concept annotations and the misalignment between concept saliency and input features. It introduces SSCBM, a semi-supervised framework that leverages unlabeled data via KNN-based pseudo-labels, refines them with an image–concept alignment heatmap, and trains jointly with labeled data under the objective $\mathcal{L} = \mathcal{L}_{task} + \lambda_1 \mathcal{L}_c + \lambda_2 \mathcal{L}_{align}$. By combining a Label Anchor with a Concept Embedding Encoder and Unlabel Alignment via heatmaps, SSCBM delivers strong concept and task accuracy with as little as 10% labeled data, closely matching fully supervised baselines across four datasets while also enabling test-time intervention for improved interpretability. The method makes CBMs more practical to deploy by improving annotation efficiency and the faithfulness of explanations, with clear benefits for domains requiring transparent AI systems.
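The joint objective above weights a concept-prediction loss and an alignment loss against the task loss. A minimal sketch of how the three terms combine (the function name and the default $\lambda$ values here are illustrative placeholders, not the paper's tuned settings):

```python
def sscbm_objective(l_task, l_concept, l_align, lam1=0.5, lam2=0.5):
    """Combine the three SSCBM loss terms:
    L = L_task + lam1 * L_c + lam2 * L_align."""
    return l_task + lam1 * l_concept + lam2 * l_align
```

In practice each term would be a differentiable scalar (e.g. a cross-entropy for the task, a binary cross-entropy over concepts), so the weighted sum can be backpropagated through the whole model.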

Abstract

Concept Bottleneck Models (CBMs) have garnered increasing attention due to their ability to provide concept-based explanations for black-box deep learning models while achieving high final prediction accuracy using human-like concepts. However, the training of current CBMs is heavily dependent on the precision and richness of the annotated concepts in the dataset. These concept labels are typically provided by experts, which can be costly and require significant resources and effort. Additionally, concept saliency maps frequently misalign with input saliency maps, causing concept predictions to correspond to irrelevant input features, an issue related to annotation alignment. To address these limitations, we propose a new framework called SSCBM (Semi-supervised Concept Bottleneck Model). Our SSCBM is suitable for practical situations where annotated data is scarce. By leveraging joint training on both labeled and unlabeled data and aligning the unlabeled data at the concept level, we effectively solve these issues. We propose a strategy to generate pseudo labels and an alignment loss. Experiments demonstrate that our SSCBM is both effective and efficient. With only 10% labeled data, our model's concept and task accuracy on average across four datasets are only 2.44% and 3.93% lower, respectively, compared to the best baseline in the fully supervised learning setting.
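The pseudo-label strategy described above assigns concept labels to unlabeled samples from their nearest labeled neighbors in embedding space. The sketch below is one plausible reading of the KNN step under stated assumptions (Euclidean distance, soft labels as the mean of neighbors' binary concept vectors); the function name and details are illustrative, not the authors' exact implementation:

```python
import numpy as np

def knn_pseudo_concepts(z_unlabeled, z_labeled, c_labeled, k=3):
    """For each unlabeled embedding, average the concept labels of its
    k nearest labeled embeddings (Euclidean distance) to form a soft
    pseudo concept label in [0, 1]^m."""
    pseudo = []
    for z in z_unlabeled:
        dists = np.linalg.norm(z_labeled - z, axis=1)  # distance to each labeled point
        neighbors = np.argsort(dists)[:k]              # indices of k nearest
        pseudo.append(c_labeled[neighbors].mean(axis=0))
    return np.array(pseudo)
```

In SSCBM these pseudo-labels are then refined with the image–concept alignment heatmap before entering the joint training loss, so the KNN output is a starting point rather than the final supervision signal.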


Paper Structure

This paper contains 22 sections, 10 equations, 19 figures, 5 tables.

Figures (19)

  • Figure 1: (a) A sample of sparrow class with complete concept labels. (b) A sample of sparrow class with incomplete concept labels. (c) A sample of misalignment between input features and concepts resulting from existing CBM methods. Our framework simultaneously utilizes both (a) and (b) types of data and addresses the issue of (c) through an alignment loss.
  • Figure 2: Overall framework of our proposed SSCBM.
  • Figure 3: The concept saliency map for the CUB dataset (savannah sparrow) demonstrates that our proposed SSCBM achieves meaningful alignment between the ground truth concepts and the input image features. The first image on the left is the original input image. The three images on the right show the aligned regions for different concepts using SSCBM. The text below each image indicates the specific concept, the ground truth concept label, and the prediction result given by SSCBM.
  • Figure 4: Left: Performance with different ratios of intervened concepts on CUB dataset. Right: An example of successful intervention.
  • Figure 5: Test-time Intervention on CUB and AwA2 dataset.
  • ...and 14 more figures