Towards Faithful Multimodal Concept Bottleneck Models

Pierre Moreau; Emeline Pineau Ferrand; Yann Choho; Benjamin Wong; Annabelle Blangero; Milan Bhan

Abstract

Concept Bottleneck Models (CBMs) are interpretable models that route predictions through a layer of human-interpretable concepts. While widely studied in vision and, more recently, in NLP, CBMs remain largely unexplored in multimodal settings. For their explanations to be faithful, CBMs must satisfy two conditions: concepts must be properly detected, and concept representations must encode only their intended semantics, without smuggling extraneous task-relevant or inter-concept information into final predictions, a phenomenon known as leakage. Existing approaches treat concept detection and leakage mitigation as separate problems, and typically improve one at the expense of predictive accuracy. In this work, we introduce f-CBM, a faithful multimodal CBM framework built on a vision-language backbone that jointly targets both aspects through two complementary strategies: a differentiable leakage loss to mitigate leakage, and a Kolmogorov-Arnold Network prediction head that provides sufficient expressiveness to improve concept detection. Experiments demonstrate that f-CBM achieves the best trade-off between task accuracy, concept detection, and leakage reduction, while applying seamlessly to datasets that pair images with text as well as to text-only datasets, making it versatile across modalities.
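
To make the bottleneck idea in the abstract concrete, the following is a minimal sketch of a generic CBM forward pass, not the f-CBM implementation. It assumes a plain linear head in place of the Kolmogorov-Arnold Network head, omits the leakage loss entirely, and uses hypothetical toy weights and a hand-written feature vector standing in for a backbone embedding. All function and variable names are illustrative.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_concepts(features, concept_weights, concept_biases):
    """Map input features to concept activations in (0, 1)."""
    return [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(concept_weights, concept_biases)
    ]


def predict_label(concepts, head_weights, head_biases):
    """Score each class from the concept activations only, then take the argmax."""
    scores = [
        sum(w * c for w, c in zip(row, concepts)) + b
        for row, b in zip(head_weights, head_biases)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical toy parameters (assumptions for illustration only).
features = [0.9, 0.1, 0.4]            # stands in for a backbone embedding
concept_weights = [[4.0, -2.0, 0.0],  # concept 0
                   [-3.0, 5.0, 1.0]]  # concept 1
concept_biases = [0.0, 0.0]
head_weights = [[2.0, -1.0],          # class 0
                [-1.0, 2.0]]          # class 1
head_biases = [0.0, 0.0]

# The label is computed from the concepts alone: the bottleneck.
concepts = predict_concepts(features, concept_weights, concept_biases)
label, scores = predict_label(concepts, head_weights, head_biases)

# Overwriting the concept activations by hand (an intervention) changes
# the prediction, because the head never sees the raw features.
intervened_label, _ = predict_label([0.0, 1.0], head_weights, head_biases)
```

Because the head reads only the concept activations, an explanation in terms of concepts is faithful only if those activations carry nothing beyond their intended meaning, which is the leakage concern the abstract raises.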

Paper Structure (31 sections, 7 equations, 7 figures, 2 tables)

Introduction
Background and Related Work
Background
Concept-based XAI.
Concept Bottleneck Models.
Related Work
Multimodality and Concept Bottleneck Models.
Addressing CBM Faithfulness.
Preliminary Analysis: mCBM Faithfulness Factors Interplay
A Baseline Joint mCBM Implementation
Dataset
Dataset Concept Annotation
mCBM Training
Studied mCBM Faithfulness Metrics
Findings
...and 16 more sections

Figures (7)

Figure 1: Pareto frontier: concept detection accuracy versus aggregate leakage. The x-axis represents the average of task-related and inter-concept leakage as introduced in the Background and Related Work section, and the y-axis represents RMSE concept detection performance.
Figure 2: Leakage analysis in multimodal CBMs
Figure 3: Overview of f-CBM, illustrated on an instance from the N24News dataset belonging to the Sport category.
Figure 4: Ablation study of f-CBM: effect of the KAN layer and the leakage loss on task accuracy, concept RMSE, and leakage (N24 dataset, CLIP-base backbone).
Figure 5: Effect of the leakage loss on concept activation distributions. Early in training (left), activations separate by predicted class, revealing concept-task leakage. Later (right), concept detection has improved while the leakage loss has reduced the class information encoded in the activations, mitigating leakage.
...and 2 more figures
