Integrating Clinical Knowledge into Concept Bottleneck Models
Winnie Pang, Xueyi Ke, Satoshi Tsutsui, Bihan Wen
TL;DR
Concept Bottleneck Models (CBMs) trained purely from data can pick up biases and degrade under domain shift. The paper introduces a clinical-knowledge-guided CBM framework that aligns the model's concept importance with clinician priorities. Each predicted concept $\hat{c}_l$ is perturbed by setting it to zero, and the resulting change in the prediction, $\Delta Y_{k,l} = |\hat{y}_k - \hat{y}_{(\hat{c}_l\to0)_k}|$, is used as the importance of concept $l$ for output $k$. Alignment losses on this measure apply separate constraints to concepts of high and low clinical importance. The approach is evaluated on white blood cell and skin image datasets, where it improves out-of-domain performance and alignment with expert knowledge while maintaining interpretability. The work offers a practical way to make interpretable medical imaging models more robust and transferable by building domain-specific clinical insight into the CBM training objective.
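To make the perturbation measure concrete, here is a minimal NumPy sketch of $\Delta Y_{k,l}$ for a single sample. It is an illustration under stated assumptions, not the paper's implementation: the label predictor is assumed to be a linear layer followed by a softmax, and the names `concept_importance`, `alignment_penalty`, `high_mask`, `low_mask`, and `margin` are hypothetical. The hinge-style penalty is one plausible way to encode "high-importance concepts should move the prediction more than low-importance ones"; the paper's actual loss may differ.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def concept_importance(c_hat, W, b):
    """Return delta[k, l] = |y_hat_k - y_hat_k with concept l set to 0|.

    c_hat: (L,) predicted concepts; W: (L, K) weights; b: (K,) bias.
    Assumption: a linear + softmax label predictor on top of the concepts.
    """
    y_hat = softmax(c_hat @ W + b)
    num_concepts, num_classes = W.shape
    delta = np.empty((num_classes, num_concepts))
    for l in range(num_concepts):
        c_pert = c_hat.copy()
        c_pert[l] = 0.0  # perturb one concept at a time
        delta[:, l] = np.abs(y_hat - softmax(c_pert @ W + b))
    return delta

def alignment_penalty(delta, high_mask, low_mask, margin=0.1):
    """Hypothetical hinge penalty, not the paper's exact loss.

    For each class, the mean importance of clinically high-priority concepts
    should exceed that of low-priority concepts by at least `margin`.
    Masks are boolean (K, L) arrays; each row needs at least one True entry.
    """
    mean_high = (delta * high_mask).sum(axis=1) / high_mask.sum(axis=1)
    mean_low = (delta * low_mask).sum(axis=1) / low_mask.sum(axis=1)
    return float(np.maximum(0.0, margin - (mean_high - mean_low)).sum())
```

In training, a penalty of this kind would be added to the usual concept and label losses, with the masks supplied from clinician annotations of which concepts matter for which class.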
Abstract
Concept bottleneck models (CBMs), which predict human-interpretable concepts (e.g., nucleus shapes in cell images) before predicting the final output (e.g., cell type), provide insights into the decision-making processes of the model. However, training CBMs solely in a data-driven manner can introduce undesirable biases, which may compromise prediction performance, especially when the trained models are evaluated on out-of-domain images (e.g., those acquired using different devices). To mitigate this challenge, we propose integrating clinical knowledge to refine CBMs, better aligning them with clinicians' decision-making processes. Specifically, we guide the model to prioritize the concepts that clinicians also prioritize. We validate our approach on two datasets of medical images: white blood cell and skin images. Empirical validation demonstrates that incorporating medical guidance enhances the model's classification performance on unseen datasets with varying preparation methods, thereby increasing its real-world applicability.
