CoPA: Hierarchical Concept Prompting and Aggregating Network for Explainable Diagnosis

Yiheng Dong; Yi Lin; Xin Yang

TL;DR

CoPA addresses the interpretability gap in medical image diagnosis by capturing fine-grained, multiscale concepts across multiple encoder layers. It introduces the Concept-aware Embedding Generator (CEG) to distill layer-wise concept representations and Concept Prompt Tuning (CPT) to guide feature extraction without degrading the pretrained backbone, followed by a gated aggregation and contrastive cross-modal alignment with textual concepts. The approach yields state-of-the-art results on three dermoscopy/clinical datasets and provides faithful, understandable, and plausible explanations through concept heatmaps and a transparent prediction workflow. The framework demonstrates that hierarchical concept prompting and aggregation can enhance both diagnostic accuracy and interpretability in medical imaging, with practical implications for clinical deployment.

Abstract

The transparency of deep learning models is essential for clinical diagnostics. Concept Bottleneck Models provide clear decision-making processes for diagnosis by transforming the latent space of black-box models into human-understandable concepts. However, concept-based methods are still limited in their ability to capture concepts. They often encode features solely from the final layer, neglecting shallow and multiscale features, and they lack effective guidance during concept encoding, which hinders fine-grained concept extraction. To address these issues, we introduce Concept Prompting and Aggregating (CoPA), a novel framework designed to capture multilayer concepts under prompt guidance. The framework uses the Concept-aware Embedding Generator (CEG) to extract concept representations from each layer of the visual encoder. These representations simultaneously serve as prompts for Concept Prompt Tuning (CPT), steering the model towards amplifying critical concept-related visual cues. Visual representations from each layer are then aggregated and aligned with textual concept representations. In this way, valuable concept-wise information in the images is captured and used effectively, improving both concept and disease prediction. Extensive experimental results demonstrate that CoPA outperforms state-of-the-art methods on three public datasets. Code is available at https://github.com/yihengd/CoPA.
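The aggregation and alignment steps described above can be illustrated with a small sketch. It is not the authors' implementation: the sigmoid gate, the normalised weighted sum, and the names `gated_aggregate` and `concept_scores` are assumptions chosen for illustration, and the abstract does not specify the exact gating or loss. The sketch combines per-layer embeddings of one concept into a single visual embedding, then scores it against a textual concept embedding by cosine similarity.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_aggregate(layer_embeddings, gate_logits):
    """Combine per-layer embeddings of one concept into one vector.

    Assumption: each layer gets a scalar gate sigmoid(logit), and the
    gates are normalised to sum to 1 before the weighted sum.
    """
    gates = [sigmoid(g) for g in gate_logits]
    total = sum(gates)
    dim = len(layer_embeddings[0])
    return [
        sum(g * emb[d] for g, emb in zip(gates, layer_embeddings)) / total
        for d in range(dim)
    ]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def concept_scores(visual_concepts, textual_concepts):
    """Score each aggregated visual concept against its textual counterpart."""
    return [cosine(v, t) for v, t in zip(visual_concepts, textual_concepts)]


# Toy example: one concept seen at two encoder layers, with equal gates.
visual = gated_aggregate([[1.0, 0.0], [0.0, 1.0]], gate_logits=[0.0, 0.0])
scores = concept_scores([visual], [[1.0, 1.0]])
```

In a trained model the gate logits and embeddings would be learned, and the similarity scores would feed a contrastive objective and the downstream disease classifier; here they are fixed toy values.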
