A Quantitative Evaluation Framework for Explainable AI in Semantic Segmentation

Reem Hammoud, Abdul Karim Gizzini, Ali J. Ghandour

TL;DR

The paper presents a quantitative evaluation framework for explainable AI in semantic segmentation, addressing the gap where XAI assessment has been largely qualitative or classification-focused. It introduces four evaluation strategies (S1–S3 with GT and PM variants) and pixel-level metrics to measure how explanations influence segmentation decisions, validated through CAM-based XAI methods on a building-structure dataset. Results show Score-CAM generally provides the most reliable and precise explanations, while incorporating surrounding context around target regions improves predictive fidelity. The framework offers a general, robust approach to auditing interpretability in safety-critical segmentation tasks, with potential applicability to medical imaging and other domains requiring trustworthy AI explanations.

Abstract

Ensuring transparency and trust in artificial intelligence (AI) models is essential as they are increasingly deployed in safety-critical and high-stakes domains. Explainable AI (XAI) has emerged as a promising approach to address this challenge; however, the rigorous evaluation of XAI methods remains vital for balancing the trade-offs between model complexity, predictive performance, and interpretability. While substantial progress has been made in evaluating XAI for classification tasks, strategies tailored to semantic segmentation remain limited. Moreover, objectively assessing XAI approaches is difficult, since qualitative visual explanations provide only preliminary insights. Such qualitative methods are inherently subjective and cannot ensure the accuracy or stability of explanations. To address these limitations, this work introduces a comprehensive quantitative evaluation framework for assessing XAI in semantic segmentation, accounting for both spatial and contextual task complexities. The framework systematically integrates pixel-level evaluation strategies with carefully designed metrics to yield fine-grained interpretability insights. Simulation results using recently adapted class activation mapping (CAM)-based XAI schemes demonstrate the efficiency, robustness, and reliability of the proposed methodology. These findings advance the development of transparent, trustworthy, and accountable semantic segmentation models.

Paper Structure

This paper contains 9 sections, 1 figure, and 8 tables.

Figures (1)

  • Figure 1: Proposed XAI Evaluation Strategies.