Beyond the Veil of Similarity: Quantifying Semantic Continuity in Explainable AI
Qi Huang, Emanuele Mezzi, Osman Mutlu, Miltiadis Kofinas, Vidya Prasad, Shadnan Azwad Khan, Elena Ranguelova, Niki van Stein
TL;DR
This paper introduces semantic continuity as a quantitative criterion for XAI explanations: similar inputs should yield similar explanations. It defines a formal metric to measure this property and evaluates it in image classification using simple shape variations and a synthetic facial dataset, comparing the RISE, LIME, GradCAM, and KernelSHAP explainers. The findings indicate that GradCAM offers the strongest semantic continuity, closely followed by KernelSHAP; LIME tends to be less stable, and RISE shows middle-ground performance in several scenarios. The work provides a practical framework and metrics for assessing XAI explanations, supporting more reliable explainer selection.
Abstract
We introduce a novel metric for measuring semantic continuity in Explainable AI methods and machine learning models. We posit that for models to be truly interpretable and trustworthy, similar inputs should yield similar explanations, reflecting a consistent semantic understanding. By leveraging XAI techniques, we assess semantic continuity in the task of image recognition. We conduct experiments to observe how incremental changes in input affect the explanations provided by different XAI methods. Through this approach, we aim to evaluate the models' capability to generalize and abstract semantic concepts accurately, and to evaluate how correctly different XAI methods capture the model behaviour. This paper contributes to the broader discourse on AI interpretability by proposing a quantitative measure of semantic continuity for XAI methods, offering insights into the models' and explainers' internal reasoning processes, and promoting more reliable and transparent AI systems.
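The idea that incremental input changes should produce proportionate changes in explanations can be illustrated with a small sketch. The abstract does not give the formal metric, so the code below is only an assumed proxy: it correlates each input's distance from a reference input with the distance between the corresponding explanations. The function name `continuity_score`, the two toy explainers, and the synthetic 8x8 "images" are all hypothetical and are not the paper's method or data.

```python
import numpy as np


def continuity_score(inputs, explain):
    """Pearson correlation between input change and explanation change.

    Both changes are Euclidean distances to the first (reference) item
    in the sequence. Illustrative proxy only, not the paper's metric.
    """
    inputs = [np.asarray(x, dtype=float) for x in inputs]
    explanations = [np.asarray(explain(x), dtype=float) for x in inputs]
    ref_x, ref_e = inputs[0], explanations[0]
    d_in = np.array([np.linalg.norm(x - ref_x) for x in inputs])
    d_ex = np.array([np.linalg.norm(e - ref_e) for e in explanations])
    return float(np.corrcoef(d_in, d_ex)[0, 1])


# Hypothetical toy data: an 8x8 "image" shifted step by step along one direction.
rng = np.random.default_rng(0)
base = rng.random((8, 8))
direction = rng.random((8, 8))
sequence = [base + t * direction for t in np.linspace(0.0, 1.0, 11)]

weights = rng.random((8, 8))


def smooth_explainer(x):
    # Hypothetical saliency map that varies linearly with the input.
    return weights * x


def jumpy_explainer(x):
    # Hypothetical saliency map that flips pixels on abruptly at a threshold.
    return (x > 1.0).astype(float)


smooth = continuity_score(sequence, smooth_explainer)
jumpy = continuity_score(sequence, jumpy_explainer)
print(f"smooth explainer: {smooth:.3f}, jumpy explainer: {jumpy:.3f}")
```

Under this proxy, an explainer whose output changes in proportion to the input scores close to 1, while one that changes abruptly would be expected to score lower. A real evaluation would replace the toy explainers with actual attribution methods applied to a trained model.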
