Beyond single-model XAI: aggregating multi-model explanations for enhanced trustworthiness

Ilaria Vascotto; Alex Rodriguez; Alessandro Bonaita; Luca Bortolussi

TL;DR

Trustworthy AI requires explanations that remain stable under input variations and model disagreement. This work derives feature attributions specific to $k$-nearest neighbours (kNN) and random forests (RF), pairs them with DeepLIFT explanations for neural networks (NN), and aggregates the attributions across models into a single explanation. A local robustness estimator based on on-manifold neighbourhood perturbations assesses stability, while a medoid-based perturbation scheme preserves the data distribution and the model predictions. Across five binary tabular datasets, the aggregation yields a conservative yet informative explanation, and NN explanations are typically more robust than kNN ones, suggesting that multi-model explanation aggregation can enhance trust in high-stakes settings.

Abstract

The use of Artificial Intelligence (AI) models in real-world and high-risk applications has intensified the discussion about their trustworthiness and ethical use, from both a technical and a legislative perspective. The field of eXplainable Artificial Intelligence (XAI) addresses this challenge by proposing explanations that bring to light the decision-making processes of complex black-box models. Although robustness is an essential property of explanations, it is often overlooked during development: only robust explanation methods can increase trust in the system as a whole. This paper investigates the role of robustness through a feature importance aggregation derived from multiple models ($k$-nearest neighbours, random forest and neural networks). Preliminary results showcase the potential for increasing the trustworthiness of the application while leveraging the predictive power of multiple models.
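To make the idea of aggregating feature importances across models concrete, the following is a minimal sketch and not the paper's method: it assumes each model has already produced a per-feature attribution vector for the same input, rescales each vector to unit L1 norm so that models with different attribution scales are comparable, and averages them. The function names `normalize` and `aggregate`, the averaging rule, and the example numbers are all illustrative assumptions.

```python
import numpy as np


def normalize(attr):
    """Scale an attribution vector so its absolute values sum to 1.

    A zero vector is returned unchanged to avoid division by zero.
    """
    attr = np.asarray(attr, dtype=float)
    total = np.abs(attr).sum()
    return attr / total if total > 0 else attr


def aggregate(attributions):
    """Average L1-normalized attributions across models.

    The plain mean is an assumed aggregation rule chosen for illustration;
    the paper's actual rule is not specified in this summary.
    """
    stacked = np.stack([normalize(a) for a in attributions])
    return stacked.mean(axis=0)


# Hypothetical attributions for one input with three features.
knn_attr = [0.2, -0.1, 0.1]
rf_attr = [0.5, 0.0, 0.5]
nn_attr = [2.0, -1.0, 1.0]

agg = aggregate([knn_attr, rf_attr, nn_attr])
print(agg)
```

Normalizing before averaging keeps a model with large raw attribution magnitudes (here the hypothetical NN) from dominating the aggregate; other choices, such as rank-based or sign-agreement aggregation, would be equally plausible under this summary.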
