Explainable artificial intelligence (XAI): from inherent explainability to large language models

Fuseini Mumuni, Alhassan Mumuni

TL;DR

This survey maps the Explainable AI landscape from inherently interpretable white-box models to explanations of black-box systems, with a dedicated focus on how large language models and vision-language models augment explainability. It dissects methodological families (gradient-based, CAM, SHAP/LIME, attention, counterfactuals, case-based and concept-based reasoning) and highlights faithfulness, evaluation challenges, and domain-specific tradeoffs. A substantial portion covers LLM- and VLM-based approaches for local and global explanations, including prompting strategies, retrieval augmentation, mechanistic interpretability, and concept discovery. The paper also surveys recent advances in using LLMs to translate and justify explanations, and discusses open issues, benchmarking gaps, and future directions for reconciling explainability with scale and reliability in real-world AI systems.

Abstract

Artificial Intelligence (AI) has continued to achieve tremendous success in recent times. However, the decision logic of AI systems is often not transparent, making it difficult for stakeholders to understand, interpret or explain their behavior. This limitation hinders trust in machine learning systems and causes a general reluctance towards their adoption in practical applications, particularly in mission-critical domains like healthcare and autonomous driving. Explainable AI (XAI) techniques facilitate the explainability or interpretability of machine learning models, enabling users to discern the basis of a decision and possibly avert undesirable behavior. This comprehensive survey details the advancements of explainable AI methods, from inherently interpretable models to modern approaches for achieving interpretability of various black-box models, including large language models (LLMs). Additionally, we review explainable AI techniques that leverage LLM and vision-language model (VLM) frameworks to automate or improve the explainability of other machine learning models. In particular, using LLMs and VLMs as interpretability methods enables high-level, semantically meaningful explanations of model decisions and behavior. Throughout the paper, we highlight the scientific principles, strengths and weaknesses of state-of-the-art methods and outline different areas of improvement. Where appropriate, we also present qualitative and quantitative results comparing various methods. Finally, we discuss the key challenges of XAI and directions for future research.

Paper Structure (65 sections, 2 equations, 23 figures, 5 tables)

Figures (23)

  • Figure 1: Illustration of (a) a white-box model, and (b), (c) black-box models made explainable by post-hoc and ante-hoc methods, respectively. The circled numbers illustrate the sequence of operations from input to prediction and explanations.
  • Figure 2: Accuracy-interpretability relationship for different families of machine learning models. The illustration places simple rule-based and linear models at the lower end of the accuracy spectrum, but this assumes that the models are applied to complex problems. On simpler problems, these simple models achieve competitive results.
  • Figure 3: Outline of paper.
  • Figure 4: Generalized additive model composed of smoothed linear and non-linear functions.
  • Figure 5: A simple representation of a decision tree showing hierarchical rule-based structure.
  • ...and 18 more figures