Explainable artificial intelligence (XAI): from inherent explainability to large language models
Fuseini Mumuni, Alhassan Mumuni
TL;DR
This survey maps the explainable AI landscape from inherently interpretable white-box models to explanations of black-box systems, with a dedicated focus on how large language models and vision-language models augment explainability. It dissects methodological families (gradient-based, CAM, SHAP/LIME, attention, counterfactuals, case-based and concept-based reasoning) and highlights faithfulness, evaluation challenges, and domain-specific tradeoffs. A substantial portion covers LLM- and VLM-based approaches for local and global explanations, including prompting strategies, retrieval augmentation, mechanistic interpretability, and concept discovery. The paper also surveys recent advances in using LLMs to translate and justify explanations, and discusses open issues, benchmarking gaps, and future directions for reconciling explainability with scale and reliability in real-world AI systems.
Abstract
Artificial intelligence (AI) has continued to achieve tremendous success in recent times. However, the decision logic of AI systems is often not transparent, making it difficult for stakeholders to understand, interpret, or explain their behavior. This limitation hinders trust in machine learning systems and causes a general reluctance towards their adoption in practical applications, particularly in mission-critical domains like healthcare and autonomous driving. Explainable AI (XAI) techniques facilitate the explainability or interpretability of machine learning models, enabling users to discern the basis of a model's decisions and possibly avert undesirable behavior. This comprehensive survey details the advancements of explainable AI methods, from inherently interpretable models to modern approaches for achieving interpretability of various black-box models, including large language models (LLMs). Additionally, we review explainable AI techniques that leverage LLM and vision-language model (VLM) frameworks to automate or improve the explainability of other machine learning models. Using LLMs and VLMs as interpretability tools in particular enables high-level, semantically meaningful explanations of model decisions and behavior. Throughout the paper, we highlight the scientific principles, strengths, and weaknesses of state-of-the-art methods and outline different areas of improvement. Where appropriate, we also present qualitative and quantitative comparisons of the methods. Finally, we discuss the key challenges of XAI and directions for future research.
