Towards Explainable Artificial Intelligence (XAI): A Data Mining Perspective

Haoyi Xiong; Xuhong Li; Xiaofei Zhang; Jiamin Chen; Xinhao Sun; Yuchen Li; Zeyi Sun; Mengnan Du

TL;DR

This paper reframes explainable AI (XAI) through a data-mining lens, organizing XAI methods by three purposes (interpretations of models, influences of training data, and domain-oriented insights) and mapping them onto a four-stage data-mining workflow (data acquisition, preparation, modeling, results reporting). It provides a comprehensive taxonomy of techniques across modalities (images, text, tabular) and data artifacts (training data, logs, prototypes, activations), detailing concrete methods such as LIME, SHAP, influence functions, TracIn, ProtoPNet, TCAV, and counterfactuals. It also discusses data valuation and anomaly detection as pivotal lenses for understanding model decisions, and highlights societal and scientific applications of XAI, including fairness, ethics, accountability, and interdisciplinary discovery. The paper identifies key limitations (data quality, scaling, evaluation frameworks) and outlines future directions toward scalable, trustworthy, and human-centered AI grounded in data-centric explainability.

Abstract

Given the complexity and lack of transparency of deep neural networks (DNNs), extensive efforts have been made to make these systems more interpretable or to explain their behaviors in accessible terms. Unlike most reviews, which focus on algorithmic and model-centric perspectives, this work takes a "data-centric" view, examining how data collection, processing, and analysis contribute to explainable AI (XAI). We group existing work into three categories according to purpose: interpretations of deep models, referring to feature attributions and reasoning processes that correlate data points with model outputs; influences of training data, examining the impact of training data nuances, such as data valuation and sample anomalies, on decision-making processes; and insights of domain knowledge, discovering latent patterns and fostering new knowledge from data and models to advance social values and scientific discovery. Specifically, we distill XAI methodologies into data mining operations on training and testing data across modalities, such as images, text, and tabular data, as well as on training logs, checkpoints, models, and other DNN behavior descriptors. In this way, our study offers a comprehensive, data-centric examination of XAI through the lens of data mining methods and applications.

Paper Structure (54 sections, 8 figures, 5 tables)

Introduction
Interpretations: Feature Attributions and Reasoning Processes of Deep Models
Feature Attributions as Model Explanation
Perturbation-based Methods
Differentiation-based Methods
Activation/Attention-based Methods
Proxy Explainable Models
Reasoning Process as Model Explanation
Visualizing Intermediate Representations
Visualizing the Logic of Reasoning
Counterfactual Examples as Decision Rules
Prototypes as Decision Rules
Concept Activation Vectors and Derivatives
Summary and Discussion
Data Acquisition and Collection
...and 39 more sections

Figures (8)

Figure 1: Overview of Explainable AI as a Data Mining Approach for Interpretations, Influences and Insights
Figure 2: Taxonomy of research in Explainable Artificial Intelligence (XAI) from a Data Mining Perspective: Interpretation of Deep Models, Influences of Training Samples, and Insights of Domain Knowledge.
Figure 3: Visualization of Commonly-used Feature Attribution Methods with Vision and NLP Models: (a)-(d) the ViT-base model and derivatives fine-tuned for bird classification (Wah et al., 2011); (e) a BERT model fine-tuned on IMDb movie reviews (Maas et al., 2011).
Figure 4: An Example of Proxy Explainable Models with Global and Local Surrogates for Global and Local Interpretations
Figure 5: Visualizing feature importance and logic of reasoning with tree/forest-based surrogates
...and 3 more figures
