XplainLLM: A Knowledge-Augmented Dataset for Reliable Grounded Explanations in LLMs

Zichen Chen; Jianda Chen; Ambuj Singh; Misha Sra

TL;DR

XplainLLM tackles the challenge of transparent reasoning in LLMs by introducing a knowledge-augmented dataset and a retrieval-based grounding framework that anchors explanations in knowledge graphs and graph attention networks. The approach uses two model types (Llama-3-8B and RoBERTa-large) on CommonsenseQA to generate explanations, and introduces a debugger-score for multi-dimensional quality assessment. Empirical results show that XplainLLM improves grounded explanations and reduces hallucinations, with high human and automated scores and broad improvements across diverse LLMs. This resource enables researchers and practitioners to verify, trust, and debug LLM outputs in critical domains.

Abstract

Large Language Models (LLMs) have achieved remarkable success in natural language tasks, yet understanding their reasoning processes remains a significant challenge. We address this by introducing XplainLLM, a dataset with an accompanying explanation framework designed to enhance LLM transparency and reliability. Our dataset comprises 24,204 instances, each of which interprets an LLM's reasoning behavior using knowledge graphs (KGs) and graph attention networks (GATs), and it covers explanations for LLMs such as the decoder-only Llama-3 and the encoder-only RoBERTa. XplainLLM also features a framework for generating grounded explanations and debugger-scores for multidimensional quality analysis. Our explanations include why-choose and why-not-choose components, reason-elements, and debugger-scores that collectively illuminate the LLM's reasoning behavior. Our evaluations demonstrate XplainLLM's potential to reduce hallucinations and improve grounded explanation generation in LLMs. XplainLLM is a resource for researchers and practitioners to build trust in LLM outputs and verify their reliability.

Paper Structure (39 sections, 8 equations, 8 figures, 5 tables, 1 algorithm)

Introduction
Related Work
Interpretability in LLMs
Explanation Datasets
XplainLLM: Dataset, Explanation Framework and Debugger-Score
Task Definition and Collection Method
Graph-Based Reasoning Interpretation.
Controlled Explanation Generation.
Explanation Framework for Grounded Explanations
Embedding Calculation.
Similarity Computation and Retrieval.
Instance Selection and Explanation Generation.
Debugger-Score for Explanation Analysis
Dataset Overview and Preparation
Dataset Description
...and 24 more sections

Figures (8)

Figure 1: Overview of XplainLLM in LLM Reasoning Interpretation and Explanation Generation.
Figure 2: Explanation Framework for Grounded Explanation Generation in LLMs.
Figure 3: Evaluation by human experts, automated evaluator GPT-3.5 and GPT-4.
Figure 4: Human evaluation of explanations: Overall, CP, and IP. Note that the CP scores align closely with the overall scores.
Figure 5: Accuracy comparison of vanilla version and with XplainLLM version for different models.
...and 3 more figures
