XplainLLM: A Knowledge-Augmented Dataset for Reliable Grounded Explanations in LLMs
Zichen Chen, Jianda Chen, Ambuj Singh, Misha Sra
TL;DR
XplainLLM tackles the challenge of transparent reasoning in LLMs by introducing a knowledge-augmented dataset and a retrieval-based grounding framework that anchors explanations in knowledge graphs and graph attention networks. The approach generates explanations for two model types (Llama-3-8B and RoBERTa-large) on CommonsenseQA, and introduces debugger-scores for multi-dimensional quality assessment. Empirical results indicate that XplainLLM improves grounded explanation generation and reduces hallucinations, with explanations rated highly by both human and automated evaluation and with improvements observed across diverse LLMs. This resource helps researchers and practitioners verify, trust, and debug LLM outputs in critical domains.
Abstract
Large Language Models (LLMs) have achieved remarkable success in natural language tasks, yet understanding their reasoning processes remains a significant challenge. We address this by introducing XplainLLM, a dataset and accompanying explanation framework designed to enhance LLM transparency and reliability. Our dataset comprises 24,204 instances, each of which interprets an LLM's reasoning behavior using knowledge graphs (KGs) and graph attention networks (GATs). It covers both the decoder-only Llama-3 and the encoder-only RoBERTa. XplainLLM also provides a framework for generating grounded explanations, along with debugger-scores for multidimensional quality analysis. Each explanation includes why-choose and why-not-choose components, reason-elements, and debugger-scores, which together illuminate the LLM's reasoning behavior. Our evaluations demonstrate XplainLLM's potential to reduce hallucinations and improve grounded explanation generation in LLMs. XplainLLM is a resource for researchers and practitioners to build trust in LLM outputs and verify their reliability.
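
To make the parts of an explanation concrete, the following is a minimal sketch of how one instance might be represented in Python. The component names (why-choose, why-not-choose, reason-elements, debugger-scores) come from the abstract; the class name, field names, types, helper methods, and the sample values are assumptions made for illustration and do not reflect the dataset's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ExplanationInstance:
    """Hypothetical container for one explained prediction (not the real schema)."""

    question: str                 # assumed field: the multiple-choice question text
    model_name: str               # assumed field: e.g. "Llama-3-8B" or "RoBERTa-large"
    predicted_answer: str         # assumed field: the answer the model selected
    why_choose: str               # explanation of why the model chose its answer
    why_not_choose: str           # explanation of why the other answers were rejected
    reason_elements: List[str] = field(default_factory=list)        # key concepts behind the decision
    debugger_scores: Dict[str, float] = field(default_factory=dict)  # one score per quality dimension

    def is_complete(self) -> bool:
        """True when all three textual explanation components are present."""
        return bool(self.why_choose and self.why_not_choose and self.reason_elements)

    def mean_debugger_score(self) -> Optional[float]:
        """Average over all score dimensions, or None if no scores were recorded."""
        if not self.debugger_scores:
            return None
        return sum(self.debugger_scores.values()) / len(self.debugger_scores)


# Illustrative values only; this example is invented and not taken from the dataset.
example = ExplanationInstance(
    question="Where would you put a book you have finished reading?",
    model_name="Llama-3-8B",
    predicted_answer="shelf",
    why_choose="The model links 'book' and 'finished' to storage on a shelf.",
    why_not_choose="The other options are not typical places to store a book.",
    reason_elements=["book", "finished", "shelf"],
    debugger_scores={"dimension_a": 0.5, "dimension_b": 1.0},  # placeholder dimension names
)
print(example.is_complete(), example.mean_debugger_score())
```

Keeping the debugger-scores in a dictionary keyed by dimension name is one possible design choice here: it lets the set of quality dimensions vary without changing the class.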
