Explingo: Explaining AI Predictions using Large Language Models
Alexandra Zytek, Sara Pido, Sarah Alnegheimish, Laure Berti-Equille, Kalyan Veeramachaneni
TL;DR
The paper addresses the challenge of turning explainable AI outputs into human-readable narratives by introducing Explingo, a two-component system with a Narrator that converts SHAP explanations into narratives and a Grader that automatically evaluates narrative quality across accuracy, completeness, fluency, and conciseness. Results show that LLMs can generate high-quality narratives when guided by a small set of hand-written and bootstrapped exemplars, with the Grader providing automated, scalable evaluation via a weighted score $G = \alpha_a A + \alpha_f F + \alpha_c C + \alpha_s S$. The work provides an open-source implementation within Pyreal and nine exemplar datasets to support tuning and evaluation, highlighting the trade-offs between exemplar quantity and narrative fidelity. This approach enables safer, more usable narrative explanations and lays the groundwork for interactive, natural-language ML explanations in real-world decision-making.
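The weighted score above can be illustrated with a minimal sketch. The names `MetricScores` and `total_grade`, the weight values, and the 0-to-1 score range in the usage lines are hypothetical and chosen only for illustration. Reading $C$ as completeness and $S$ as conciseness is also an assumption, since the summary does not define the symbols.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricScores:
    """Per-narrative metric values, named after the symbols in the formula."""
    A: float  # accuracy
    F: float  # fluency
    C: float  # assumed to be completeness
    S: float  # assumed to be conciseness


def total_grade(scores: MetricScores, weights: dict) -> float:
    """Compute G = alpha_a*A + alpha_f*F + alpha_c*C + alpha_s*S.

    `weights` maps the subscripts "a", "f", "c", "s" to their alpha values.
    """
    return (
        weights["a"] * scores.A
        + weights["f"] * scores.F
        + weights["c"] * scores.C
        + weights["s"] * scores.S
    )


# Hypothetical example: accuracy weighted twice as heavily as the other metrics.
example_weights = {"a": 2.0, "f": 1.0, "c": 1.0, "s": 1.0}
example_scores = MetricScores(A=1.0, F=0.5, C=1.0, S=0.0)
grade = total_grade(example_scores, example_weights)
```

Keeping the weights separate from the scores lets the same set of graded narratives be re-ranked under different priorities, for instance by raising the accuracy weight in domains where a misleading narrative is costly.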
Abstract
Explanations of machine learning (ML) model predictions generated by Explainable AI (XAI) techniques such as SHAP are essential for people using ML outputs for decision-making. We explore the potential of Large Language Models (LLMs) to transform these explanations into human-readable, narrative formats that align with natural communication. We address two key research questions: (1) Can LLMs reliably transform traditional explanations into high-quality narratives? and (2) How can we effectively evaluate the quality of narrative explanations? To answer these questions, we introduce Explingo, which consists of two LLM-based subsystems, a Narrator and a Grader. The Narrator takes in ML explanations and transforms them into natural-language descriptions. The Grader scores these narratives on a set of metrics including accuracy, completeness, fluency, and conciseness. Our experiments demonstrate that LLMs can generate high-quality narratives that achieve high scores across all metrics, particularly when guided by a small number of human-labeled and bootstrapped examples. We also identify areas that remain challenging, in particular effectively scoring narratives in complex domains. The findings from this work have been integrated into an open-source tool that makes narrative explanations available for further applications.
