Data Science with LLMs and Interpretable Models
Sebastian Bordt, Ben Lengerich, Harsha Nori, Rich Caruana
TL;DR
This work combines large language models (LLMs) with interpretable Generalized Additive Models (GAMs) to make data science more transparent. The method trains a GAM, serializes each of its graphs as JSON text, and uses chain-of-thought prompting to elicit explanations from the LLM, which yields dataset- and model-level summaries and supports hypothesis generation. Experiments show that GPT-4 can reliably perform basic graph tasks, generate coherent qualitative descriptions, and detect anomalies. Its responses are grounded in many cases, but it remains vulnerable to hallucination under certain prompts. The authors provide an open-source LLM-GAM interface and discuss practical implications for domain experts, limitations, and directions for future work on grounding and on evaluation with more complex graphs.
Abstract
Recent years have seen important advances in the building of interpretable models, machine learning models that are designed to be easily understood by humans. In this work, we show that large language models (LLMs) are remarkably good at working with interpretable models, too. In particular, we show that LLMs can describe, interpret, and debug Generalized Additive Models (GAMs). Combining the flexibility of LLMs with the breadth of statistical patterns accurately described by GAMs enables dataset summarization, question answering, and model critique. LLMs can also improve the interaction between domain experts and interpretable models, and generate hypotheses about the underlying phenomenon. We release https://github.com/interpretml/TalkToEBM as an open-source LLM-GAM interface.
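The idea of handing a GAM graph to an LLM as text can be illustrated with a minimal sketch. The code below is not the TalkToEBM API. It assumes a piecewise-constant graph given by hypothetical bin edges and per-bin contributions, serializes it as JSON, and wraps it in a prompt that asks the model to reason step by step before answering. The function names (graph_to_json, build_cot_prompt), the JSON layout, the prompt wording, and the example values are all assumptions made for illustration.

```python
import json


def graph_to_json(feature_name, bin_edges, scores):
    """Serialize one piecewise-constant GAM graph as JSON text.

    bin_edges must contain one more entry than scores, so that
    scores[i] is the additive contribution on (bin_edges[i], bin_edges[i+1]).
    The JSON layout is an illustrative assumption, not a fixed format.
    """
    if len(bin_edges) != len(scores) + 1:
        raise ValueError("bin_edges must have exactly len(scores) + 1 entries")
    contributions = {
        f"({lo}, {hi})": round(score, 3)
        for lo, hi, score in zip(bin_edges[:-1], bin_edges[1:], scores)
    }
    return json.dumps(
        {"feature": feature_name, "contributions": contributions},
        indent=2,
    )


def build_cot_prompt(graph_json, question):
    """Wrap the serialized graph in a chain-of-thought style prompt."""
    return (
        "Below is one graph of a Generalized Additive Model, given as JSON. "
        "Keys are feature intervals and values are the additive contribution "
        "of that interval to the prediction.\n\n"
        f"{graph_json}\n\n"
        "First, describe the general shape of the graph step by step. "
        f"Then answer the question: {question}"
    )


# Hypothetical graph for a feature called "age" with three bins.
graph_text = graph_to_json("age", [20, 40, 60, 80], [-0.5, 0.1, 0.9])
prompt = build_cot_prompt(
    graph_text, "In which interval is the contribution largest?"
)
print(prompt)
```

The resulting string can be sent to any chat-style LLM. Keeping the serialization separate from the prompt template makes it easy to reuse the same graph text for different questions, such as description, anomaly detection, or hypothesis generation.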
