Scientific Inference With Interpretable Machine Learning: Analyzing Models to Learn About Real-World Phenomena
Timo Freiesleben, Gunnar König, Christoph Molnar, Alvaro Tejero-Cantero
TL;DR
The paper addresses the challenge of drawing scientifically meaningful inferences from machine learning models that predict well but are opaque. It introduces holistic representationality (HR) and a four-step framework for property descriptors, which map whole-model behavior to properties of the phenomenon; the framework is grounded in the conditional distribution $\mathbb{P}(Y|\boldsymbol{X})$ and assumes i.i.d. data. It surveys existing IML methods (e.g., cFI, SAGE, PRIM, ICE, cSV/ICI, counterfactuals) as potential property descriptors and provides guidance on uncertainty quantification and practical estimation from finite data. It also clarifies the limits of causal inference from descriptors alone, emphasizing that causal conclusions require additional assumptions or interventional data, and outlines data-generation and data-collection considerations for robust scientific use. The work offers scientists a principled pathway for using ML models that satisfy HR for inference, along with a roadmap for future research and tool development to enable realistic data, uncertainty-aware descriptors, and integration with causality-focused methods.
Abstract
To learn about real-world phenomena, scientists have traditionally used models with clearly interpretable elements. However, modern machine learning (ML) models, while powerful predictors, lack this direct elementwise interpretability (e.g., neural network weights). Interpretable machine learning (IML) offers a solution by analyzing models holistically to derive interpretations. Yet, current IML research is focused on auditing ML models rather than leveraging them for scientific inference. Our work bridges this gap, presenting a framework for designing IML methods, termed 'property descriptors', that illuminate not just the model but also the phenomenon it represents. We demonstrate that property descriptors, grounded in statistical learning theory, can effectively reveal relevant properties of the joint probability distribution of the observational data. We identify existing IML methods suited for scientific inference and provide a guide for developing new descriptors with quantified epistemic uncertainty. Our framework empowers scientists to harness ML models for inference, and provides directions for future IML research to support scientific understanding.
