Scientific Inference With Interpretable Machine Learning: Analyzing Models to Learn About Real-World Phenomena

Timo Freiesleben; Gunnar König; Christoph Molnar; Alvaro Tejero-Cantero

TL;DR

The paper addresses the challenge of deriving scientifically meaningful inferences from predictive yet opaque machine learning models. It introduces holistic representationality (HR) and a four-step framework of property descriptors that map whole-model behavior to properties of the phenomenon, grounded in the conditional distribution $\mathbb{P}(Y|\boldsymbol{X})$ under i.i.d. data. It surveys existing IML methods (e.g., cFI, SAGE, PRIM, ICE, cSV/ICI, counterfactuals) as potential property descriptors and provides guidance on uncertainty quantification and practical estimation from finite data. It also clarifies the limits of causal inference from descriptors alone, emphasizing that causal conclusions require additional assumptions or interventional data, and outlines data-generation and data-collection considerations for robust scientific use. The work offers a principled pathway for scientists to leverage HR ML models for inference, along with a roadmap for future research and tool development to enable realistic data, uncertainty-aware descriptors, and integration with causality-focused methods.

Abstract

To learn about real-world phenomena, scientists have traditionally used models with clearly interpretable elements. However, modern machine learning (ML) models, while powerful predictors, lack this direct elementwise interpretability (e.g., neural network weights). Interpretable machine learning (IML) offers a solution by analyzing models holistically to derive interpretations. Yet current IML research is focused on auditing ML models rather than leveraging them for scientific inference. Our work bridges this gap, presenting a framework for designing IML methods, termed 'property descriptors', that illuminate not just the model, but also the phenomenon it represents. We demonstrate that property descriptors, grounded in statistical learning theory, can effectively reveal relevant properties of the joint probability distribution of the observational data. We identify existing IML methods suited for scientific inference and provide a guide for developing new descriptors with quantified epistemic uncertainty. Our framework empowers scientists to harness ML models for inference, and provides directions for future IML research to support scientific understanding.

Paper Structure (39 sections, 13 equations, 9 figures, 2 tables)

Introduction
Contributions
Roadmap
Terminology
Related work
Philosophy of science
Statistical modeling and machine learning
Causal inference using machine learning
Interpretable machine learning
The traditional approach to scientific inference requires model elements that meaningfully represent
Example associational ER model: simple linear regression
The elements of ML models do not meaningfully represent
Example ML associational model: artificial neural network (ANN)
But do ML model elements really not represent?
IML analyzes the model as a whole, but does it allow for scientific inference?
...and 24 more sections

Figures (9)

Figure 1: Model and phenomenon sustain an encoding-decoding relationship. The main elements of a traditional ER model are shown in encoding-decoding correspondence to the phenomenon elements they represent [stachowiak1973allgemeine, contessa2007scientific]. Phenomenon and model elements are illustrated with a simple example of two bodies in gravitational interaction and its classical, Newtonian mechanistic description. This physical example was chosen to illustrate the ER paradigm; we make no claim that our property descriptors presented later will achieve similar representational power.
Figure 2: ML models are generally not ER. Three input images synthesized to maximally activate a given unit in a neural network [olah2020zoom] illustrate how "concepts" as different as cat faces, fronts of cars, or cat legs all elicit strong responses, suggesting that neural network elements generally do not represent unique concepts [mu2020compositional, nguyen2016multifaceted].
Figure 3: Property descriptions distill phenomenon properties from HR models. Instead of explicitly encoding phenomenon properties as parameters, as ER models do, HR models (e.g., ML models) encode phenomenon properties in the whole model. We propose that these encoded properties can be read out with property descriptions external to the model. Property descriptors can take on the inferential role of parameters, such as the coefficients in linear models.
Figure 4: An epistemic foundation for scientific inference with IML. Steps 1 and 2 connect the property descriptors theoretically with an underlying estimand of scientific interest. Steps 3 and 4 show how to practically draw inferences and quantify their uncertainty. See text for symbol definitions.
Figure 5: (a) shows the cPDP estimate of $\mathbb{E}_{Y\,|\,X_p}[Y\,|\,X_p]$ via Eq. \ref{eq:approxDescriptor}. Note that the grade jittering strategy described in the text allows us to evaluate, e.g., $\hat{m}(x_p=3)$ even though we have no data for that value. (b) shows the histogram of grades in Portuguese in the original dataset [cortez2008using].
...and 4 more figures

Theorems & Definitions (1)

Definition 1
