In Defence of Post-hoc Explainability

Nick Oh

TL;DR

This paper defends post-hoc explainability as a legitimate epistemic tool for scientific ML by reframing the bridge from the true phenomenon $f(X)$ to model estimates $h^*(X)$ through post-hoc explanations $p^*(X)$, which yield falsifiable hypotheses when empirically validated. It develops a framework of mediated understanding and bounded factivity, integrating holistic representationality with theory-ladenness and empirical validation, and argues for structured, task-specific interpretability methods that link model behaviour to real phenomena. The author illustrates the framework with a case study on XAI analysis of pancreatic tissue in Type 2 diabetes, showing how post-hoc explanations can generate novel scientific hypotheses and contribute to mechanistic understanding under empirical scrutiny. The paper acknowledges its limitations and outlines directions for operationalizing bounds, expanding empirical validation, and developing criteria to assess epistemic validity in scientific ML, with the aim of guiding robust, practice-aligned XAI in science.

Abstract

This position paper defends post-hoc explainability methods as legitimate tools for scientific knowledge production in machine learning. Addressing criticism of these methods' reliability and epistemic status, we develop a philosophical framework grounded in mediated understanding and bounded factivity. We argue that scientific insights can emerge through structured interpretation of model behaviour without requiring complete mechanistic transparency, provided explanations acknowledge their approximative nature and undergo rigorous empirical validation. Through analysis of recent biomedical ML applications, we demonstrate how post-hoc methods, when properly integrated into scientific practice, generate novel hypotheses and advance phenomenal understanding.

Paper Structure

This paper contains 26 sections and 2 figures.

Figures (2)

  • Figure 1: Comparison of intrinsic and post-hoc interpretability models.
  • Figure 2: Framework for Scientific Knowledge Generation through Post-hoc Methods. The diagram illustrates the cyclic process of generating scientific understanding from ML models through post-hoc interpretability methods. Each component shows both the theoretical principle (dark grey boxes) and its practical application in T2D research (light grey boxes). The process begins with translation of model behaviour, progresses through method selection and hypothesis generation, and culminates in empirical validation through instrumental and theoretical refinement. Green elements represent the key stages in the knowledge generation pipeline. Arrows indicate the flow and interactions between components, demonstrating how post-hoc methods mediate between model behaviour and scientific understanding.