In Defence of Post-hoc Explainability
Nick Oh
TL;DR
This paper defends post-hoc explainability as a legitimate epistemic tool for scientific ML. It reframes the bridge from the true phenomenon $f(X)$ to the model estimate $h^*(X)$ as one mediated by post-hoc explanations $p^*(X)$, which can yield falsifiable hypotheses when empirically validated. It develops a framework of mediated understanding and bounded factivity, integrating holistic representationality with theory-ladenness and empirical validation, and argues for structured, task-specific interpretability methods that link model behaviour to real phenomena. The paper illustrates the framework with a case study on XAI analysis of pancreatic tissue in Type 2 diabetes, showing how post-hoc explanations can generate novel scientific hypotheses and contribute to mechanistic understanding under empirical scrutiny. It acknowledges limitations and outlines directions for operationalizing bounds, expanding empirical validation, and developing criteria to assess epistemic validity in scientific ML, with the aim of guiding robust, practice-aligned XAI in science.
Abstract
This position paper defends post-hoc explainability methods as legitimate tools for scientific knowledge production in machine learning. Addressing criticism of these methods' reliability and epistemic status, we develop a philosophical framework grounded in mediated understanding and bounded factivity. We argue that scientific insights can emerge through structured interpretation of model behaviour without requiring complete mechanistic transparency, provided explanations acknowledge their approximative nature and undergo rigorous empirical validation. Through analysis of recent biomedical ML applications, we demonstrate how post-hoc methods, when properly integrated into scientific practice, generate novel hypotheses and advance phenomenal understanding.
