ML Interpretability: Simple Isn't Easy

Tim Räz

TL;DR

The paper argues that ML interpretability is not a single, monolithic notion but a graded property that depends on how a predictor f is represented and understood. It reframes interpretability as functional interpretability, that is, understanding the input-output behavior of f, and analyzes four interpretable model families (linear models, CART, MARS, GAMs) to show how interpretability arises and how it changes as models become more general. It identifies four dimensions that influence interpretability and distinguishes two paradigms (linear and tree-based), with MARS as a middle ground between them and GAMs illustrating additive nonparametric components. The result is a framework for explaining predictor functions globally, with implications for xAI and black-box models: it suggests combining formal representations with visualization and outlines directions for extending the analysis to other ML paradigms.

Abstract

The interpretability of ML models is important, but it is not clear what it amounts to. So far, most philosophers have discussed the lack of interpretability of black-box models such as neural networks, and methods such as explainable AI that aim to make these models more transparent. The goal of this paper is to clarify the nature of interpretability by focussing on the other end of the 'interpretability spectrum'. I examine why some models, namely linear models and decision trees, are highly interpretable, and how more general models, namely MARS and GAM, retain some degree of interpretability. I find that while there is heterogeneity in how we gain interpretability, what interpretability is in particular cases can be explicated in a clear manner.

Paper Structure (19 sections, 5 equations, 4 figures)


Figures (4)

  • Figure 1: Linear model with inputs $X_1, X_2$ and output $Y$; the red dots are the data points to be approximated by $Y = f(X_1, X_2)$. From Hastie et al. (2009), © by Hastie, Tibshirani & Friedman.
  • Figure 2: CART: A binary regression tree in two variables (left) and the corresponding regression function (right). From Hastie et al. (2009), © by Hastie, Tibshirani & Friedman.
  • Figure 3: Pair of basis functions (ReLUs) for MARS. From Hastie et al. (2009), © by Hastie, Tibshirani & Friedman.
  • Figure 4: A prediction function resulting from the MARS procedure; the function is $h(X_1, X_2) = (X_1-x_{51})_+\cdot(x_{72}-X_2)_+$; $x_{51}$ and $x_{72}$ are data points. From Hastie et al. (2009), © by Hastie, Tibshirani & Friedman.
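
The MARS basis functions described in Figures 3 and 4 can be illustrated with a minimal Python sketch. It is not the paper's code or the MARS fitting procedure: the function names `hinge_pair` and `product_term` are hypothetical, and the knot values stand in for the data points $x_{51}$ and $x_{72}$, whose actual values are not given here.

```python
def hinge_pair(x, knot):
    """Return the reflected pair ((x - knot)_+, (knot - x)_+) of Figure 3."""
    return max(x - knot, 0.0), max(knot - x, 0.0)


def product_term(x1, x2, knot1, knot2):
    """Evaluate h(X1, X2) = (X1 - knot1)_+ * (knot2 - X2)_+ as in Figure 4."""
    return max(x1 - knot1, 0.0) * max(knot2 - x2, 0.0)


# Hypothetical knots standing in for the data points x_51 and x_72.
KNOT_1 = 0.5
KNOT_2 = 1.5

if __name__ == "__main__":
    # Exactly one member of the pair is nonzero on each side of the knot.
    print(hinge_pair(0.75, KNOT_1))
    print(hinge_pair(0.25, KNOT_1))
    # The product is nonzero only where X1 > knot1 and X2 < knot2.
    print(product_term(1.0, 1.0, KNOT_1, KNOT_2))
    print(product_term(0.25, 1.0, KNOT_1, KNOT_2))
```

Each hinge is zero on one side of its knot and linear on the other, so the product term is active only in one quadrant of the input space. This locality is one reason such a term can still be read off from its formula.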