Policy Trees for Prediction: Interpretable and Adaptive Model Selection for Machine Learning

Dimitris Bertsimas; Matthew Peroni

TL;DR

The paper introduces Optimal Predictive-Policy Trees (OP2Ts), a prescriptive, tree-based framework for adaptive model selection and ensembling that yields interpretable policies while optionally allowing rejection. By optimizing over a finite set of model treatments and leveraging rewards computed from model outputs, OP2Ts provide region-specific prescriptions that can outperform single models or standard meta-ensemble approaches, especially in non-realizable settings. The method supports both classification and regression, includes parameterized rejection, and is grounded in rigorous theory about informative policies and rejection dynamics. Empirically, OP2Ts demonstrate strong performance across benchmarks, synthetic examples, physics-informed tasks, NLP sentiment analysis, and a large medical dataset (MIMIC-IV), while maintaining interpretability and stability of the learned policies.

Abstract

As a multitude of capable machine learning (ML) models become widely available in forms such as open-source software and public APIs, central questions remain regarding their use in real-world applications, especially in high-stakes decision-making. Is there always one best model that should be used? When are the models likely to be error-prone? Should a black-box or interpretable model be used? In this work, we develop a prescriptive methodology to address these key questions, introducing a tree-based approach, Optimal Predictive-Policy Trees (OP2T), that yields interpretable policies for adaptively selecting a predictive model or ensemble, along with a parameterized option to reject making a prediction. We base our methods on learning globally optimized prescriptive trees. Our approach enables interpretable and adaptive model selection and rejection while only assuming access to model outputs. By learning policies over different feature spaces, including the model outputs, our approach works with both structured and unstructured datasets. We evaluate our approach on real-world datasets, including regression and classification tasks with both structured and unstructured data. We demonstrate that our approach provides strong performance against baseline methods while yielding insights that help answer critical questions about which models to use, and when.
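To make the idea of prescribing a model per region concrete, the following is a minimal sketch, not the authors' algorithm. It swaps the globally optimized prescriptive tree for a greedy depth-1 split on a single feature, and it models rejection as a fixed per-sample reward, a stand-in for the paper's rejection parameter, whose exact form is not given in this summary. All names (`fit_policy_stump`, `best_treatment`, `REJECT`) and the toy reward values are hypothetical.

```python
from typing import Optional, Sequence, Tuple

REJECT = -1  # hypothetical label for the "reject" treatment


def best_treatment(rewards: Sequence[Sequence[float]], idx: Sequence[int],
                   reject_reward: Optional[float]) -> Tuple[int, float]:
    """Return the treatment with the highest total reward on the samples in idx."""
    n_models = len(rewards[0])
    totals = [sum(rewards[i][m] for i in idx) for m in range(n_models)]
    best = max(range(n_models), key=lambda m: totals[m])
    # Assumption: rejecting earns a fixed reward per sample.
    if reject_reward is not None and reject_reward * len(idx) > totals[best]:
        return REJECT, reject_reward * len(idx)
    return best, totals[best]


def fit_policy_stump(x: Sequence[float], rewards: Sequence[Sequence[float]],
                     reject_reward: Optional[float] = None):
    """Greedy depth-1 policy: one threshold on x, one treatment per side.

    Returns (threshold, left_treatment, right_treatment); threshold is None
    when no split improves on a single global treatment.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    t_all, total_all = best_treatment(rewards, order, reject_reward)
    best = (total_all, None, t_all, t_all)
    for k in range(1, len(order)):
        lo, hi = x[order[k - 1]], x[order[k]]
        if lo == hi:
            continue  # cannot place a threshold between equal values
        lt, ltot = best_treatment(rewards, order[:k], reject_reward)
        rt, rtot = best_treatment(rewards, order[k:], reject_reward)
        if ltot + rtot > best[0]:
            best = (ltot + rtot, (lo + hi) / 2, lt, rt)
    return best[1], best[2], best[3]


# Toy rewards (rows: samples, columns: models); higher is better.
x = [0, 1, 2, 3, 4, 5]
rewards_a = [[0.9, 0.2], [0.8, 0.3], [0.9, 0.1],
             [0.2, 0.9], [0.1, 0.8], [0.3, 0.9]]
# Same left half, but both models do poorly for x > 2.
rewards_b = [[0.9, 0.2], [0.8, 0.3], [0.9, 0.1],
             [0.2, 0.3], [0.1, 0.2], [0.3, 0.1]]

policy_a = fit_policy_stump(x, rewards_a)
policy_b = fit_policy_stump(x, rewards_b, reject_reward=0.5)
```

In the first toy case the stump prescribes model 0 on the left of the split and model 1 on the right; in the second, where neither model earns much reward on the right, the rejection treatment is prescribed there instead. A full tree would recurse on each side and, in the paper's setting, be optimized globally rather than greedily.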

Paper Structure (31 sections, 4 theorems, 55 equations, 15 figures, 7 tables)

Introduction
Optimal Policy Trees
Adaptive Model Selection for Classification and Regression
An Illustrative Example
Adaptive Model Selection for Classification
Classification with Rejection
Adaptive Model Selection for Regression
Alternative Approaches
Theoretical Insights
Sufficient Conditions for Learning an Informative Policy
The Advantage of Prescription over Prediction
The Quality of OP2T Policies and the Impact of Rejection Learning
Experiments
Benchmarks
A Toy 1-D Example
...and 16 more sections

Key Result

Proposition 3

Let $\mathcal{Z} \subseteq \mathbb{R}^d$ be the feature space used to learn some policy function $\pi$, $\mathcal{D}$ some distribution over $\mathcal{X}$, a function $v: \mathcal{Z} \rightarrow 2^{\mathcal{X}}$ mapping elements of $\mathcal{Z}$ to disjoint subsets of the original feature space $\ma

Figures (15)

Figure 1: An illustrative example of the OP2T approach to adaptive model selection.
Figure 2: An example of two non-convex reward surfaces with a square dominated subspace.
Figure 3: A simple 1-D example of synthetic model rewards with different rejection thresholds, denoted by the red dashed horizontal lines. In this case, the feature space is $\mathcal{X} = [0,12]$ and the rejection parameters are $\alpha = 0.1$ and $\alpha = 0.3$.
Figure 4: OP2T models fit on the 1-D synthetic data described in Section \ref{sec:synthetic}. Tree (a) corresponds to no rejection, (b) to rejection with $\alpha = 0.1$, and (c) to rejection with $\alpha = 0.3$.
Figure 5: A simple 1-D example of synthetic model rewards demonstrating the potential difference between taking a prescriptive versus a predictive approach to model selection.
...and 10 more figures

Theorems & Definitions (7)

Definition 1
Definition 2
Proposition 3
Proposition 4
Definition 5
Proposition 6
Proposition 7
