Learning Performance Maximizing Ensembles with Explainability Guarantees

Vincent Pisztora; Jia Li

TL;DR

This work tackles the problem of achieving high predictive performance while preserving intrinsic explainability in high-stakes settings by partitioning observations between a glass-box model $g$ and a black-box model $b$ through an explainability-controlled ensemble (EEG). The authors introduce a ranking-based allocation $a'_q(v)$, governed by $q$ and a robust score $r(z)$ that blends sufficiency indicators with absolute prediction error in a logistic form, and prove several optimality properties (maximal sufficient performance, maximal sufficient explainable performance, conditional maximal absolute performance, monotone allocation). Empirically, EEG demonstrates strong and consistent gains across 31 tabular datasets, achieving a cross-dataset PPCR of $37\%$, average PQEOM of $74\%$, and 95TQM of $94\%$, with many cases where the ensemble matches or exceeds the best component models while maintaining explainability for a large fraction of observations. The results further show that using a comprehensive feature set for the allocator and exploring multiple component-model pairings yields robust allocations, with occasional substantial gains on specific datasets. Overall, EEG offers a principled, model-agnostic approach to balancing performance and explainability, with practical implications for deploying interpretable yet accurate systems in high-stakes domains.

Abstract

In this paper we propose a method for the optimal allocation of observations between an intrinsically explainable glass box model and a black box model. An optimal allocation is defined as one which, for any given explainability level (i.e., the proportion of observations for which the explainable model is the prediction function), maximizes the performance of the ensemble on the underlying task and, subject to that maximal ensemble performance condition, maximizes the performance of the explainable model on the observations allocated to it. The proposed method is shown to produce such explainability optimal allocations on a benchmark suite of tabular datasets across a variety of explainable and black box model types. These learned allocations are found to consistently maintain ensemble performance at very high explainability levels (explaining $74\%$ of observations on average), and in some cases even outperform both the component explainable and black box models while improving explainability.
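As a rough illustration of what an allocation at a fixed explainability level looks like, the following Python sketch sends the top-$q$ fraction of observations (ranked by a score) to the glass box and the remainder to the black box, then measures ensemble accuracy. The function names, the toy correctness vectors, and the oracle-style score are assumptions made for illustration; they are not the paper's learned allocator or its score $r(z)$.

```python
import numpy as np

def allocate_top_q(scores, q):
    """Boolean mask sending the top-q fraction (highest scores) to the glass box."""
    scores = np.asarray(scores)
    n = len(scores)
    k = int(round(q * n))                      # number of explained observations
    order = np.argsort(-scores, kind="stable") # highest score first, ties by index
    to_glass = np.zeros(n, dtype=bool)
    to_glass[order[:k]] = True
    return to_glass

def ensemble_accuracy(g_correct, b_correct, to_glass):
    """Accuracy when the glass box predicts where to_glass is True, else the black box."""
    g_correct = np.asarray(g_correct, dtype=bool)
    b_correct = np.asarray(b_correct, dtype=bool)
    return float(np.where(to_glass, g_correct, b_correct).mean())

# Toy per-observation correctness (hypothetical data, not from the paper).
g_correct = np.array([1, 1, 0, 1, 0, 1, 0, 1, 1, 0], dtype=bool)
b_correct = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1], dtype=bool)

# Assumed oracle-style score: +1 where only the glass box is correct,
# 0 where the models tie, -1 where only the black box is correct.
scores = g_correct.astype(int) - b_correct.astype(int)

acc_black_only = ensemble_accuracy(g_correct, b_correct, allocate_top_q(scores, 0.0))
acc_glass_only = ensemble_accuracy(g_correct, b_correct, allocate_top_q(scores, 1.0))
acc_q20 = ensemble_accuracy(g_correct, b_correct, allocate_top_q(scores, 0.2))
print(acc_black_only, acc_glass_only, acc_q20)
```

On this toy data the ensemble at $q = 0.2$ is more accurate than either component alone, which mirrors the complementary-expertise case the abstract describes. A learned allocator would have to estimate the score from features rather than from known correctness.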

Paper Structure (19 sections, 10 theorems, 6 equations, 4 figures, 7 tables)

Introduction
Methodology
Setting
Optimal Allocation
Experiments
Datasets
Models
Hyperparameter Tuning
Metrics
Results
Ablation Studies
Allocator Feature Set Selection
Ensemble Component Model Selection
Acknowledgements
Appendix
...and 4 more sections

Key Result

Proposition 1

(Maximal Sufficient Performance) $\forall q \in \mathcal{Q}, a^{\prime}_{q} \in A^{*}_{q}$ where $A^{*}_{q}=\{a^{*}_{q} \colon a^{*}_{q}=\arg\max_{a_{q}\in A_{q}} S(a_q)\}$
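The statement can be checked by brute force on a small example. The sketch below assumes that $S(a_q)$ is the fraction of observations whose assigned model is sufficient and that $A_q$ is the set of allocations sending exactly $k = qn$ observations to the glass box; both readings, along with all function names and the toy data, are assumptions for illustration rather than the paper's definitions. It compares a simple ranking rule against exhaustive enumeration for every $k$.

```python
from itertools import combinations

def sufficient_performance(g_suff, b_suff, glass_idx):
    """Fraction of observations whose assigned model is sufficient (assumed form of S)."""
    chosen = set(glass_idx)
    hits = sum(g_suff[i] if i in chosen else b_suff[i] for i in range(len(g_suff)))
    return hits / len(g_suff)

def ranked_allocation(g_suff, b_suff, k):
    """Send to the glass box the k observations with the largest g_suff - b_suff."""
    order = sorted(range(len(g_suff)), key=lambda i: b_suff[i] - g_suff[i])
    return order[:k]

def best_by_enumeration(g_suff, b_suff, k):
    """Maximum of S over every allocation with exactly k glass-box observations."""
    n = len(g_suff)
    return max(
        sufficient_performance(g_suff, b_suff, subset)
        for subset in combinations(range(n), k)
    )

# Hypothetical sufficiency indicators (1 = the model is sufficient on that observation).
g_suff = [1, 0, 1, 1, 0, 0, 1, 1]
b_suff = [1, 1, 0, 1, 0, 1, 1, 0]

n = len(g_suff)
ranked_scores = [
    sufficient_performance(g_suff, b_suff, ranked_allocation(g_suff, b_suff, k))
    for k in range(n + 1)
]
optimal_scores = [best_by_enumeration(g_suff, b_suff, k) for k in range(n + 1)]
all_match = ranked_scores == optimal_scores
print(all_match)
```

The ranking rule matches the enumerated optimum because, under the assumed form of $S$, moving an observation to the glass box changes the number of sufficient predictions by exactly `g_suff[i] - b_suff[i]`, so taking the $k$ largest differences is optimal for each fixed $k$.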

Figures (4)

Figure 1: This figure shows a two-class classification task in which the areas of expertise (the diamond pattern for the glass box and the spiral pattern for the black box model) are complementary. The glass box achieves a $92.7\%$ accuracy, the black box reaches $95.0\%$ accuracy, and the allocated ensemble of the two exceeds both with a $95.8\%$ accuracy. Thus, the resulting EEG allocation improves performance over both component models while also providing explainability (for $20\%$ of observations in this case).
Figure 2: This figure provides an intuition for how the proposed allocator $a^{\prime}_q(v)$ ranks observations. This ranking can be seen as ordering the sufficiency sets ($Z_b, Z_0, Z_2, Z_g$) step-by-step, with each step achieving a particular optimality condition (formalized in Propositions 1, 2, and 3). In the figure, individual observations are circles above the number line with size representing the magnitude of the relative advantage of black box over glass box, the left-hand color representing sufficiency of the glass box, and the right-hand color representing the sufficiency of the black box (green for sufficient, red for insufficient).
Figure 3: This figure shows two examples of the explainability (x-axis) vs sufficient performance (y-axis) trade-off, comparing the random (blue), oracle (orange), and learned (green) allocation curves. The PolR dataset is an example of complementary $g$ and $b$ models, resulting in an allocated ensemble that outperforms both component models across most of the $q$ range. The SuperconductR dataset is an example of an explainability "free lunch" in which the $b$ accuracy is maintained while increasing explainability using the allocator. Curves for all datasets are available in Section D of the Appendix.
Figure 4: This figure provides the sufficient accuracy values of the allocated EEG ensemble for each explainability level and every dataset.
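The step-by-step ordering described in the Figure 2 caption can be sketched as a lexicographic sort. The reading of the set names used here is an assumption inferred from the caption ($Z_g$: only the glass box sufficient, $Z_2$: both sufficient, $Z_0$: neither sufficient, $Z_b$: only the black box sufficient), as is the within-set tie-break by the black box's relative advantage. All names and values are hypothetical, and this is not the paper's score $r(z)$.

```python
def sufficiency_set(g_ok, b_ok):
    """Name of the sufficiency set under the assumed reading of the caption."""
    if g_ok and not b_ok:
        return "Z_g"   # only the glass box is sufficient
    if g_ok and b_ok:
        return "Z_2"   # both models are sufficient
    if not g_ok and not b_ok:
        return "Z_0"   # neither model is sufficient
    return "Z_b"       # only the black box is sufficient

# Assumed priority for glass-box allocation: Z_g first, Z_b last.
GLASS_PRIORITY = {"Z_g": 0, "Z_2": 1, "Z_0": 2, "Z_b": 3}

def rank_for_glass_box(g_ok, b_ok, advantage_b):
    """Indices ordered from most to least suitable for the glass box.

    Sorts by sufficiency set first, then (as an assumed tie-break) by a
    smaller black-box advantage, then by index for determinism.
    """
    keys = [
        (GLASS_PRIORITY[sufficiency_set(g, b)], adv, i)
        for i, (g, b, adv) in enumerate(zip(g_ok, b_ok, advantage_b))
    ]
    return [i for _, _, i in sorted(keys)]

# Hypothetical observations: sufficiency flags and black-box relative advantage.
g_ok = [True, True, False, False, True, False]
b_ok = [False, True, False, True, True, False]
advantage_b = [-0.5, 0.1, 0.2, 0.9, -0.1, 0.0]

ranking = rank_for_glass_box(g_ok, b_ok, advantage_b)
print(ranking)
```

Placing $Z_2$ ahead of $Z_0$ does not change ensemble performance under this reading, since the ensemble is sufficient or insufficient on those observations regardless of the assignment, but it does raise the glass box's performance on the observations it receives.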

Theorems & Definitions (18)

Proposition 1
Proposition 2
Proposition 3
Proposition 4
Definition 1
Definition 2
Lemma 1
proof
Lemma 2
proof
...and 8 more
