Improving the Validity of Decision Trees as Explanations

Jiri Nemecek, Tomas Pevny, Jakub Marecek

TL;DR

This work addresses the validity of decision-tree explanations by focusing on leaf-level accuracy. It introduces a mixed-integer optimization framework that trains shallow trees to maximize the minimum leaf accuracy $A_L(T)$, yielding globally interpretable rules with improved fairness signals. A secondary step extends nonempty leaves with per-leaf models (notably XGBoost) to form a hybrid-tree whose accuracy approaches state-of-the-art methods while maintaining a global explanation. Empirically, leaf-accuracy gains of about 7 percentage points on tabular benchmarks are demonstrated, and statistical tests confirm significant improvements over CART, with hybrid-tree variants achieving competitive performance relative to XGBoost. This approach offers a principled, expandable path toward explanations that remain valid across subgroups while preserving practical predictive power on tabular data.
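To make the leaf-accuracy measure concrete, here is a minimal sketch of how the minimum per-leaf accuracy of a fitted tree could be computed from leaf assignments. The function name `leaf_accuracy` and the toy data are hypothetical and not taken from the paper; the sketch assumes that only leaves receiving at least one sample are scored.

```python
from collections import defaultdict


def leaf_accuracy(leaf_ids, y_true, y_pred):
    """Return (minimum per-leaf accuracy, dict of per-leaf accuracies).

    Only leaves that receive at least one sample appear in the result,
    so empty leaves are ignored (an assumption of this sketch).
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for leaf, truth, pred in zip(leaf_ids, y_true, y_pred):
        counts[leaf] += 1
        hits[leaf] += int(truth == pred)
    per_leaf = {leaf: hits[leaf] / counts[leaf] for leaf in counts}
    return min(per_leaf.values()), per_leaf


# Hypothetical toy data: two leaves, each predicting a single class.
leaf_ids = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

min_acc, per_leaf = leaf_accuracy(leaf_ids, y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(per_leaf)  # accuracy inside each leaf
print(min_acc)   # leaf accuracy of the whole tree (worst leaf)
print(overall)   # ordinary model accuracy, for comparison
```

In this toy case the overall accuracy hides the fact that one leaf is no better than a coin flip, which is the situation the leaf-accuracy measure is meant to expose. With scikit-learn, the leaf assignments could be obtained from `DecisionTreeClassifier.apply(X)`.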

Abstract

In classification and forecasting with tabular data, one often utilizes tree-based models. Those can be competitive with deep neural networks on tabular data and, under some conditions, explainable. The explainability depends on the depth of the tree and the accuracy in each leaf of the tree. We point out that decision trees containing leaves with unbalanced accuracy can provide misleading explanations. Low-accuracy leaves give less valid explanations, which could be interpreted as unfairness among subgroups utilizing these explanations. Here, we train a shallow tree with the objective of minimizing the maximum misclassification error across all leaf nodes. The shallow tree provides a global explanation, while the overall statistical performance of the shallow tree can become comparable to state-of-the-art methods (e.g., well-tuned XGBoost) by extending the leaves with further models.
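The objective stated in the abstract (minimizing the maximum misclassification error over leaves) and the one in the TL;DR (maximizing the minimum leaf accuracy) are two views of the same problem. A short derivation, using assumed notation that is not taken from the paper ($\mathcal{L}(T)$ for the nonempty leaves of tree $T$, $S_l$ for the samples routed to leaf $l$, and $\hat{y}_l$ for the class predicted in leaf $l$):

$$
A_l(T) = \frac{1}{|S_l|} \sum_{i \in S_l} \mathbb{1}\left[ y_i = \hat{y}_l \right],
\qquad
A_L(T) = \min_{l \in \mathcal{L}(T)} A_l(T),
$$

$$
\min_{T} \max_{l \in \mathcal{L}(T)} \bigl( 1 - A_l(T) \bigr)
= 1 - \max_{T} \min_{l \in \mathcal{L}(T)} A_l(T)
= 1 - \max_{T} A_L(T).
$$

Since the per-leaf error is $1 - A_l(T)$, the worst-case error is one minus the worst-case accuracy, so a tree minimizing the former also maximizes the latter.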

Paper Structure (30 sections, 2 equations, 21 figures, 11 tables)

Figures (21)

  • Figure 1: An example of the decision tree produced by the proposed model for the COMPAS dataset, cf. Figure \ref{fig:motiv}. The bold percentage shows the leaf accuracy in each leaf on out-of-sample data before applying the extending model. Below that, in regular font, we provide accuracy on training data.
  • Figure 2: A comparison of decision trees produced by CART and our method for the pol dataset. In each leaf, the bold and regular percentages show the leaf accuracy, before extending the leaf further, on the test and training data, respectively. Below the name of each model, we present the accuracy of the hybrid tree in bold and the accuracy of the shallow tree in regular font. The CART tree contains a leaf with a notably lower accuracy compared to the overall accuracy of the model. The explanation provided by this leaf is less valid. This makes the global explanation provided by the tree less fair. While model accuracies do not take this into account, the proposed measure of leaf accuracy does. The left and right trees have leaf accuracy on unseen data equal to 57.1% and 86.5%, respectively.
  • Figure 3: Performance on the COMPAS dataset: Mean statistical performance over 10 different train-test splits, evaluated in terms of model accuracy (horizontal axis) and leaf accuracy (vertical axis) for five variants of (hybrid) decision trees. The horizontal and vertical error bars are standard deviations across the 10 random runs. Notice that the proposed model has better interpretability compared to any standard decision tree and, once extended, accuracy comparable to a gradient-boosted tree.
  • Figure 4: The complete MIO formulation of training the shallow tree, maximizing the minimum accuracy across all leaf nodes, and constraining the number of samples per leaf node. The constraints (\ref{eq:orig1}--\ref{eq:origX}) are taken from the optimal decision trees of Bertsimas and Dunn (2017), and the remaining constraints in purple (\ref{eq:Qltsum}--\ref{eq:accequal2}) and \ref{eq:constrs}, together with a different objective function \ref{eq:objective}, are part of our extension. We use the notation $[n]$ to represent the set of integers $\{1, 2, 3, \ldots, n\}$. An overview table of the variables and parameters is in the Appendix (Table \ref{tab:formdesc}).
  • Figure 5: Results on out-of-sample data. The plot shows a significant increase in leaf accuracy when using our method, which improves the validity of the explanations provided. It also shows an increase in model accuracy when extending the models with XGBoost models in leaves. The results of the OCT model are included for comparison with the model we built upon.
  • ...and 16 more figures
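
The hybrid tree referred to in the captions above (a shallow tree whose leaves are extended with further models) can be illustrated with a minimal sketch. Everything here is hypothetical and not the authors' implementation: the class `HybridTree`, the routing function, and the per-leaf models are stand-ins, with plain Python callables taking the place of the XGBoost models mentioned in the text. The fallback to the shallow tree's label for a leaf without an extending model is also an assumption.

```python
class HybridTree:
    """Shallow tree for the explanation, optional per-leaf model for prediction."""

    def __init__(self, route, leaf_labels, leaf_models):
        self.route = route              # sample -> leaf id (the shallow tree)
        self.leaf_labels = leaf_labels  # leaf id -> class predicted by the shallow tree
        self.leaf_models = leaf_models  # leaf id -> callable extending that leaf

    def explain(self, x):
        # The global explanation is the leaf (decision path) the sample reaches.
        return self.route(x)

    def predict_shallow(self, x):
        # Prediction of the shallow tree alone, before any extension.
        return self.leaf_labels[self.route(x)]

    def predict(self, x):
        # Hybrid prediction: defer to the leaf's model when one exists,
        # otherwise fall back to the shallow label (assumption of this sketch).
        leaf = self.route(x)
        model = self.leaf_models.get(leaf)
        return model(x) if model is not None else self.leaf_labels[leaf]


# Hypothetical depth-1 tree splitting on the first feature.
route = lambda x: 0 if x[0] <= 0.5 else 1
leaf_labels = {0: 0, 1: 1}
# Only leaf 1 is extended; the stand-in model thresholds the second feature.
leaf_models = {1: lambda x: int(x[1] > 0.3)}

tree = HybridTree(route, leaf_labels, leaf_models)
sample = (0.8, 0.1)
print(tree.explain(sample), tree.predict_shallow(sample), tree.predict(sample))
```

The design point is that `explain` depends only on the shallow tree, so the global explanation is unchanged by whatever model is placed in a leaf, while `predict` can benefit from the stronger per-leaf model.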