Inherently Interpretable Tree Ensemble Learning

Zebin Yang, Agus Sudjianto, Xiaoming Li, Aijun Zhang

TL;DR

When shallow decision trees are used as base learners, tree ensembles not only become inherently interpretable, because they admit an equivalent representation as generalized additive models, but also sometimes generalize better.

Abstract

Tree ensemble models such as random forests and gradient boosting machines are widely used in machine learning because of their excellent predictive performance. However, a high-performance ensemble consisting of a large number of decision trees lacks transparency and explainability. In this paper, we demonstrate that when shallow decision trees are used as base learners, ensemble learning algorithms not only become inherently interpretable, because they admit an equivalent representation as generalized additive models, but also sometimes achieve better generalization performance. First, we develop an interpretation algorithm that converts the tree ensemble into a functional ANOVA representation with inherent interpretability. Second, we propose two strategies to further enhance model interpretability: adding constraints in the model training stage, and post-hoc effect pruning. Experiments on simulated and real-world datasets show that the proposed methods offer a better trade-off between model interpretability and predictive performance than their benchmark counterparts.
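The equivalence between a shallow tree ensemble and an additive model can be illustrated in the simplest case. The sketch below is not the paper's interpretation algorithm: it assumes depth-1 trees (stumps) only, and the stump values, the sample points, and the helper names (`STUMPS`, `SAMPLE`, `additive_form`) are all hypothetical. Because each stump splits on a single feature, summing the stumps that share a feature gives one main effect per feature; centering each effect on a sample and moving the means into an intercept mimics the zero-mean convention of a functional ANOVA decomposition.

```python
from collections import defaultdict

# Hypothetical ensemble of depth-1 trees (stumps), each stored as
# (feature index, threshold, value if x[feature] <= threshold, value otherwise).
STUMPS = [
    (0, 0.5, -1.0, 1.0),
    (1, 0.3, 0.2, -0.4),
    (0, 0.8, 0.0, 0.5),
]

# Hypothetical sample used only to center the effects.
SAMPLE = [(0.1, 0.1), (0.6, 0.2), (0.9, 0.9), (0.4, 0.5)]


def stump_value(stump, x):
    """Output of a single stump at the point x."""
    feature, threshold, left, right = stump
    return left if x[feature] <= threshold else right


def ensemble_predict(stumps, x):
    """Raw ensemble prediction: the sum of all stump outputs."""
    return sum(stump_value(s, x) for s in stumps)


def additive_form(stumps, sample):
    """Regroup the stumps into an intercept plus centered main effects.

    Returns (intercept, effect, features), where effect(j, x) is the
    centered main effect of feature j evaluated at the point x.
    """
    groups = defaultdict(list)
    for s in stumps:
        groups[s[0]].append(s)  # stumps that split on the same feature

    # Mean of each uncentered effect over the sample.
    means = {
        j: sum(ensemble_predict(g, x) for x in sample) / len(sample)
        for j, g in groups.items()
    }
    intercept = sum(means.values())

    def effect(j, x):
        return ensemble_predict(groups[j], x) - means[j]

    return intercept, effect, sorted(groups)


intercept, effect, features = additive_form(STUMPS, SAMPLE)

# The additive form reproduces the ensemble prediction at every sample point.
for x in SAMPLE:
    additive = intercept + sum(effect(j, x) for j in features)
    assert abs(additive - ensemble_predict(STUMPS, x)) < 1e-9
```

With deeper trees, a single tree can involve several features, so the same regrouping would also produce interaction terms; that case is not covered by this sketch.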

Paper Structure

This paper contains 23 sections, 10 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: The number of selected effects and 5-fold cross-validation performance under different regularization strengths of Lasso for the Friedman dataset.
  • Figure 2: The effect and feature importance of the Friedman dataset.
  • Figure 3: The fitted results of XGB-2 vs. the ground truth of the Friedman dataset.
  • Figure 4: A demo of local explanation of the Friedman dataset.
  • Figure 5: The number of selected effects and 5-fold cross-validation performance under different regularization strengths of Lasso for the CreditSimu dataset.
  • ...and 6 more figures