Oblique Bayesian additive regression trees

Paul-Hieu V. Nguyen, Ryan Yee, Sameer K. Deshpande

TL;DR

This work develops an oblique version of BART that leverages a data-adaptive decision rule prior that recursively partitions the feature space along random hyperplanes. On synthetic and real-world benchmark datasets, it is competitive with -- and sometimes much better than -- axis-aligned BART and other tree ensemble methods.

Abstract

Current implementations of Bayesian Additive Regression Trees (BART) are based on axis-aligned decision rules that recursively partition the feature space using a single feature at a time. Several authors have demonstrated that oblique trees, whose decision rules are based on linear combinations of features, can sometimes yield better predictions than axis-aligned trees and exhibit excellent theoretical properties. We develop an oblique version of BART that leverages a data-adaptive decision rule prior that recursively partitions the feature space along random hyperplanes. Using several synthetic and real-world benchmark datasets, we systematically compared our oblique BART implementation to axis-aligned BART and other tree ensemble methods, finding that oblique BART was competitive with -- and sometimes much better than -- those methods.
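
To make the distinction between the two kinds of decision rule concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' implementation: the function names, the cutpoint of 0, and the way the direction `phi` is drawn (a normalized Gaussian vector) are all assumptions made for this example, and the uniform random direction in particular stands in for the paper's data-adaptive decision rule prior, which is not reproduced here.

```python
import numpy as np


def axis_aligned_rule(X, feature, cutpoint):
    """Send a point left when one chosen feature is below the cutpoint."""
    return X[:, feature] < cutpoint


def oblique_rule(X, phi, cutpoint):
    """Send a point left when the linear combination phi^T x is below the cutpoint."""
    return X @ phi < cutpoint


rng = np.random.default_rng(0)

# Five points in [-1, 1]^2, the domain used in the paper's Figure 1.
X = rng.uniform(-1.0, 1.0, size=(5, 2))

# Hypothetical random hyperplane direction (assumption: NOT the paper's prior).
phi = rng.normal(size=2)
phi /= np.linalg.norm(phi)

left_axis_aligned = axis_aligned_rule(X, feature=0, cutpoint=0.0)
left_oblique = oblique_rule(X, phi, cutpoint=0.0)

# An axis-aligned rule is the special case where phi is a standard basis vector.
e0 = np.array([1.0, 0.0])
same_as_axis_aligned = oblique_rule(X, e0, cutpoint=0.0)
```

In this sketch the only difference between the two rules is the vector that multiplies the features, which is why an oblique tree can represent any axis-aligned partition while also cutting along tilted boundaries.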

Paper Structure

This paper contains 15 sections, 6 equations, 6 figures, 7 tables.

Figures (6)

  • Figure 1: Example of step functions defined over $[-1,1]^{2}$ and their corresponding axis-aligned (a) and oblique (b) regression tree representations.
  • Figure 2: True function (a,d), axis-aligned BART estimate (b,e), and obliqueBART estimate (c,f).
  • Figure 3: Cartoon illustration of a grow and prune move with oblique, continuous decision rules.
  • Figure 4: Performance of axis-aligned BART (AA) and axis-aligned BART with random rotations relative to obliqueBART in terms of out-of-sample predictive error in the (a) rotated axes partition and (b) sinusoidal partition.
  • Figure 5: obliqueBART's SMSE (a) and accuracy (b) across all splits and datasets, compared to XGB, ERT, RF, and BART. Models with lower SMSEs and higher accuracies are preferred.
  • ...and 1 more figure