Bayesian Additive Regression Trees for functional ANOVA model

Seokhun Park, Insung Kong, Yongdai Kim

Abstract

Bayesian Additive Regression Trees (BART) is a powerful statistical model that leverages the strengths of Bayesian inference and regression trees. It has received significant attention for capturing complex non-linear relationships and interactions among predictors. However, the accuracy of BART often comes at the cost of interpretability. To address this limitation, we propose ANOVA Bayesian Additive Regression Trees (ANOVA-BART), a novel extension of BART based on the functional ANOVA decomposition, which decomposes the variability of a function into interactions, each representing the contribution of a different set of covariates or factors. ANOVA-BART enhances interpretability, preserves and extends the theoretical guarantees of BART, and achieves comparable prediction performance. Specifically, we establish that the posterior concentration rate of ANOVA-BART is nearly minimax optimal, and we further show that each interaction attains the same convergence rate, a guarantee that is not available for BART. Moreover, comprehensive experiments confirm that ANOVA-BART is comparable to BART in both accuracy and uncertainty quantification, while also demonstrating its effectiveness in component selection. These results suggest that ANOVA-BART offers a compelling alternative to BART by balancing predictive accuracy, interpretability, and theoretical consistency.

Paper Structure

This paper contains 81 sections, 11 theorems, 189 equations, 7 figures, 6 tables, 1 algorithm.

Key Result

Theorem 2.1

Any real-valued function $f$ defined on $\mathbb{R}^{p}$ can be uniquely decomposed as $f(\mathbf{x}) = \sum_{S \subseteq [p]} f_{S}(\mathbf{x}_{S})$ almost everywhere with respect to $\mu^{\text{ind}}$, under the constraint that each interaction $f_{S}$ satisfies the $\mu$-identifiability condition.
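
To make the statement concrete, the following is a minimal sketch of a functional ANOVA decomposition for a function of two covariates evaluated on a finite grid. It is an illustration under stated assumptions, not the authors' implementation: the measure is taken to be the uniform product measure on the grid, the identifiability condition is assumed to take its usual form (each component averages to zero over any one of its own coordinates), and the function `anova_decompose` and the example function are hypothetical.

```python
# Functional ANOVA decomposition of f(x1, x2) on a finite grid.
# Assumption: mu is the uniform product measure on the grid, and
# identifiability means each component averages to zero over each
# of its own coordinates.

def anova_decompose(table):
    """table[i][j] = f(x1_i, x2_j). Returns (f0, f1, f2, f12)."""
    n1, n2 = len(table), len(table[0])
    # Constant term: overall mean of f under the product measure.
    f0 = sum(sum(row) for row in table) / (n1 * n2)
    # Main effects: average out the other coordinate, then centre.
    f1 = [sum(table[i]) / n2 - f0 for i in range(n1)]
    f2 = [sum(table[i][j] for i in range(n1)) / n1 - f0 for j in range(n2)]
    # Second-order interaction: whatever the lower-order terms leave over.
    f12 = [[table[i][j] - f0 - f1[i] - f2[j] for j in range(n2)]
           for i in range(n1)]
    return f0, f1, f2, f12


# Hypothetical example function: f(a, b) = a + 2b + ab.
x1_grid = [0.0, 1.0, 2.0]
x2_grid = [0.0, 1.0]
table = [[a + 2.0 * b + a * b for b in x2_grid] for a in x1_grid]

f0, f1, f2, f12 = anova_decompose(table)

# The components add back up to f at every grid point.
reconstructed = [[f0 + f1[i] + f2[j] + f12[i][j]
                  for j in range(len(x2_grid))]
                 for i in range(len(x1_grid))]
```

In this sketch the main effects `f1` and `f2` average to zero over their grids, and `f12` averages to zero along each row and each column, which is what pins the decomposition down uniquely on the grid.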

Figures (7)

  • Figure 1: Binary-product trees for $|S|$ equal to 1, 2, and 3, respectively. Nodes at the same depth share the same split rule, and an observation is assigned to the left child node whenever the rule is satisfied (otherwise, it goes to the right child).
  • Figure 2: Importance scores of the estimated components by ANOVA-BART for $p=10,50$ and $100$. The importance scores are normalized by dividing each score by the maximum importance score.
  • Figure 3: The functional relations of the two estimated components by ANOVA-BART on Boston data.
  • Figure C.4: Examples of binary-product tree and multinary-product tree. The left panel is a binary-product tree, while the right panel illustrates the partition and cell heights of a multinary-product tree with $\phi_{1}=5$ and $\phi_{2}=3$.
  • Figure E.5: Example of decomposing the partition of $f$ into the partitions of $f_{1}^{(1)}$ and $f_{2}^{(1)}$.
  • ...and 2 more figures

Theorems & Definitions (16)

  • Theorem 2.1
  • Theorem 5.1: Posterior concentration rate of ANOVA-BART
  • Theorem 5.2: Posterior concentration rate of each component
  • Theorem B.1
  • Lemma B.2
  • proof
  • Theorem C.1
  • proof
  • Theorem C.2
  • Lemma C.3
  • ...and 6 more