Regression Trees Know Calculus
Nathan Wycoff
TL;DR
This work introduces a Tree-Based Gradient Estimator (TBGE) that extracts gradient information from regression trees with piecewise-constant leaves, enabling gradient-based interpretability and uncertainty quantification (UQ) for tree models. It defines local node gradients $\gamma_i$ and aggregates them across tree depth to obtain the gradient estimate $\tilde{\nabla} f(\mathbf{x})$, which supports global (TBAS) and local (TBIG) analyses via Monte Carlo and partition-based evaluations. The author proves consistency of the gradient and integro-differential estimators and demonstrates practical gains in predictive performance, dimension reduction, and interpretability on real and synthetic datasets, with applications to MNIST and high-dimensional mortality data. The approach makes gradient-based UQ and interpretability techniques developed for differentiable models available for regression trees, which are non-smooth but scalable.
Abstract
Regression trees have emerged as a preeminent tool for solving real-world regression problems due to their ability to deal with nonlinearities, interaction effects, and sharp discontinuities. In this article, we instead study regression trees applied to well-behaved, differentiable functions, and determine the relationship between node parameters and the local gradient of the function being approximated. We find a simple estimate of the gradient which can be computed efficiently using quantities exposed by popular tree learning libraries. This allows tools developed for differentiable models, such as neural networks and Gaussian processes, to be applied to tree-based models. To demonstrate this, we study measures of model sensitivity defined in terms of integrals of gradients and show how to compute them for regression trees using the proposed gradient estimates. Quantitative and qualitative numerical experiments show that gradients estimated by regression trees can improve predictive analysis, solve tasks in uncertainty quantification, and provide interpretation of model behavior.
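The idea that node parameters carry gradient information can be illustrated in the simplest possible setting: a single split on one-dimensional data. The sketch below is not the estimator proposed in this article. It is a finite-difference stand-in chosen for illustration, and the function names `fit_stump` and `stump_slope` are hypothetical. It fits a depth-1 regression tree by least squares and divides the difference between the two child means by the distance between the two child centroids.

```python
def fit_stump(xs, ys):
    """Fit a depth-1 regression tree on 1-D data by minimizing squared error.

    Returns (left mean, right mean, left centroid, right centroid).
    Assumes the x values are distinct.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(1, len(pairs)):
        left, right = pairs[:k], pairs[k:]
        mu_l = sum(y for _, y in left) / len(left)
        mu_r = sum(y for _, y in right) / len(right)
        sse = (sum((y - mu_l) ** 2 for _, y in left)
               + sum((y - mu_r) ** 2 for _, y in right))
        if best is None or sse < best[0]:
            cx_l = sum(x for x, _ in left) / len(left)
            cx_r = sum(x for x, _ in right) / len(right)
            best = (sse, mu_l, mu_r, cx_l, cx_r)
    return best[1:]


def stump_slope(xs, ys):
    """Illustrative slope estimate (an assumption, not the article's method):
    difference of child means over distance between child centroids."""
    mu_l, mu_r, cx_l, cx_r = fit_stump(xs, ys)
    return (mu_r - mu_l) / (cx_r - cx_l)


# Noise-free samples of the linear function f(x) = 3x + 1 on [0, 1].
xs = [i / 100 for i in range(101)]
ys = [3.0 * x + 1.0 for x in xs]
slope_estimate = stump_slope(xs, ys)
```

For a linear function the child means are the function evaluated at the child centroids, so the ratio recovers the slope up to floating-point error regardless of where the split falls. For nonlinear functions the same ratio gives only a coarse local average of the derivative over the two children.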
