Jacobian Aligned Random Forests

Sarwesh Rauniyar

TL;DR

JARF introduces a single global preconditioner, built from the expected Jacobian outer product (EJOP), a generalization of the expected gradient outer product (EGOP), that rotates and scales input features so that predictive directions line up with axis-aligned splits. Applying this preconditioner before a standard random forest or boosted-tree model lets it capture oblique decision boundaries with minimal changes to existing training pipelines. The approach is supported by theory linking the EJOP to CART impurity gains and by empirical results showing performance competitive with or superior to oblique forests across classification and regression benchmarks, with favorable training times. The work highlights supervised, gradient-based geometry as a robust, model-agnostic way to enhance tabular tree ensembles while preserving their speed and robustness.

Abstract

Axis-aligned decision trees are fast and stable but struggle on datasets with rotated or interaction-dependent decision boundaries, where informative splits require linear combinations of features rather than single-feature thresholds. Oblique forests address this with per-node hyperplane splits, but at added computational cost and implementation complexity. We propose a simple alternative: JARF, Jacobian-Aligned Random Forests. Concretely, we first fit an axis-aligned forest to estimate class probabilities or regression outputs, compute finite-difference gradients of these predictions with respect to each feature, aggregate them into an expected Jacobian outer product that generalizes the expected gradient outer product (EGOP), and use it as a single global linear preconditioner for all inputs. This supervised preconditioner applies a single global rotation of the feature space, then hands the transformed data back to a standard axis-aligned forest, preserving off-the-shelf training pipelines while capturing oblique boundaries and feature interactions that would otherwise require many axis-aligned splits to approximate. The same construction applies to any model that provides gradients, though we focus on random forests and gradient-boosted trees in this work. On tabular classification and regression benchmarks, this preconditioning consistently improves axis-aligned forests and often matches or surpasses oblique baselines while improving training time. Our experimental results and theoretical analysis together indicate that supervised preconditioning can recover much of the accuracy of oblique forests while retaining the simplicity and robustness of axis-aligned trees.
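The pipeline described in the abstract (predict, take finite-difference gradients, average their outer products, transform the inputs) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the central-difference scheme, the step size `eps`, and the use of `X @ H` as the transformed input are all assumptions made for illustration, and `toy_predict` is a hypothetical smooth stand-in for a fitted forest's class-probability output.

```python
import numpy as np

def finite_difference_jacobian(predict, X, eps=1e-2):
    """Central finite differences of predict(X) with respect to each feature.

    predict maps an (n, d) array to an (n, k) array of outputs.
    Returns an (n, k, d) array of per-sample Jacobians.
    """
    n, d = X.shape
    cols = []
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps  # perturb feature j only
        cols.append((predict(X + step) - predict(X - step)) / (2.0 * eps))
    return np.stack(cols, axis=-1)

def ejop(predict, X, eps=1e-2):
    """Sample average of J_i^T J_i, a (d, d) symmetric PSD matrix."""
    J = finite_difference_jacobian(predict, X, eps)
    return np.einsum("nkd,nke->de", J, J) / X.shape[0]

def toy_predict(X):
    # Hypothetical stand-in for a forest's predict_proba:
    # the decision boundary is the oblique line x0 + x1 = 0.
    z = X[:, 0] + X[:, 1]
    p = 1.0 / (1.0 + np.exp(-z))
    return np.column_stack([1.0 - p, p])

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))

H = ejop(toy_predict, X)

# The leading eigenvector of H points along the predictive direction.
eigvals, eigvecs = np.linalg.eigh(H)  # eigenvalues in ascending order
top = eigvecs[:, -1]

# Assumed form of the global linear preconditioner: one matrix for all inputs.
X_pre = X @ H
```

In this toy setup the predictions depend only on `x0 + x1`, so the estimated matrix is essentially rank one and its leading eigenvector lies along the diagonal direction, while the irrelevant third feature receives almost no weight. A real forest's predictions are piecewise constant, so the finite-difference step would need to be chosen with more care than the smooth stand-in suggests.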
