Fast Linear Model Trees by PILOT

Jakob Raymaekers; Peter J. Rousseeuw; Tim Verdonck; Ruicong Yao

TL;DR

PILOT introduces a fast, pruning-free linear model tree that combines greedy tree growth with $L^2$ boosting and a per-node BIC-based model selection to fit leaf-linear models, maintaining CART-like time/space complexity. The method yields interpretable, additive piecewise-linear predictions and includes two truncation schemes to stabilize extrapolation, addressing weaknesses of previous linear-model trees. The authors prove universal consistency under additive models with a $O(1/\log n)$ rate and show faster convergence on truly linear data under spectral conditions, with empirical results across 20 datasets demonstrating competitive or superior performance to CART, M5, and FRIED. The work highlights PILOT’s potential as a scalable, explainable base learner for ensembles and real-world applications where linear structure is present.

Abstract

Linear model trees are regression trees that incorporate linear models in the leaf nodes. This preserves the intuitive interpretation of decision trees and at the same time enables them to better capture linear relationships, which is hard for standard decision trees. But most existing methods for fitting linear model trees are time consuming and therefore not scalable to large data sets. In addition, they are more prone to overfitting and extrapolation issues than standard regression trees. In this paper we introduce PILOT, a new algorithm for linear model trees that is fast, regularized, stable and interpretable. PILOT trains in a greedy fashion like classic regression trees, but incorporates an $L^2$ boosting approach and a model selection rule for fitting linear models in the nodes. The abbreviation PILOT stands for PIecewise Linear Organic Tree, where 'organic' refers to the fact that no pruning is carried out. PILOT has the same low time and space complexity as CART without its pruning. An empirical study indicates that PILOT tends to outperform standard decision trees and other linear model trees on a variety of data sets. Moreover, we prove its consistency in an additive model setting under weak assumptions. When the data is generated by a linear model, the convergence rate is polynomial.
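
The idea of a model selection rule applied in a node can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a generic Gaussian BIC of the form $n \log(\mathrm{RSS}/n) + k \log n$, only two candidate models (a constant and a single-predictor line), and parameter counts of 1 and 2; the names `bic`, `fit_constant`, `fit_linear` and `select_node_model` are hypothetical. The candidate set and degrees of freedom actually used by PILOT are not stated in this summary.

```python
import numpy as np

def bic(residuals, n_params):
    """Generic Gaussian BIC (an assumption): n*log(RSS/n) + k*log(n)."""
    n = residuals.size
    rss = float(np.sum(residuals ** 2))
    # Clamp RSS so a perfect fit does not produce log(0).
    return n * np.log(max(rss, 1e-12) / n) + n_params * np.log(n)

def fit_constant(x, r):
    """Constant model: predict the mean of the current residuals."""
    mean = float(np.mean(r))
    return (0.0, mean), r - mean

def fit_linear(x, r):
    """Least-squares line r ~ slope * x + intercept on one predictor."""
    slope, intercept = np.polyfit(x, r, deg=1)
    return (float(slope), float(intercept)), r - (slope * x + intercept)

def select_node_model(x, r):
    """Fit each candidate to the residuals r and keep the lowest-BIC one.

    Returns (model name, (slope, intercept), updated residuals). In a
    boosting-style scheme the updated residuals would be passed on to
    the next fitting step.
    """
    # (fit function, assumed number of parameters) per candidate
    candidates = {"con": (fit_constant, 1), "lin": (fit_linear, 2)}
    best_name, best_score, best_fit = None, np.inf, None
    for name, (fit, k) in candidates.items():
        coef, res = fit(x, r)
        score = bic(res, k)
        if score < best_score:
            best_name, best_score, best_fit = name, score, (coef, res)
    return best_name, best_fit[0], best_fit[1]

# Usage: a clear linear trend selects "lin"; a flat signal selects "con".
x = np.linspace(0.0, 1.0, 200)
wiggle = 0.1 * (-1.0) ** np.arange(200)  # deterministic stand-in for noise
name_lin, coef_lin, _ = select_node_model(x, 3.0 * x + 1.0 + wiggle)
name_con, coef_con, _ = select_node_model(x, 2.0 + wiggle)
```

The $\log n$ penalty means the extra slope parameter is only kept when it reduces the residual sum of squares enough to pay for itself, which is the regularizing effect a selection rule of this kind provides without a separate pruning pass.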

Paper Structure (29 sections, 13 theorems, 88 equations, 2 figures, 3 tables, 4 algorithms)

Introduction
Methodology
Main structure of PILOT
Models used in the nodes
Model selection rule
Truncation of predictions
Stopping rules versus pruning
Time and space complexity
Theoretical results
Universal consistency
Convergence rates on linear models
Empirical evaluation
Data sets and methods
Results
Results after transforming predictors
...and 14 more sections

Key Result

Proposition 1

PILOT has the same time and space complexities as CART without its pruning.

Figures (2)

Figure 1: An example of a PILOT tree.
Figure 2: Left: An example of the first truncation method in one node of the tree. Right: An example of the second truncation method.

Theorems & Definitions (26)

Proposition 1
proof
Theorem 2
Remark 3
Proposition 4
Theorem 5
Theorem 6
Corollary 7: Fast convergence on linear models
Proposition 8
proof
...and 16 more
