Highly Adaptive Principal Component Regression

Mingxun Wang; Alejandro Schuler; Mark van der Laan; Carlos García Meixide

TL;DR

The paper addresses the poor scalability of adaptive nonparametric regression with HAL and HAR in high dimensions. It introduces the Principal Component based Highly Adaptive Lasso and Ridge (PCHAL/PCHAR), a low-rank, outcome-blind kernel representation that preserves the HAL/HAR geometry by projecting onto the leading principal components of the HAL Gram matrix, yielding closed-form solutions and substantial computational gains. A spectral link to a discrete sine basis is established, which justifies truncating the principal components and provides a theoretical bound on the excess risk relative to the full model. Empirically, PCHAR closely matches HAR and generally outperforms HAL across a wide range of simulations, while PCHAL is competitive and sometimes improves results on irregular signals; real-data benchmarks confirm the methods' stability and practical appeal. The authors provide practical tuning and prediction strategies along with open-source implementations, making scalable, adaptive nonparametric regression more accessible in high-dimensional settings.

Abstract

The Highly Adaptive Lasso (HAL) is a nonparametric regression method that achieves almost dimension-free convergence rates under minimal smoothness assumptions, but its implementation can be computationally prohibitive in high dimensions because of the large basis matrix it requires. The Highly Adaptive Ridge (HAR) has been proposed as a scalable alternative. Building on both procedures, we introduce the Principal Component based Highly Adaptive Lasso (PCHAL) and the Principal Component based Highly Adaptive Ridge (PCHAR). These estimators constitute an outcome-blind dimension reduction that offers substantial gains in computational efficiency and matches the empirical performance of HAL and HAR. We also uncover a striking spectral link between the leading principal components of the HAL/HAR Gram operator and a discrete sinusoidal basis, revealing an explicit Fourier-type structure underlying the PC truncation.
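The idea of an outcome-blind, principal-component ridge fit on the HAL Gram matrix can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`hal_basis`, `pchar_fit`, `pchar_predict`), the use of the training points as knots, the absence of an intercept or centering, and the explicit construction of the basis matrix are all assumptions made here for illustration.

```python
import itertools
import numpy as np


def hal_basis(X, knots, max_order=None):
    """Zero-order HAL indicators prod_{c in s} 1{x_c >= knot_c}.

    One column per (coordinate subset s, knot) pair. Hypothetical helper,
    only practical for small d because the number of subsets grows fast.
    """
    d = X.shape[1]
    max_order = d if max_order is None else max_order
    blocks = []
    for order in range(1, max_order + 1):
        for s in itertools.combinations(range(d), order):
            cols = list(s)
            # ge[i, j] is True when X[i] >= knots[j] on every coordinate in s
            ge = (X[:, cols][:, None, :] >= knots[:, cols][None, :, :]).all(axis=2)
            blocks.append(ge.astype(float))
    return np.hstack(blocks)


def pchar_fit(X, y, n_components, lam, max_order=None):
    """Ridge regression restricted to the leading eigenvectors of H H^T."""
    H = hal_basis(X, X, max_order)          # knots = training points (assumption)
    K = H @ H.T                             # n x n Gram matrix; y is not used here
    evals, evecs = np.linalg.eigh(K)        # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:n_components]
    U, L = evecs[:, top], evals[top]
    # Closed form: alpha = U (Lambda + lam I)^{-1} U^T y
    alpha = U @ ((U.T @ y) / (L + lam))
    return {"X": X, "H": H, "alpha": alpha, "max_order": max_order}


def pchar_predict(model, X_new):
    """Predict via the cross-Gram matrix H_new H^T applied to alpha."""
    H_new = hal_basis(X_new, model["X"], model["max_order"])
    return H_new @ (model["H"].T @ model["alpha"])
```

In this sketch the eigendecomposition depends only on the covariates, which is what "outcome-blind" refers to, and keeping all `n` components reduces the fit to ordinary kernel ridge regression with the kernel $HH^\top$.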

Paper Structure (28 sections, 6 theorems, 147 equations, 1 figure, 4 tables)

This paper contains 28 sections, 6 theorems, 147 equations, 1 figure, 4 tables.

Introduction
Contribution.
Highly Adaptive Lasso and Ridge
The Highly Adaptive kernel trick
PCHAL and PCHAR
Tuning
Prediction
Eigenspace of the Matrix
Controlling interaction order and oracle risk
Profiling over the regularization parameter
Maximum degree of interactions order
Simulations
Comparison with ML regressors
Data-generating processes
HAL design and HAL-family estimators
...and 13 more sections

Key Result

Theorem 1

Let $f: [0,1]^d \to \mathbb{R}$ be a càdlàg function with bounded Hardy–Krause variation anchored at 0, i.e., $V(f) < \infty$. Then $f$ admits the representation

Figures (1)

Figure 1: First six eigenvectors of $HH^\top$ (black) when the observations are totally ordered, overlaid with their closed-form discrete sine eigenfunctions (red).
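The kind of comparison shown in Figure 1 can be reproduced numerically in the simplest setting. The sketch below assumes one covariate with `n` distinct sorted values and no intercept column, so that $H$ is the lower-triangular matrix of ones and $HH^\top$ has entries $\min(i,j)$. For that matrix the classical closed form is $v_k(i) \propto \sin\big((2k-1)\pi i/(2n+1)\big)$ with eigenvalue $1/\big(4\sin^2((2k-1)\pi/(2(2n+1)))\big)$. Whether this matches the paper's exact normalization is an assumption, and `sine_eigenpairs` is a hypothetical helper name.

```python
import numpy as np


def sine_eigenpairs(n, k):
    """Closed-form leading eigenpairs of the n x n matrix with entries min(i, j)."""
    i = np.arange(1, n + 1)
    theta = (2 * np.arange(1, k + 1) - 1) * np.pi / (2 * n + 1)
    vecs = np.sin(np.outer(i, theta))            # column k is a discrete sine wave
    vecs /= np.linalg.norm(vecs, axis=0)         # unit-norm columns
    vals = 1.0 / (4.0 * np.sin(theta / 2.0) ** 2)
    return vals, vecs


n, k = 200, 6
H = np.tril(np.ones((n, n)))       # H[i, j] = 1{x_i >= x_j} for sorted distinct x
G = H @ H.T                        # entries min(i, j) with 1-based indices

evals, evecs = np.linalg.eigh(G)   # ascending order
evals, evecs = evals[::-1][:k], evecs[:, ::-1][:, :k]

vals, vecs = sine_eigenpairs(n, k)
# Eigenvectors are defined up to sign, so align signs before comparing.
signs = np.sign(np.sum(evecs * vecs, axis=0))
max_vec_err = np.max(np.abs(evecs * signs - vecs))
max_val_rel_err = np.max(np.abs(evals - vals) / vals)
print(max_vec_err, max_val_rel_err)
```

Both printed discrepancies should be at the level of floating-point error, which is the one-dimensional analogue of the black and red curves coinciding in the figure.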

Theorems & Definitions (13)

Definition 1
Definition 2
Definition 3
Definition 4
Definition 5
Remark 1
Theorem 1: Sectional Representation of Càdlàg Functions
Theorem 2: PCHA closed forms
Theorem 3: Eigenstructure of the zero-order HAL Gram matrix
Definition 6
...and 3 more
