Highly Adaptive Principal Component Regression
Mingxun Wang, Alejandro Schuler, Mark van der Laan, Carlos García Meixide
TL;DR
The paper addresses the poor scalability of adaptive nonparametric regression with HAL and HAR in high dimensions. It introduces the Principal Component based Highly Adaptive Lasso and Ridge (PCHAL/PCHAR), a low-rank, outcome-blind kernel representation that preserves the HAL/HAR geometry by projecting onto the leading principal components of the HAL Gram matrix, yielding closed-form solutions and substantial computational gains. A spectral link to a discrete sine basis is established, which justifies truncating the principal components and provides a theoretical bound on the excess risk relative to the full model. Empirically, PCHAR closely matches HAR and generally outperforms HAL across a wide range of simulations, while PCHAL is competitive and sometimes copes better with irregular signals; real-data benchmarks confirm the methods' stability and practical appeal. The authors provide practical tuning and prediction strategies, along with open-source implementations, making scalable, adaptive nonparametric regression more accessible in high-dimensional settings.
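The projection and closed-form ridge step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`hal_basis`, `fit_pchar`, `predict_pchar`), the zero-order indicator basis with knots at the observed points, the column centering, and the fixed choices of `n_components` and `lam` are all hypothetical choices made for the example.

```python
import itertools
import numpy as np

def hal_basis(X, knots, max_degree=None):
    """Zero-order HAL-style indicator basis (assumed form).

    One column per (knot, non-empty coordinate subset s):
    phi(x) = prod_{j in s} 1{x_j >= knot_j}.
    """
    d = X.shape[1]
    max_degree = d if max_degree is None else max_degree
    blocks = []
    for r in range(1, max_degree + 1):
        for s in itertools.combinations(range(d), r):
            s = list(s)
            Xs = X[:, s][:, None, :]      # shape (n, 1, r)
            Ks = knots[:, s][None, :, :]  # shape (1, m, r)
            blocks.append(np.all(Xs >= Ks, axis=2).astype(float))
    return np.hstack(blocks)

def fit_pchar(X, y, n_components, lam, max_degree=None):
    """Ridge regression on the leading principal components of the basis."""
    H = hal_basis(X, X, max_degree)
    mu = H.mean(axis=0)
    Hc = H - mu                           # centering is an assumption here
    K = Hc @ Hc.T                         # n x n Gram matrix; y is never used
    evals, evecs = np.linalg.eigh(K)      # ascending eigenvalues
    order = np.argsort(evals)[::-1][:n_components]
    evals, U = evals[order], evecs[:, order]
    intercept = y.mean()
    # PC scores are Z = U * sqrt(evals), so Z'Z is diagonal and the ridge
    # solution is available coordinate-wise in closed form.
    gamma = np.sqrt(evals) * (U.T @ (y - intercept)) / (evals + lam)
    # Map back to basis coefficients through the loadings Hc' U / sqrt(evals).
    beta = Hc.T @ ((U / np.sqrt(evals)) @ gamma)
    return {"knots": X, "mu": mu, "beta": beta,
            "intercept": intercept, "max_degree": max_degree}

def predict_pchar(model, X_new):
    """Evaluate the fitted model at new points."""
    H_new = hal_basis(X_new, model["knots"], model["max_degree"])
    return (H_new - model["mu"]) @ model["beta"] + model["intercept"]
```

Because the eigendecomposition depends only on the covariates, it can be computed once and reused across values of `lam` or across outcomes, which is the sense in which the reduction is outcome-blind. The full basis has n(2^d - 1) columns, so `max_degree` is included here as a hypothetical way to cap the interaction order in larger problems.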
Abstract
The Highly Adaptive Lasso (HAL) is a nonparametric regression method that achieves almost dimension-free convergence rates under minimal smoothness assumptions, but its implementation can be computationally prohibitive in high dimensions due to the large basis matrix it requires. The Highly Adaptive Ridge (HAR) has been proposed as a scalable alternative. Building on both procedures, we introduce the Principal Component based Highly Adaptive Lasso (PCHAL) and the Principal Component based Highly Adaptive Ridge (PCHAR). These estimators constitute an outcome-blind dimension reduction that offers substantial gains in computational efficiency while matching the empirical performance of HAL and HAR. We also uncover a striking spectral link between the leading principal components of the HAL/HAR Gram operator and a discrete sinusoidal basis, revealing an explicit Fourier-type structure underlying the PC truncation.
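The simplest setting in which a sinusoidal structure of this kind can be checked numerically is one dimension with n distinct sorted points and an uncentered zero-order indicator basis. There the Gram matrix has entries min(i, j), whose eigenvectors are known to be the discrete sines sin((2k-1)jπ/(2n+1)) with eigenvalues 1/(4 sin²((2k-1)π/(2(2n+1)))). The sketch below verifies this; the one-dimensional, uncentered setup and the helper names are assumptions made for illustration, and the paper's general statement may differ in its details.

```python
import numpy as np

def indicator_gram_1d(n):
    """Gram matrix of the 1-D indicator basis at n sorted, distinct points.

    H[a, i] = 1{x_a >= x_i} is lower triangular, so (H H')[a, b] = min(a, b)
    with 1-based indices.
    """
    H = np.tril(np.ones((n, n)))
    return H @ H.T

def sine_eigenpair(n, k):
    """k-th leading eigenpair (k = 1, 2, ...) of the min(i, j) matrix."""
    j = np.arange(1, n + 1)
    theta = (2 * k - 1) * np.pi / (2 * n + 1)
    vec = np.sin(j * theta)
    vec /= np.linalg.norm(vec)
    val = 1.0 / (4.0 * np.sin(theta / 2.0) ** 2)
    return val, vec

def max_sine_mismatch(n, n_leading):
    """Largest deviation between numerical and closed-form eigenpairs."""
    K = indicator_gram_1d(n)
    evals, evecs = np.linalg.eigh(K)          # ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]
    worst = 0.0
    for k in range(1, n_leading + 1):
        val, vec = sine_eigenpair(n, k)
        rel_val_err = abs(evals[k - 1] - val) / val
        # Eigenvectors are defined up to sign, so compare via |inner product|.
        align_err = 1.0 - abs(evecs[:, k - 1] @ vec)
        worst = max(worst, rel_val_err, align_err)
    return worst
```

The eigenvalues decay roughly like 1/(2k-1)², so the leading components correspond to the lowest-frequency sines, which gives some intuition for why discarding the trailing components can cost little.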
