Selection of functional predictors and smooth coefficient estimation for scalar-on-function regression models
Hedayat Fathi, Marzia A. Cremona, Federico Severino
TL;DR
This paper tackles variable selection in scalar-on-function regression with many functional predictors by introducing SOFIA, an adaptive Lasso framework that enforces regularity by constraining the functional coefficients to a Hilbert subspace $\mathbb{K}$ and solving a penalized least squares problem. A sieve-based finite-dimensional approximation on the eigenfunctions of a trace-class operator $K$ enables scalable optimization via functional subgradients, and the authors prove a functional oracle property under a Hilbert-space restricted eigenvalue condition, with convergence rates tied to the sieve dimension $m$. In extensive simulations and a real GDP-growth application, SOFIA shows strong recovery of the active variables and predictive performance competitive with or superior to existing methods, while providing smooth, interpretable coefficient estimates. The work combines variable selection, regularization in an RKHS-like setting, and theoretical guarantees, and it remains applicable to high-dimensional functional data.
Abstract
In the framework of scalar-on-function regression models, in which several functional variables are employed to predict a scalar response, we propose a methodology for selecting relevant functional predictors while simultaneously providing accurate smooth (or, more generally, regular) estimates of the functional coefficients. We suppose that the functional predictors belong to a real separable Hilbert space, while the functional coefficients belong to a specific subspace of this Hilbert space. Such a subspace can be a Reproducing Kernel Hilbert Space (RKHS) to ensure the desired regularity characteristics, such as smoothness or periodicity, for the coefficient estimates. Our procedure, called SOFIA (Scalar-On-Function Integrated Adaptive Lasso), is based on an adaptive penalized least squares algorithm that leverages functional subgradients to efficiently solve the minimization problem. We demonstrate that the proposed method satisfies the functional oracle property, even when the number of predictors exceeds the sample size. SOFIA's effectiveness in variable selection and coefficient estimation is evaluated through extensive simulation studies and a real-data application to GDP growth prediction.
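To make the idea of an adaptive penalized least squares fit concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each functional predictor has already been reduced to `m` basis scores, replaces the subspace norm with the plain Euclidean norm of those scores, and uses proximal gradient descent in place of functional subgradients. All names (`adaptive_group_lasso`, `group_soft_threshold`, `Z`, `lam`, `weights`) and the simulated data are hypothetical.

```python
import numpy as np

def group_soft_threshold(b, t):
    """Proximal operator of t * ||b||_2: shrink the whole block toward zero."""
    norm = np.linalg.norm(b)
    if norm <= t:
        return np.zeros_like(b)
    return (1.0 - t / norm) * b

def adaptive_group_lasso(Z, y, lam, weights, n_iter=500):
    """Proximal gradient for a weighted group-lasso problem (illustrative only).

    Z has shape (n, p, m): n samples, p predictors, m basis scores each.
    Minimizes (1/2n) * ||y - sum_j Z[:, j] @ B[j]||^2
              + lam * sum_j weights[j] * ||B[j]||_2.
    """
    n, p, m = Z.shape
    Zf = Z.reshape(n, p * m)
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / np.linalg.norm(Zf, 2) ** 2
    B = np.zeros((p, m))
    for _ in range(n_iter):
        resid = Zf @ B.ravel() - y
        grad = (Zf.T @ resid / n).reshape(p, m)
        V = B - step * grad
        B = np.stack([
            group_soft_threshold(V[j], step * lam * weights[j])
            for j in range(p)
        ])
    return B

# Simulated scores (an assumption, not data from the paper):
# 6 predictors, only the first two are active.
rng = np.random.default_rng(0)
n, p, m = 100, 6, 4
Z = rng.standard_normal((n, p, m))
B_true = np.zeros((p, m))
B_true[0] = 1.0
B_true[1] = -1.0
y = Z.reshape(n, p * m) @ B_true.ravel() + 0.1 * rng.standard_normal(n)

# Step 1: initial fit with uniform weights.
B_init = adaptive_group_lasso(Z, y, lam=0.1, weights=np.ones(p))
# Step 2: adaptive weights penalize predictors with small initial norm more heavily.
weights = 1.0 / (np.linalg.norm(B_init, axis=1) + 1e-8)
B_hat = adaptive_group_lasso(Z, y, lam=0.1, weights=weights)

# Predictors whose estimated block has norm zero are treated as not selected.
group_norms = np.linalg.norm(B_hat, axis=1)
print(np.round(group_norms, 3))
```

The two-stage fit mirrors the usual adaptive Lasso recipe: a first pass supplies data-driven weights, and the second pass uses them so that weak predictors are removed while strong ones are shrunk less. Obtaining smooth coefficient estimates, as the abstract describes, would additionally require a penalty norm that encodes regularity, which this sketch omits.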
