Variable Selection and Minimax Prediction in High-dimensional Functional Linear Model

Xingche Guo; Yehua Li; Tailen Hsing

TL;DR

The paper develops an RKHS-based functional elastic-net for high-dimensional functional linear models, where each predictor is an infinite-dimensional function. It proves existence and uniqueness of the estimator, derives non-asymptotic tail bounds for variable selection consistency under a functional irrepresentable condition, and shows that a post-selection refined estimator achieves the oracle minimax prediction rate even when the number of true predictors grows with the sample size. A scalable reduced-rank algorithm with a block-coordinate scheme is proposed and evaluated in simulations and on a real Human Connectome Project dataset, where 33 brain ROIs are consistently linked to fluid intelligence. The work advances theory and computation for ultra-high-dimensional functional regression, offering practical tools for variable selection and minimax-optimal prediction in infinite-dimensional settings.

Abstract

High-dimensional functional data have become increasingly prevalent in modern applications such as high-frequency financial data and neuroimaging data analysis. We investigate a class of high-dimensional linear regression models, where each predictor is a random element in an infinite-dimensional function space, and the number of functional predictors $p$ can potentially be ultra-high. Assuming that each of the unknown coefficient functions belongs to some reproducing kernel Hilbert space (RKHS), we regularize the fitting of the model by imposing a group elastic-net type of penalty on the RKHS norms of the coefficient functions. We show that our loss function is Gâteaux sub-differentiable, and our functional elastic-net estimator exists uniquely in the product RKHS. Under suitable sparsity assumptions and a functional version of the irrepresentable condition, we derive a non-asymptotic tail bound for variable selection consistency of our method. Allowing the number of true functional predictors $q$ to diverge with the sample size, we also show that a post-selection refined estimator can achieve the oracle minimax optimal prediction rate. The proposed methods are illustrated through simulation studies and a real-data application from the Human Connectome Project.
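
To make the idea of a group elastic-net penalty concrete, the following is a minimal finite-dimensional sketch, not the paper's RKHS estimator or its reduced-rank block-coordinate algorithm. It assumes each functional predictor has already been reduced to a small block of basis scores (`Z_groups`), replaces the RKHS norm with the Euclidean norm of each coefficient block, and solves the resulting problem by proximal gradient descent. The objective form, the function name `group_enet_prox_grad`, and the parameters `lam` and `alpha` are illustrative assumptions.

```python
import numpy as np

def group_enet_prox_grad(Z_groups, y, lam, alpha, n_iter=500):
    """Proximal gradient for an assumed group elastic-net objective:
    (1/(2n))||y - sum_j Z_j b_j||^2
      + lam * sum_j [alpha * ||b_j||_2 + (1 - alpha)/2 * ||b_j||_2^2].
    """
    n = y.shape[0]
    Z = np.hstack(Z_groups)
    sizes = [Zg.shape[1] for Zg in Z_groups]
    idx = np.cumsum([0] + sizes)
    # Step size 1/L, where L = ||Z||_2^2 / n bounds the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(Z, 2) ** 2)
    b = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        grad = Z.T @ (Z @ b - y) / n
        v = b - step * grad
        for j in range(len(sizes)):
            s = slice(idx[j], idx[j + 1])
            norm = np.linalg.norm(v[s])
            # Group soft-thresholding (sparsity), then ridge-type shrinkage.
            shrink = max(0.0, 1.0 - step * lam * alpha / norm) if norm > 0 else 0.0
            b[s] = shrink * v[s] / (1.0 + step * lam * (1.0 - alpha))
    return [b[idx[j]:idx[j + 1]] for j in range(len(sizes))]

# Synthetic example: only the first of five predictor blocks carries signal.
rng = np.random.default_rng(0)
n, p, k = 200, 5, 3
Z_groups = [rng.standard_normal((n, k)) for _ in range(p)]
y = Z_groups[0] @ np.full(k, 2.0) + 0.1 * rng.standard_normal(n)

coefs = group_enet_prox_grad(Z_groups, y, lam=0.1, alpha=0.9)
selected = [j for j, b in enumerate(coefs) if np.linalg.norm(b) > 0]
print("selected predictor blocks:", selected)
```

Whole blocks are set exactly to zero by the group soft-thresholding step, which is what makes this type of penalty usable for variable selection; the quadratic part only shrinks the surviving blocks.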

Paper Structure (24 sections, 28 theorems, 210 equations, 7 figures, 4 tables)

Introduction
Functional Elastic-Net Regression
Model Assumptions
Functional Elastic-Net Based on RKHS
Theoretical Results
Consistency property of variable selection
Oracle minimax optimal rate and a post-selection refined estimator
Implementation and Numerical Studies
Practical Implementation
Simulation Studies
Real Data Application
Summary
Technical Details
Karush-Kuhn-Tucker Conditions in Function Spaces
Partially Separable Covariance Structure
...and 9 more sections

Key Result

Proposition 1

Suppose that Condition C.ass:a1 holds. Then, for each $j=1,\ldots,p$, any minimizer $\widehat{f}_j$ of equ:mini1 must be in the space $\mathbb{M}_{nj}$.

Figures (7)

Figure 1: Simulation Scenario I: The ROC curves of fEnet and FLR-SCAD under the ultra-high-dimensional setting $(n, p, q) = (100, 200, 10)$. The ROC curves are obtained by varying $\lambda$ while holding the other hyperparameters at their optimal values.
Figure 1: Simulation Scenario II: The ROC curves of fEnet and FLR-SCAD under the ultra-high-dimensional setting. The ROC curves are obtained by varying $\lambda$ while holding the other hyperparameters at their optimal values.
Figure 2: Simulation Scenario I: The plots of FPR, FNR, and RER versus $\log_{10}(1-\alpha)$ for different values of $\theta$ under the ultra-high-dimensional setting with $\rho=0.75$.
Figure 2: Simulation Scenario II: The plots of FPR, FNR, and RER versus $\log_{10}(1-\alpha)$ for different values of $\theta$ under the ultra-high-dimensional setting.
Figure 3: The orthographic projections of a brain (light blue), where the 33 selected ROIs using the HCP data are marked in dark blue.
...and 2 more figures
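
The FPR and FNR reported in the figures above measure variable selection errors. As a rough sketch, assuming the standard definitions (false positives over the $p - q$ null predictors, false negatives over the $q$ true predictors), they could be computed as follows. The function name and the example index sets are hypothetical, and the paper's exact definitions may differ.

```python
def selection_error_rates(selected, true_support, p):
    """Return (FPR, FNR) for a selected index set against the true support.

    Assumed definitions: FPR = |selected minus true| / (p - q),
    FNR = |true minus selected| / q, with q = |true_support|.
    """
    selected, true_support = set(selected), set(true_support)
    q = len(true_support)
    false_pos = len(selected - true_support)
    false_neg = len(true_support - selected)
    fpr = false_pos / (p - q) if p > q else 0.0
    fnr = false_neg / q if q > 0 else 0.0
    return fpr, fnr

# Hypothetical example with p = 200 predictors, of which the first 10 are true.
fpr, fnr = selection_error_rates(selected=[0, 1, 2, 50], true_support=range(10), p=200)
print(fpr, fnr)
```

Sweeping a tuning parameter such as $\lambda$ and recording (FPR, 1 − FNR) at each value traces out an ROC curve of the kind shown in Figure 1.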

Theorems & Definitions (56)

Remark 1
Remark 2
Proposition 1
Proposition 2
Remark 3
Theorem 1
Remark 4
Theorem 2
Corollary 1
Remark 5
...and 46 more
