Sparse semiparametric regression when predictors are mixture of functional and high-dimensional variables

Silvia Novo; Germán Aneiros; Philippe Vieu

TL;DR

A flexible model combining sparse linear ideas with semiparametrics is proposed, and a wide range of asymptotic results is provided, covering both the rates of convergence of the estimators and the asymptotic behaviour of the variable selection procedure.

Abstract

This paper addresses dimensionality reduction in a regression setting where the predictors are a mixture of a functional variable and a high-dimensional vector. A flexible model combining sparse linear ideas with semiparametrics is proposed. A wide range of asymptotic results is provided, covering both the rates of convergence of the estimators and the asymptotic behaviour of the variable selection procedure. Practical issues are analysed through finite-sample simulation experiments, and an application to Tecator's data illustrates the usefulness of the methodology.
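
For orientation, the kind of model described in the abstract can be sketched as follows. This is a schematic reading pieced together from symbols that appear elsewhere on this page ($\pmb{\beta}_0$, $\theta_0$, $m$, $p_n$, $\mathcal{Q}$), not a quotation of the paper's equations. The response $Y$, the scalar covariates $X_j$, the error $\varepsilon$, the loss $\mathcal{L}_n$ and the penalty $\mathcal{P}_{\lambda}$ are illustrative names assumed here.

$$
Y = \sum_{j=1}^{p_n} \beta_{0j} X_j + m\left(\left< \theta_0,\mathcal{X}\right>\right) + \varepsilon,
\qquad
\mathcal{Q}\left(\pmb{\beta},\theta\right) = \mathcal{L}_n\left(\pmb{\beta},\theta\right) + n\sum_{j=1}^{p_n} \mathcal{P}_{\lambda}\left(\left|\beta_j\right|\right).
$$

Under this reading, the linear part carries the high-dimensional vector, with sparsity meaning that most $\beta_{0j}$ are zero, while the functional predictor $\mathcal{X}$ enters only through its projection on the direction $\theta_0$ and the unknown link $m$. The criterion $\mathcal{Q}$ is shown as a generic least-squares loss plus a sparsity-inducing penalty; the exact loss and penalty used in the paper may differ.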

Paper Structure (27 sections, 18 theorems, 160 equations, 7 figures, 5 tables)

Introduction
The model
The penalized least-squares estimators
Some initial notation
The estimators
Asymptotic theory
Some additional notation
Assumptions
Results
Simulation study
Design
Results
Real data application
Tecator's data
Modelling, variable selection and prediction
...and 12 more sections

Key Result

Theorem 4.2

Assume that the assumptions (centred_error), (cover-C), (Theta) and (fun_spaces_0)-(null_par3) hold. Assume, in addition, that $p_n\rightarrow\infty$ as $n\rightarrow\infty$, $p_n=o\left(n^{1/2}\right)$ and Then, there exists a local minimizer $\left(\widehat{\pmb{\beta}}_0,\widehat{\theta}_0\right)$ of $\mathcal{Q}\left(\pmb{\beta},\theta\right)$ such that (Note that $v_n$ was defined in (Theta

Figures (7)

Figure 1: Sample of 200 curves generated from (\ref{X-sim}) (left panel) and functional direction $\theta_0$ (right panel). The right panel also displays the estimate $\widehat{\theta}_0$ of $\theta_0$ obtained from a particular sample in the scenario $(n,p_n,\rho,c)=(100,50,0,0.05)$.
Figure 2: Percentage of times that each non-zero coefficient of $\pmb{\beta}_0$ is not set to zero (left panel: $c=0.05$; right panel: $c=0.01$). We use grey for $\beta_{02}=1.5$, pink for $\beta_{05}=2$ and blue for $\beta_{01}=3$. Dark colours correspond to $n=100$ while light colours match $n=200$. Values $\rho=0$ and $\rho=0.5$ are considered.
Figure 3: Boxplots of the squared errors (\ref{error-m}) obtained from the proposed procedure for the several considered scenarios. Left panel: $c=0.05$; right panel: $c=0.01$.
Figure 4: Real and estimated values, from a particular sample in the scenario $(n,p_n,\rho,c)=(100,50,0,0.05)$, related to the semiparametric component, $m\left(\left< \theta_0,\cdot\right>\right)$, of the SSFPLSIM (\ref{mod_sim}). The curve in the right panel is the true $m$.
Figure 5: Sample of 100 absorbance curves $\mathcal{X}$ (left panel) together with their second derivatives $\mathcal{X}^{(2)}$ (right panel).
...and 2 more figures

Theorems & Definitions (23)

Remark 4.1
Theorem 4.2
Remark 4.3
Theorem 4.4: Model selection consistency
Theorem 4.5: Asymptotic normality
Remark 4.6
Theorem 4.7
Corollary 4.8
Corollary 4.9
Remark 4.10
...and 13 more
