High-dimensional inference for single-index model with latent factors
Yanmei Shi, Meiling Hao, Yanlin Tang, Heng Lian, Xu Guo
TL;DR
This work tackles high-dimensional regression with latent factors by introducing the Factor Augmented Sparse Single Index Model (FASIM), which captures nonlinear covariate–response relationships while accounting for latent structure. It develops a fast factor-adequacy test (FAST) based on a score-type statistic and a Gaussian multiplier bootstrap that avoids estimating high-dimensional coefficients or precision matrices. When the factor model is deemed adequate, the paper proposes regularized estimation followed by a debiased inference procedure that yields valid coefficient-wise confidence intervals without imposing moment conditions on the error term. The approach is robust to heavy-tailed errors and outliers, and it shows strong finite-sample performance in simulations and in a real-data macroeconomic analysis (FRED-MD). Overall, the framework provides scalable, robust tools for inference in high-dimensional factor-augmented settings, with theoretical guarantees for both testing and estimation.
Abstract
Models with latent factors have recently attracted considerable attention. However, most investigations focus on linear regression models and thus cannot capture nonlinearity. To address this issue, we propose a novel Factor Augmented Single-Index Model. We first address whether the augmented part is necessary by introducing a score-type test statistic. Unlike previous test statistics, the proposed statistic requires estimating neither the high-dimensional regression coefficients nor the high-dimensional precision matrix, making it simpler to implement. We also propose a Gaussian multiplier bootstrap to determine the critical value. The validity of our procedure is theoretically established under suitable conditions. We further investigate penalized estimation of the regression model. With estimated latent factors, we establish error bounds for the estimators. Lastly, we introduce a debiased estimator and construct confidence intervals for individual coefficients based on its asymptotic normality. Our proposal imposes no moment condition on the error term, so the procedures work well when the random error follows a heavy-tailed distribution or when outliers are present. We demonstrate the finite-sample performance of the proposed method through comprehensive numerical studies and an application to the FRED-MD macroeconomic dataset.
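To make the bootstrap step concrete, the following is a minimal sketch of how a Gaussian multiplier bootstrap can calibrate a max-type score statistic. It is an illustration under stated assumptions, not the paper's exact procedure: the function name `multiplier_bootstrap_max_test`, the max-over-coordinates form of the statistic, and the input `scores` (an n-by-p table of per-observation score contributions, assumed to have mean zero under the null) are all hypothetical choices made for this example. How the scores are built from the response and the estimated factors is not specified here. The sketch uses only the Python standard library.

```python
import math
import random


def multiplier_bootstrap_max_test(scores, n_boot=500, alpha=0.05, seed=0):
    """Max-type score test with a Gaussian multiplier bootstrap critical value.

    scores: list of n rows, each a list of p per-observation score
            contributions (assumed mean zero under the null; hypothetical input).
    Returns (statistic, critical_value, reject).
    """
    rng = random.Random(seed)
    n, p = len(scores), len(scores[0])
    root_n = math.sqrt(n)

    # Observed statistic: T = max_j | n^{-1/2} * sum_i scores[i][j] |.
    col_sums = [sum(row[j] for row in scores) for j in range(p)]
    stat = max(abs(s) for s in col_sums) / root_n

    # Center each column so the bootstrap draws mimic the null distribution.
    means = [s / n for s in col_sums]
    centered = [[row[j] - means[j] for j in range(p)] for row in scores]

    boot_stats = []
    for _ in range(n_boot):
        # Independent N(0, 1) multipliers, one per observation.
        multipliers = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sums = [0.0] * p
        for e_i, row in zip(multipliers, centered):
            for j in range(p):
                sums[j] += e_i * row[j]
        boot_stats.append(max(abs(s) for s in sums) / root_n)

    # Empirical (1 - alpha) quantile of the bootstrap statistics.
    boot_stats.sort()
    idx = max(0, min(n_boot - 1, math.ceil((1 - alpha) * n_boot) - 1))
    crit = boot_stats[idx]
    return stat, crit, stat > crit
```

The test rejects when the observed statistic exceeds the bootstrap critical value. Because the multipliers are applied directly to the score contributions, a procedure of this kind needs no estimate of a high-dimensional coefficient vector or precision matrix, which is the practical advantage the abstract highlights.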
