Statistical Inference for High-Dimensional Robust Linear Regression Models via Recursive Online-Score Estimation
Dian Zheng, Lingzhou Xue
TL;DR
The paper tackles inference for high-dimensional robust linear regression under nonconvex penalized M-estimation. It extends recursive online-score estimation (ROSE) to robust settings, combining a data-driven active set, an initial nonconvex estimator, and online estimating equations to produce valid confidence intervals for a low-dimensional coefficient. Key contributions include a landscape analysis of the nonconvex objective with estimation bounds, a computationally efficient composite gradient algorithm, and a proof of asymptotic normality that justifies the confidence intervals. These are supported by simulations under contamination and heavy tails and by an application to riboflavin data. The result is a scalable approach to robust high-dimensional inference that remains reliable when standard convex methods falter under heavy-tailed noise or outliers.
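To make the "composite gradient" idea concrete, here is a minimal sketch of a composite (proximal) gradient iteration for a penalized robust regression objective. It is not the authors' algorithm: for simplicity it assumes a convex Huber loss and an L1 penalty in place of the paper's nonconvex choices, and the function names, the tuning values (`delta`, `lam`), and the simulated data are all hypothetical.

```python
import numpy as np

def huber_score(r, delta=1.345):
    # Derivative of the Huber loss with respect to the residual:
    # the residual itself, clipped to [-delta, delta].
    return np.clip(r, -delta, delta)

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1 (assumed L1 penalty, not the paper's).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def composite_gradient(X, y, lam, n_iter=500, delta=1.345):
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n bounds the curvature of the loss.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        residual = y - X @ beta
        grad = -X.T @ huber_score(residual, delta) / n  # gradient of the smooth loss
        beta = soft_threshold(beta - step * grad, step * lam)  # penalty handled via prox
    return beta

# Hypothetical demo: sparse signal with heavy-tailed (t, 3 df) noise.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:2] = [3.0, -2.0]
y = X @ beta_true + 0.5 * rng.standard_t(3, size=n)

beta_hat = composite_gradient(X, y, lam=0.1)
print(np.round(beta_hat[:5], 2))
```

Each iteration takes a gradient step on the smooth robust loss and then applies the penalty through its proximal map, so the non-smooth term never has to be differentiated. A nonconvex penalty would change only the thresholding step.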
Abstract
This paper introduces a novel framework for estimation and inference in penalized M-estimators applied to robust high-dimensional linear regression models. Traditional methods for high-dimensional statistical inference, which predominantly rely on convex likelihood-based approaches, struggle to address the nonconvexity inherent in penalized M-estimation with nonconvex objective functions. Our proposed method extends the recursive online score estimation (ROSE) framework of Shi et al. (2021) to robust high-dimensional settings by developing a recursive score equation based on penalized M-estimation, explicitly addressing nonconvexity. We establish the statistical consistency and asymptotic normality of the resulting estimator, providing a rigorous foundation for valid inference in robust high-dimensional regression. The effectiveness of our method is demonstrated through simulation studies and a real-world application, showcasing its superior performance compared to existing approaches.
