Asymptotic Distribution of Low-Dimensional Patterns Induced by Non-Differentiable Regularizers under General Loss Functions
Ivan Hejný, Jonas Wallin, Małgorzata Bogdan
TL;DR
This work studies the asymptotic distribution of low-dimensional patterns induced by non-differentiable convex penalties in penalized M-estimation with fixed dimension $p$. It introduces stochastic Lipschitz differentiability (SLD) to extend Pollard-type CLTs to pattern convergence. The rescaled estimator $\hat{u}_n = \sqrt{n}(\hat{\theta}_n-\theta_0)$ converges in distribution to $\hat{u}$, the minimizer of $V(u) = \tfrac{1}{2}u^TCu - u^TW + f'(\theta_0;u)$, where $W\sim\mathcal{N}(0,C_\triangle)$ and $f'(\theta_0;u)$ is the directional derivative of the penalty $f$ at $\theta_0$ in the direction $u$. Under stronger conditions, the induced pattern $I_f(\hat{u}_n)$ also converges in distribution to $I_f(\hat{u})$. The framework applies to generalized linear models and robust losses (Huber, quantile), providing explicit limiting covariances and enabling model-recovery analysis, including vague clustering and residual-error measures. Simulations in SLOPE-regularized logistic regression show the RMSE and the pattern-recovery probability approaching their asymptotic counterparts. Overall, the results extend classical asymptotic theory for Lasso-type penalties to a broader class of losses and models, and to the convergence of the patterns those penalties induce.
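To make the limiting objective $V(u)$ concrete, the following is a minimal sketch under strong simplifying assumptions that are not taken from the paper: the penalty is the Lasso penalty $f(\theta)=\lambda\|\theta\|_1$, the matrix $C$ is diagonal, and $C_\triangle = C$. In that case $f'(\theta_0;u)=\lambda\sum_{j:\theta_{0,j}\neq 0}\operatorname{sign}(\theta_{0,j})\,u_j+\lambda\sum_{j:\theta_{0,j}=0}|u_j|$, and the minimizer of $V$ is available coordinate by coordinate. All function and variable names are illustrative.

```python
import numpy as np

def lasso_directional_derivative(theta0, u, lam):
    """f'(theta0; u) for the assumed penalty f(theta) = lam * ||theta||_1."""
    active = theta0 != 0
    return lam * (np.sum(np.sign(theta0[active]) * u[active])
                  + np.sum(np.abs(u[~active])))

def limit_objective(u, theta0, c_diag, w, lam):
    """V(u) = 0.5 u^T C u - u^T W + f'(theta0; u), with C = diag(c_diag)."""
    return (0.5 * np.sum(c_diag * u**2) - u @ w
            + lasso_directional_derivative(theta0, u, lam))

def limit_minimizer(theta0, c_diag, w, lam):
    """Coordinate-wise minimizer of V; valid only because C is assumed diagonal."""
    shifted = w - lam * np.sign(theta0)                    # coordinates with theta0_j != 0
    soft = np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)   # coordinates with theta0_j == 0
    return np.where(theta0 != 0, shifted, soft) / c_diag

# Hypothetical setup: one non-zero coefficient and two null coefficients.
theta0 = np.array([2.0, 0.0, 0.0])
c_diag = np.array([1.0, 2.0, 1.0])
lam = 1.0

# Minimizer for one fixed realization of W.
u_hat = limit_minimizer(theta0, c_diag, np.array([0.5, 3.0, -0.2]), lam)

# Monte Carlo over W ~ N(0, C_triangle), assuming C_triangle = C (diagonal).
rng = np.random.default_rng(0)
draws = rng.normal(size=(10_000, 3)) * np.sqrt(c_diag)
u_draws = limit_minimizer(theta0, c_diag, draws, lam)

# Frequency with which both null coordinates of the limit are exactly zero.
null_zero_freq = np.mean(np.all(u_draws[:, 1:] == 0.0, axis=1))
```

For a non-diagonal $C$ or a penalty such as SLOPE, no such closed form is assumed here; the minimizer of $V$ would have to be computed with a convex solver.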
Abstract
This article investigates the asymptotic distribution of penalized estimators with non-differentiable penalties designed to recover low-dimensional pattern structures. Patterns play a central role in estimation, as they reveal the underlying structure of the parameter: which coefficients are zero, which are equal, and how they are clustered. The main technical challenge stems from the discontinuous nature of these patterns (such as the sign function in the case of the Lasso penalty), a difficulty that has only recently been analyzed, and only for the standard linear model. To overcome this, we extend classical results from empirical process theory for M-estimation by incorporating the distributional behavior of model patterns. We introduce a new mathematical framework for studying pattern convergence of regularized M-estimators. While classical approaches to distributional convergence rely on uniform conditions, our analysis employs a new local condition, stochastic Lipschitz differentiability (SLD), which controls fluctuations of the Taylor remainder. We demonstrate how this framework applies to a broad class of loss functions, covering generalized linear models (e.g., logistic and Poisson regression) and robust regression settings with non-smooth losses such as the Huber and quantile loss.
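As an illustration of what a pattern records, the sketch below computes two common pattern maps for a coefficient vector: the sign vector associated with the Lasso, and a signed-rank pattern of the kind used for SLOPE, in which coefficients of equal magnitude share a rank (a cluster) and zeros stay zero. The function names are hypothetical, and the code compares floating-point values exactly, so a numerical estimate would in practice need a tolerance before ties and zeros are detected.

```python
import numpy as np

def lasso_pattern(theta):
    """Sign pattern: +1, -1, or 0 for each coefficient."""
    return np.sign(theta).astype(int)

def slope_pattern(theta):
    """Signed dense rank of |theta_j|; equal magnitudes form one cluster."""
    abs_vals = np.abs(theta)
    levels = np.unique(abs_vals[abs_vals > 0])   # sorted distinct non-zero magnitudes
    ranks = np.where(abs_vals > 0,
                     np.searchsorted(levels, abs_vals) + 1,
                     0)                          # zeros keep rank 0
    return (np.sign(theta) * ranks).astype(int)

theta = np.array([1.5, -1.5, 0.0, 0.3])
print(lasso_pattern(theta))   # signs of each coefficient
print(slope_pattern(theta))   # first two coefficients share the top cluster
```

Both maps are piecewise constant and jump whenever a coefficient crosses zero or two magnitudes coincide, which is the discontinuity that makes convergence of patterns harder than convergence of the estimator itself.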
