Asymptotic Distribution of Low-Dimensional Patterns Induced by Non-Differentiable Regularizers under General Loss Functions

Ivan Hejný, Jonas Wallin, Małgorzata Bogdan

TL;DR

This work studies the asymptotic distribution of low-dimensional patterns induced by non-differentiable convex penalties in penalized M-estimation with fixed dimension $p$. It introduces stochastic Lipschitz differentiability (SLD) to extend Pollard-type CLTs to pattern convergence. The rescaled estimator $\hat{u}_n=\sqrt{n}(\hat{\theta}_n-\theta_0)$ converges in distribution to the minimizer $\hat{u}$ of $V(u) = \tfrac{1}{2}u^TCu - u^TW + f'(\theta_0;u)$, where $W\sim\mathcal{N}(0,C_\triangle)$ and $f'(\theta_0;u)$ is the directional derivative of the penalty $f$ at $\theta_0$ in direction $u$. Under stronger conditions, the induced pattern $I_f(\hat{u}_n)$ also converges in distribution to $I_f(\hat{u})$. The framework applies to generalized linear models and robust losses (Huber, quantile), providing explicit limiting covariances and enabling model-recovery analysis, including vague clustering and residual-error measures. Simulations in SLOPE-regularized logistic regression confirm that RMSE and pattern-recovery probabilities converge to their asymptotic counterparts. Overall, the results unify and extend classical asymptotic theory for Lasso-type penalties to a broader class of losses and models, with direct consequences for the analysis of pattern recovery.
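To make the limiting objective concrete, the following is a minimal sketch that minimizes $V(u)$ numerically for one specific penalty. It assumes the Lasso penalty $f(\theta)=\lambda\lVert\theta\rVert_1$, whose directional derivative is $f'(\theta_0;u)=\lambda\big(\sum_{j:\theta_{0,j}\neq 0}\operatorname{sign}(\theta_{0,j})u_j+\sum_{j:\theta_{0,j}=0}\lvert u_j\rvert\big)$. The function names, the tuning parameter `lam`, and the proximal-gradient solver are illustrative choices, not taken from the paper.

```python
import numpy as np

def minimize_limit_objective(C, W, theta0, lam, n_iter=2000):
    """Minimize V(u) = 0.5 u'Cu - u'W + f'(theta0; u) by proximal gradient.

    Assumption: f is the Lasso penalty lam * ||.||_1, so f'(theta0; u) is
    linear on the support of theta0 and lam * |u_j| on its zero coordinates.
    """
    C = np.asarray(C, dtype=float)
    W = np.asarray(W, dtype=float)
    theta0 = np.asarray(theta0, dtype=float)
    s = np.sign(theta0)          # signs on the support, 0 elsewhere
    zero = theta0 == 0           # coordinates where the penalty has a kink
    step = 1.0 / np.linalg.eigvalsh(C).max()  # 1 / Lipschitz constant of the gradient
    u = np.zeros_like(W)
    for _ in range(n_iter):
        # gradient step on the smooth part 0.5 u'Cu - u'(W - lam * s)
        z = u - step * (C @ u - W + lam * s)
        # soft-threshold only the coordinates with theta0_j = 0
        z[zero] = np.sign(z[zero]) * np.maximum(np.abs(z[zero]) - step * lam, 0.0)
        u = z
    return u

def sample_limit_minimizers(C, C_tri, theta0, lam, n_draws=200, seed=0):
    """Draw W ~ N(0, C_tri) and return the corresponding minimizers of V."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(np.zeros(len(theta0)), C_tri, size=n_draws)
    return np.array([minimize_limit_objective(C, W, theta0, lam) for W in draws])
```

For example, with $C=I$, $\theta_0=(1,0)$, $\lambda=1$ and $W=(0.5,0.3)$, the first coordinate is shifted to $0.5-1=-0.5$ and the second is soft-thresholded to exactly $0$. This exact zero is the kind of pattern event whose limiting probability the theory describes.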

Abstract

This article investigates the asymptotic distribution of penalized estimators with non-differentiable penalties designed to recover low-dimensional pattern structures. Patterns play a central role in estimation, as they reveal the underlying structure of the parameter -- which coefficients are zero, which are equal, and how they are clustered. The main technical challenge stems from the discontinuous nature of these patterns (such as the sign function in the case of the Lasso penalty), a difficulty not previously addressed in the literature and only recently analyzed for the standard linear model. To overcome this, we extend classical results from empirical process theory for M-estimation by incorporating the distributional behavior of model patterns. We introduce a new mathematical framework for studying pattern convergence of regularized M-estimators. While classical approaches to distributional convergence rely on uniform conditions, our analysis employs a new local condition, stochastic Lipschitz differentiability (SLD), which controls fluctuations of the Taylor remainder. We demonstrate how this framework applies to a broad class of loss functions, covering generalized linear models (e.g., logistic and Poisson regression) and robust regression settings with non-smooth losses such as the Huber and quantile loss.
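As a rough illustration of what a pattern encodes, the sketch below computes two pattern maps. The sign vector is the pattern the abstract mentions for the Lasso. The SLOPE-style map (sign times the rank of the absolute value among distinct non-zero magnitudes) follows a convention common in the SLOPE literature and is an assumption here, not a definition quoted from this paper. The function names and the `decimals` rounding used to decide equality are hypothetical.

```python
import numpy as np

def sign_pattern(theta, decimals=8):
    """Lasso-type pattern: which coefficients are negative, zero, or positive."""
    theta = np.round(np.asarray(theta, dtype=float), decimals)
    return np.sign(theta).astype(int)

def slope_pattern(theta, decimals=8):
    """SLOPE-type pattern (assumed convention): sign times the dense rank of
    |theta_j| among the distinct non-zero magnitudes; zeros map to 0."""
    theta = np.round(np.asarray(theta, dtype=float), decimals)
    mags = np.abs(theta)
    levels = np.unique(mags[mags > 0])          # distinct non-zero magnitudes, ascending
    ranks = np.searchsorted(levels, mags) + 1   # dense rank for non-zero entries
    ranks[mags == 0] = 0                        # zero coefficients get rank 0
    return (np.sign(theta) * ranks).astype(int)

theta = [1.5, -1.5, 0.0, 0.3]
print(sign_pattern(theta))   # [ 1 -1  0  1]
print(slope_pattern(theta))  # [ 2 -2  0  1]
```

Both maps are piecewise constant and jump when a coefficient crosses zero or two magnitudes tie. That discontinuity is the technical obstacle the abstract describes.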

Paper Structure

This paper contains 24 sections, 9 theorems, 141 equations, 4 figures, 1 table.

Key Result

Lemma 2.5

Assume $G(\theta)$ has a non-singular Hessian $C=\nabla^2\vert_{\theta=\theta_0}G(\theta)$ at its minimizer $\theta_0$ and $\ell(\cdot,\theta)$ is locally stochastically differentiable at $\theta_0$. Then […], where $\sup_{u\in K}\vert R_n(u)\vert\rightarrow 0$ in probability as $n\rightarrow\infty$, for any compact set $K$. Further, if $\ell(\cdot,\theta)$ …

Figures (4)

  • Figure 1: Different notions of stochastic differentiability
  • Figure 2: RMSE as a function of $n$ and $\alpha$.
  • Figure 3: Relative Residual Error as a function of $n$ and $\alpha$.
  • Figure 4: Probability of perfect pattern recovery as a function of $n$ and $\alpha$.

Theorems & Definitions (27)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Remark 2.4
  • Lemma 2.5
  • proof
  • Definition 3.1
  • Theorem 3.2
  • Theorem 3.3
  • proof: Proof of the main pattern theorem (convergence in distribution)
  • ...and 17 more