Phase transitions for the existence of unregularized M-estimators in single index models

Takuya Koriyama, Pierre C. Bellec

TL;DR

The paper studies when unregularized M-estimators exist in high-dimensional single-index models under proportional asymptotics, deriving an explicit critical threshold $\delta_\infty$ that governs a sharp phase transition. It extends Candès & Sur's phase-transition results beyond binary logistic regression to Poisson and other link models, using a convex-analytic construction to connect estimator existence to a nonlinear system describing asymptotic behavior. A key contribution is an equivalence between the nonlinear system and a constrained infinite-dimensional convex optimization problem in a Hilbert space, which yields a unique solution, with a positive Lagrange multiplier, if and only if $\delta > \delta_\infty$; this provides a rigorous foundation for proportional-asymptotics analyses. Numerical simulations in Poisson and generalized logistic settings corroborate the theory: the empirical thresholds match the theoretical $1/\delta_\infty$ across model variants, confirming the phase transition in estimator existence.

Abstract

This paper studies phase transitions for the existence of unregularized M-estimators under proportional asymptotics, where the sample size $n$ and feature dimension $p$ grow proportionally with $n/p \to \delta \in (1, \infty)$. We study the existence of M-estimators in single-index models where the response $y_i$ depends on covariates $x_i \sim N(0, I_p)$ through an unknown index $w \in \mathbb{R}^p$ and an unknown link function. An explicit expression is derived for the critical threshold $\delta_\infty$ that determines the phase transition for the existence of the M-estimator, generalizing the results of Candès & Sur (2020) for binary logistic regression to other single-index models. Furthermore, we investigate the existence of a solution to the nonlinear system of equations governing the asymptotic behavior of the M-estimator when it exists. The existence of a solution to this system for $\delta > \delta_\infty$ remains largely unproven outside the global null in binary logistic regression. We address this gap with a proof that the system admits a solution if and only if $\delta > \delta_\infty$, providing a comprehensive theoretical foundation for proportional asymptotic results that presuppose the existence of a solution to the system.
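
To give a feel for what a sharp existence threshold looks like, the sketch below treats only the special case the abstract singles out: binary logistic regression under the global null. It is not the paper's general formula for $\delta_\infty$. It evaluates the classical Cover/Wendel count for the probability that $n$ points in general position in $\mathbb{R}^p$, with labels that are independent fair coins, are linearly separable through the origin; the logistic MLE (without intercept) fails to exist on separable data. The function names `prob_separable` and `prob_mle_exists` are hypothetical, introduced here for illustration only.

```python
from math import comb


def prob_separable(n: int, p: int) -> float:
    """Cover/Wendel count (assumed setting: points in general position,
    labels independent fair coins, hyperplanes through the origin).

    Returns 2^{-(n-1)} * sum_{k=0}^{p-1} C(n-1, k).
    """
    # math.comb returns 0 when k > n - 1, so p >= n gives probability 1.
    return sum(comb(n - 1, k) for k in range(p)) / 2 ** (n - 1)


def prob_mle_exists(n: int, p: int) -> float:
    """Probability that the data are NOT separable, i.e. that the
    global-null logistic MLE exists under the assumptions above."""
    return 1.0 - prob_separable(n, p)


if __name__ == "__main__":
    n = 1000
    for p in (300, 400, 450, 500, 550, 600, 700):
        print(f"p/n = {p / n:.2f}   P(MLE exists) = {prob_mle_exists(n, p):.4f}")
```

In this special case the probability switches from nearly one to nearly zero around $n/p = 2$, which is the kind of transition the threshold $\delta_\infty$ describes for more general single-index models.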

Paper Structure (12 sections, 17 theorems, 149 equations, 3 figures)

Key Result

Theorem 2.6

As $n, p\to +\infty$ with $n/p\to \delta$, we have

Figures (3)

  • Figure 1: Three examples of loss functions.
  • Figure 2: Count of instances where the minimizer in \ref{infimum} exists for varying $p/n$ and signal strength. Simulation parameters: $n=1500$, $20$ repetitions; $\ell_y(u)=e^u-yu$ is the Poisson loss, and $y_i\mid \bm{x}_i$ follows the Poisson model \ref{poisson_model}.
  • Figure 3: Count of instances where the minimizer in \ref{infimum} exists for varying $p/n$ and signal strength $\kappa$. Simulation parameters: $n=1000$, $20$ repetitions; $y_i\mid \bm{x}_i \sim \operatorname{Binomial}(q, p_i)$ as in \ref{eq:q_logistic_model}.
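
For orientation, the minimizer mentioned in the captions of Figures 2 and 3 is presumably that of an empirical-risk problem of the following form. This is an assumption based on the abstract and on the Poisson loss stated in the caption of Figure 2, not a reproduction of the paper's numbered equation; the symbol $\bm{b}$ for the optimization variable is illustrative.

$$
\inf_{\bm{b}\in\mathbb{R}^p} \sum_{i=1}^n \ell_{y_i}(\bm{x}_i^\top \bm{b}),
\qquad \text{e.g. } \ell_y(u) = e^u - yu \text{ for Poisson regression.}
$$

Under this reading, the M-estimator exists when the infimum is attained, and the figures count how often that happens across the repetitions.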

Theorems & Definitions (31)

  • Example 2.1: Binary logistic regression
  • Example 2.2: Logistic regression with repeated measurements
  • Example 2.3: Poisson regression
  • Theorem 2.6
  • Theorem 2.7
  • Theorem 3.1: Equivalence
  • Lemma 3.2
  • Lemma 3.3
  • Theorem 3.4
  • Lemma 1.1
  • ...and 21 more