Proportional asymptotics of piecewise exponential proportional hazards models

Emanuele Massa

TL;DR

The paper tackles high-dimensional survival analysis by studying a piecewise exponential proportional hazards model with $p_n=\zeta n$ covariates under Gaussian design. It employs the Convex Gaussian Min-Max Theorem to prove that the ridge-penalized log-likelihood converges to a low-dimensional saddle point characterized by Replica Symmetric equations, enabling precise predictions of training and test metrics. This yields a rigorous bridge between exact-asymptotics methods and heuristic replica-based insights for Cox-like models, and provides a practical surrogate for predicting performance via the Moreau envelope. The results illuminate how ridge regularization impacts estimation and prediction in the proportional hazards setting and establish a framework for further rigorous analysis in semi-parametric survival models with large $p$.

Abstract

We study the flexible piecewise exponential model in a high-dimensional setting where the number of covariates $p$ grows proportionally to the number of observations $n$, under the hypothesis of random uncorrelated Gaussian designs. We prove rigorously that the optimal ridge-penalized log-likelihood of the model converges in probability to the saddle point of a surrogate objective function. The proof relies on the Convex Gaussian Min-Max Theorem of Thrampoulidis, Oymak and Hassibi. An important consequence of this result is that we can study the impact of ridge regularization on the parameter estimates and on the prediction error as a function of the ratio $p/n > 0$. Furthermore, these results represent a first step toward rigorously proving the (conjectured) correctness of several results obtained with the heuristic replica method for the Cox semi-parametric model.
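
To make the object of study concrete, the following is a minimal sketch of a ridge-penalized negative log-likelihood for a piecewise exponential proportional hazards model, evaluated on a synthetic Gaussian design with $p = \zeta n$. It assumes the textbook parametrization (piecewise-constant baseline hazard multiplied by $e^{x^\top\beta}$) and a $1/n$ normalization with penalty $\tfrac{\text{ridge}}{2}\|\beta\|^2$; the paper's exact scaling and notation may differ, and the names `cuts`, `rates` and `ridge` are hypothetical.

```python
import math
import random


def baseline_hazard(t, cuts, rates):
    """Piecewise-constant baseline hazard at time t.

    cuts = [0, tau_1, ..., tau_{K-1}] are the left endpoints of K intervals
    (tau_{k-1}, tau_k]; the last interval is open-ended.
    """
    k = 0
    while k + 1 < len(cuts) and t > cuts[k + 1]:
        k += 1
    return rates[k]


def cumulative_baseline_hazard(t, cuts, rates):
    """Integral of the baseline hazard from 0 to t."""
    total = 0.0
    for k, rate in enumerate(rates):
        left = cuts[k]
        right = cuts[k + 1] if k + 1 < len(cuts) else math.inf
        if t <= left:
            break
        total += rate * (min(t, right) - left)
    return total


def penalized_neg_loglik(beta, rates, cuts, X, times, events, ridge):
    """Assumed objective: -(1/n) * log-likelihood + (ridge / 2) * ||beta||^2."""
    n = len(times)
    nll = 0.0
    for x, t, d in zip(X, times, events):
        lp = sum(b * xj for b, xj in zip(beta, x))  # linear predictor x'beta
        # An observed event (d = 1) contributes the log-hazard at t.
        nll -= d * (math.log(baseline_hazard(t, cuts, rates)) + lp)
        # Every subject contributes its cumulative hazard up to t.
        nll += math.exp(lp) * cumulative_baseline_hazard(t, cuts, rates)
    return nll / n + 0.5 * ridge * sum(b * b for b in beta)


# Illustrative data only: uncorrelated Gaussian covariates, p = zeta * n,
# unit baseline hazard, no censoring.
random.seed(0)
n, zeta = 200, 0.5
p = int(zeta * n)
beta0 = [random.gauss(0.0, 1.0) / math.sqrt(p) for _ in range(p)]
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
times = [
    random.expovariate(math.exp(sum(b * xj for b, xj in zip(beta0, x))))
    for x in X
]
events = [1] * n
cuts, rates = [0.0, 1.0], [1.0, 1.0]  # two intervals with equal rates
value_at_truth = penalized_neg_loglik(beta0, rates, cuts, X, times, events, ridge=0.1)
```

Minimizing this objective jointly over `beta` and `rates` would give a ridge-penalized estimator of the kind the abstract refers to; the sketch only evaluates the objective and does not perform the optimization.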

Paper Structure

This paper contains 15 sections, 19 theorems, 98 equations, 2 figures.

Key Result

Theorem 1

Let $\zeta := \lim_{n\rightarrow \infty} p(n)/n$, $\zeta \in \mathbb{R}^+$ and assume that the data are generated as in (law, gen_proc). Then, for any $\epsilon >0$, where with $Z_0,Q \sim \mathcal{N}(0,1)$, $Z_0\perp Q$.

Figures (2)

  • Figure 1: Simulated data (markers and error bars) against the theory (solid lines). Panels (\ref{fig:1a}, \ref{fig:1b}) show the values of $\hat{w}_n$ and $\hat{v}_n$ defined in (\ref{w_n}, \ref{v_n}) along a regularization path for $\eta \in (0.1, 6)$ with $\alpha = 0.01$.
  • Figure 2: Simulated data (markers and error bars) against the theory (solid lines): (left) the test c-index; (right) $R_{IBS}$ as defined in (\ref{def : R_ibs}), along a regularization path for $\eta \in (0.1, 6)$ with $\alpha = 0.01$. The error bars in panel (\ref{fig:2a}) have been removed to aid visualization.

Theorems & Definitions (33)

  • Definition 1: Moreau envelope and proximal operator
  • Theorem 1: ASYMPTOTICALLY EQUIVALENT SCALAR OPTIMIZATION PROBLEM
  • Theorem 2: REPLICA SYMMETRIC EQUATIONS
  • Corollary 1: SURROGATE FOR OUT OF SAMPLE LINEAR PREDICTOR
  • proof
  • Theorem 3: CONVEX GAUSSIAN MIN-MAX THEOREM
  • Proposition 1: INTEGRABILITY
  • proof
  • Proposition 2: FINITE VARIANCE
  • proof
  • ...and 23 more
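
Definition 1 in the list above concerns the Moreau envelope and proximal operator. In their standard form, for a convex scalar function $f$ and $\tau > 0$, the Moreau envelope is $\mathcal{M}_f(x;\tau) = \min_z \{ f(z) + \frac{1}{2\tau}(z-x)^2 \}$ and the proximal operator is the minimizer $z$; the paper's parametrization may differ. The sketch below computes both numerically by golden-section search. The function names, the search bracket and the iteration count are illustrative assumptions, not taken from the paper.

```python
import math


def prox(f, x, tau, lo=-50.0, hi=50.0, iters=200):
    """Proximal operator of a convex scalar f at x with parameter tau.

    Minimizes f(z) + (z - x)^2 / (2 * tau) over [lo, hi] by golden-section
    search; the objective is strictly convex, hence unimodal. The true
    minimizer is assumed to lie inside the bracket.
    """
    def objective(z):
        return f(z) + (z - x) ** 2 / (2.0 * tau)

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if objective(c) < objective(d):
            b = d  # minimizer lies in [a, d]
        else:
            a = c  # minimizer lies in [c, b]
    return 0.5 * (a + b)


def moreau_envelope(f, x, tau, **kwargs):
    """Value of the Moreau envelope: the prox objective at its minimizer."""
    z = prox(f, x, tau, **kwargs)
    return f(z) + (z - x) ** 2 / (2.0 * tau)
```

As a sanity check, for $f(z) = |z|$ the proximal operator reduces to soft-thresholding, and for $f(z) = e^z$ (the kind of term that appears in exponential-hazard likelihoods) the minimizer solves $z + \tau e^z = x$.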