New M-estimator of the leading principal component

Joni Virta; Una Radojicic; Marko Voutilainen

TL;DR

The paper introduces a non-convex M-estimator of the leading principal component, defined as a minimizer of the objective $f_P(v)=\mathrm{E}( \| X - v\| \| X + v\| - \| X \|^2 )$. It gives a nuclear-norm interpretation of the objective and studies its population and sample behavior under general and elliptically symmetric distributions. The paper proves existence of a minimizer and, under ellipticity, identifiability of the leading component subject to a threshold on the top eigenvalue of the spatial sign covariance matrix. It also derives a limiting normal distribution for the sample minimizer, with asymptotically independent direction and scale components. For computation, it proposes a Weiszfeld-type algorithm that converges in a finite number of steps, together with practical initialization and update rules. Simulations corroborate the identifiability thresholds and show efficiency gains over classical PCA in heavy-tailed settings, positioning the estimator as a robust, low-moment alternative to PCA.

Abstract

We study the minimization of the non-convex and non-differentiable objective function $v \mapsto \mathrm{E} ( \| X - v \| \| X + v \| - \| X \|^2 )$ in $\mathbb{R}^p$. In particular, we show that its minimizers recover the first principal component direction of elliptically symmetric $X$ under specific conditions. The stringency of these conditions is studied in various scenarios, including a diverging number of variables $p$. We establish the consistency and asymptotic normality of the sample minimizer. We propose a Weiszfeld-type algorithm for optimizing the objective and show that it is guaranteed to converge in a finite number of steps. The results are illustrated with two simulations.
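
As an illustration of the objective above, the following minimal Python sketch evaluates its sample version, the average of $\| x_i - v \| \| x_i + v \| - \| x_i \|^2$ over the observations, and compares its value along a high-variance and a low-variance coordinate direction. It also numerically checks the linear-algebra identity $\| x x^\top - v v^\top \|_* = \| x - v \| \| x + v \|$, which is one plausible reading of the nuclear-norm perspective. The function names, the Gaussian toy data, and the choice of unit-length candidates are assumptions made for illustration only. This is not the Weiszfeld-type algorithm proposed in the paper, which is not reproduced here.

```python
import numpy as np


def sample_objective(X, v):
    """Sample version of f(v) = E(||X - v|| ||X + v|| - ||X||^2).

    Rows of X are observations; v is a candidate vector in R^p.
    (Hypothetical helper, not taken from the paper.)
    """
    X = np.asarray(X, dtype=float)
    v = np.asarray(v, dtype=float)
    dist_minus = np.linalg.norm(X - v, axis=1)
    dist_plus = np.linalg.norm(X + v, axis=1)
    return float(np.mean(dist_minus * dist_plus - np.sum(X**2, axis=1)))


def nuclear_norm_gap(x, v):
    """Absolute difference between ||x x^T - v v^T||_* and ||x - v|| ||x + v||."""
    M = np.outer(x, x) - np.outer(v, v)
    nuc = np.linalg.norm(M, ord="nuc")  # sum of singular values
    return abs(nuc - np.linalg.norm(x - v) * np.linalg.norm(x + v))


# Toy data (assumption): centered Gaussian with standard deviations 3 and 1,
# so the first coordinate axis is the leading principal direction.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 2)) * np.array([3.0, 1.0])

e1 = np.array([1.0, 0.0])  # leading direction
e2 = np.array([0.0, 1.0])  # minor direction

f_lead = sample_objective(X, e1)
f_minor = sample_objective(X, e2)
gap = nuclear_norm_gap(X[0], e1)

print("objective along leading direction:", f_lead)
print("objective along minor direction:  ", f_minor)
print("nuclear-norm identity gap:        ", gap)
```

On this toy data the objective is lower along the leading direction than along the minor one, in line with the claim that minimizers align with the first principal component. The sketch fixes the length of the candidate vector, whereas the estimator in the paper also optimizes over the length of $v$.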
