New M-estimator of the leading principal component

Joni Virta, Una Radojicic, Marko Voutilainen

TL;DR

The paper introduces a non-convex M-estimator for identifying the leading principal component via the objective $f_P(v)=\mathrm{E}( \| X - v\| \| X + v\| - \| X \|^2 )$, providing a nuclear-norm perspective and establishing population and sample behavior under general and elliptically symmetric distributions. It proves existence and, under ellipticity, identifiability of the leading component through a threshold on the top eigenvalue of the spatial sign covariance, and derives a limiting normal distribution for the sample minimizer with asymptotically independent direction and scale components. Computation is addressed with a finite-time-convergent Weiszfeld-type algorithm, including practical initialization and update rules. Simulations corroborate the identifiability thresholds and demonstrate efficiency gains over classical PCA in heavy-tailed settings, highlighting a robust, low-moment alternative to PCA with solid theoretical guarantees and scalable computation.
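The identifiability condition above is stated in terms of the top eigenvalue of the spatial sign covariance. As a rough illustration, that quantity can be estimated from a sample as follows. This is a minimal sketch, not the paper's code: the function name `spatial_sign_covariance`, the centering at the origin, and the simulated Gaussian data are all assumptions made for the example.

```python
import numpy as np


def spatial_sign_covariance(X):
    """Sample spatial sign covariance, assuming the data are centered at 0.

    Each row x_i is projected to the unit sphere (x_i / ||x_i||) and the
    outer products of the projected points are averaged.
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1)
    keep = norms > 0  # rows equal to zero have no direction; drop them
    U = X[keep] / norms[keep, None]
    return U.T @ U / U.shape[0]


# Hypothetical example data: Gaussian with standard deviations (3, 1, 1).
rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 3)) * np.array([3.0, 1.0, 1.0])

S = spatial_sign_covariance(X)
eigvals = np.linalg.eigvalsh(S)  # eigenvalues in ascending order
top = eigvals[-1]                # top eigenvalue of the sign covariance
```

Because every projected point has unit norm, the trace of `S` equals 1, so the top eigenvalue always lies between $1/p$ and $1$. The specific threshold it must exceed is given in the paper and is not reproduced here.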

Abstract

We study the minimization of the non-convex and non-differentiable objective function $v \mapsto \mathrm{E} ( \| X - v \| \| X + v \| - \| X \|^2 )$ in $\mathbb{R}^p$. In particular, we show that its minimizers recover the first principal component direction of elliptically symmetric $X$ under specific conditions. The stringency of these conditions is studied in various scenarios, including a diverging number of variables $p$. We establish the consistency and asymptotic normality of the sample minimizer. We propose a Weiszfeld-type algorithm for optimizing the objective and show that it is guaranteed to converge in a finite number of steps. The results are illustrated with two simulations.
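To make the objective concrete, its sample counterpart replaces the expectation with an average over observations. The sketch below evaluates that average at a few candidate vectors. The function name `sample_objective` is hypothetical, and the simulated data (a bivariate Gaussian with covariance $\mathrm{diag}(3, 1)$ and $n = 2000$) merely mirror the setting of the left panel of Figure 1; nothing here is taken from the authors' implementation.

```python
import numpy as np


def sample_objective(X, v):
    """Average of ||x_i - v|| * ||x_i + v|| - ||x_i||^2 over the rows of X."""
    X = np.asarray(X, dtype=float)
    v = np.asarray(v, dtype=float)
    dist_minus = np.linalg.norm(X - v, axis=1)
    dist_plus = np.linalg.norm(X + v, axis=1)
    sq_norms = np.sum(X**2, axis=1)
    return float(np.mean(dist_minus * dist_plus - sq_norms))


# Hypothetical example data: N_2(0, diag(3, 1)), sample size 2000.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 2)) * np.sqrt([3.0, 1.0])

f_zero = sample_objective(X, np.zeros(2))            # objective at the origin
f_lead = sample_objective(X, np.array([1.0, 0.0]))   # along the leading axis
f_minor = sample_objective(X, np.array([0.0, 1.0]))  # along the minor axis
```

The objective vanishes at $v = 0$ and is symmetric in the sign of $v$, so minimizers come in pairs $\pm v$. For this Gaussian example, a unit vector along the high-variance axis should give a smaller value than one along the low-variance axis.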

Paper Structure

This paper contains 19 sections, 29 theorems, 150 equations, 4 figures, 1 algorithm.

Key Result

Theorem 1

The quantity $f_P(v)$ is finite for all $v \in \mathbb{R}^p$.
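One way to see why no moment assumption on $X$ is needed is the following elementary bound. It is a sketch and not necessarily the argument used in the paper. Expanding the two squared norms gives

$$
\| X - v \|^2 \| X + v \|^2 = \left( \| X \|^2 + \| v \|^2 \right)^2 - 4 (X^\top v)^2 .
$$

Since $0 \le (X^\top v)^2 \le \| X \|^2 \| v \|^2$ by the Cauchy-Schwarz inequality, the right-hand side lies between $( \| X \|^2 - \| v \|^2 )^2$ and $( \| X \|^2 + \| v \|^2 )^2$. Taking square roots and subtracting $\| X \|^2$ yields

$$
- \| v \|^2 \le \| X - v \| \| X + v \| - \| X \|^2 \le \| v \|^2 ,
$$

so the integrand is bounded by $\| v \|^2$ for every realization of $X$, and hence $| f_P(v) | \le \| v \|^2 < \infty$.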

Figures (4)

  • Figure 1: The contour plot of $f_P(v)$ as a function of $v$ based on a sample of size $n = 2000$. In the left panel, $X \sim \mathcal{N}_2(0, \mathrm{diag}(3, 1))$, whereas in the right panel, $X \sim \mathcal{N}_3(0, \mathrm{diag}(3, 1, 1))$ and we have restricted to the plane $v = (v_1, v_2, 0)$.
  • Figure 2: The average values of $| v_{n1} |$ and $\sqrt{v_{n2}^2 + v_{n3}^2}$ over 1000 replicates for different combinations of sample size $n$, model and $\theta$ when $\mathrm{Cov}(X) = \mathrm{diag}(\theta^2, 1, 1)$. The grey vertical line indicates the identifiability threshold predicted by Theorem \ref{theo:main_1} for elliptical distributions.
  • Figure 3: The average values of $| v_{n1} |$ and $\sqrt{v_{n2}^2 + v_{n3}^2}$ over 1000 replicates for different combinations of sample size $n$, model and $\theta$ when $\mathrm{Cov}(X) = \mathrm{diag}(\theta^2, \theta, 1)$. The grey vertical line indicates the identifiability threshold predicted by Theorem \ref{theo:main_1} for elliptical distributions.
  • Figure 4: The logarithmic spectral norm of the estimated asymptotic covariance $\mathrm{ASCOV}$ of $\sqrt{n}(s_n\tfrac{w_n}{\|w_n\|}-o_1)$, where $w_n/\|w_n\|$ is the unit-norm estimator of the leading principal direction based on the proposed approach (SPCA, blue) and PCA (red).

Theorems & Definitions (56)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Lemma 1
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • Theorem 8
  • Lemma 2
  • ...and 46 more