Robust Second-Order Nonconvex Optimization and Its Application to Low Rank Matrix Sensing

Shuyao Li; Yu Cheng; Ilias Diakonikolas; Jelena Diakonikolas; Rong Ge; Stephen J. Wright

TL;DR

This work studies stochastic nonconvex optimization in the presence of adversarial outliers, focusing on finding approximate second-order stationary points (SOSPs) under the strong contamination model. It introduces a general framework that uses robust estimates of gradients and Hessians, with dimension-independent accuracy guarantees, to guide nonconvex optimization, achieving SOSP guarantees with $n = \widetilde{\Omega}(D^2/\epsilon)$ samples. The framework is then specialized to outlier-robust low-rank matrix sensing with Gaussian design, delivering exact recovery in the noiseless case and provable error bounds in the noisy case, with sample complexity $n = \widetilde{O}((d^2 r^2 + d r \log(\Gamma/\xi))/\epsilon)$. A Statistical Query (SQ) lower bound gives evidence that the quadratic dependence on the dimension in the sample complexity is necessary for efficient SQ algorithms, pointing to an information–computation tradeoff. Overall, the paper contributes dimension-independent SOSP guarantees under corruption, robust estimation of both gradients and Hessians, and a supporting lower bound, with concrete implications for robust matrix sensing and related nonconvex problems.
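To make the target of the framework concrete, the following is a minimal sketch of what it means to check the approximate-SOSP conditions at a point: a small gradient norm and a Hessian with no strongly negative eigenvalue. The function name `is_approx_sosp` and the tolerance names `eps_g` and `eps_h` are hypothetical and chosen for illustration only; the paper's Definition 1.3 may parameterize the conditions differently, and the symbol $\epsilon$ without a subscript is reserved there for the corruption fraction.

```python
import numpy as np


def is_approx_sosp(grad, hess, eps_g, eps_h):
    """Check the usual approximate-SOSP conditions at a single point.

    grad  : gradient vector at the point, shape (D,)
    hess  : symmetric Hessian matrix at the point, shape (D, D)
    eps_g : tolerance on the gradient norm (illustrative name)
    eps_h : tolerance on the most negative curvature (illustrative name)
    """
    grad_norm = np.linalg.norm(grad)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    lam_min = np.linalg.eigvalsh(hess)[0]
    return bool(grad_norm <= eps_g and lam_min >= -eps_h)
```

For example, the origin is a first-order stationary point of both $x_1^2 + x_2^2$ and $x_1^2 - x_2^2$, but only the former passes the check: the saddle has Hessian eigenvalue $-2$, which violates the curvature condition for any small `eps_h`.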

Abstract

Finding an approximate second-order stationary point (SOSP) is a well-studied and fundamental problem in stochastic nonconvex optimization with many applications in machine learning. However, this problem is poorly understood in the presence of outliers, limiting the use of existing nonconvex algorithms in adversarial settings. In this paper, we study the problem of finding SOSPs in the strong contamination model, where a constant fraction of datapoints are arbitrarily corrupted. We introduce a general framework for efficiently finding an approximate SOSP with dimension-independent accuracy guarantees, using $\widetilde{O}(D^2/\epsilon)$ samples where $D$ is the ambient dimension and $\epsilon$ is the fraction of corrupted datapoints. As a concrete application of our framework, we apply it to the problem of low rank matrix sensing, developing efficient and provably robust algorithms that can tolerate corruptions in both the sensing matrices and the measurements. In addition, we establish a Statistical Query lower bound providing evidence that the quadratic dependence on $D$ in the sample complexity is necessary for computationally efficient algorithms.
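The setting described in the abstract can be illustrated with a small simulation. The sketch below generates noiseless low rank matrix sensing data with Gaussian sensing matrices, corrupts an `eps` fraction of both the sensing matrices and the measurements, and computes per-sample gradients of a standard factorized squared loss. All function names, the factorized objective, and the particular corruption pattern are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def make_corrupted_sensing_data(d, r, n, eps, rng):
    """Noiseless measurements y_i = <A_i, U* U*^T>, then corrupt an eps fraction."""
    U_star = rng.standard_normal((d, r))
    M_star = U_star @ U_star.T
    A = rng.standard_normal((n, d, d))        # Gaussian sensing matrices
    y = np.einsum("nij,ij->n", A, M_star)     # clean measurements
    k = int(round(eps * n))
    bad = rng.choice(n, size=k, replace=False)
    # Illustrative corruption of BOTH sensing matrices and measurements;
    # a real adversary may choose these arbitrarily.
    A[bad] = 10.0 * rng.standard_normal((k, d, d))
    y[bad] = 1e3
    return A, y, U_star, bad


def per_sample_gradients(U, A, y):
    """Gradients of f_i(U) = 0.5 * (<A_i, U U^T> - y_i)^2 with respect to U."""
    residual = np.einsum("nij,ij->n", A, U @ U.T) - y
    sym = A + np.transpose(A, (0, 2, 1))      # A_i + A_i^T
    return residual[:, None, None] * (sym @ U)  # shape (n, d, r)
```

Evaluated at the true factor `U_star`, the clean per-sample gradients vanish up to rounding, so the plain average `per_sample_gradients(U_star, A, y).mean(axis=0)` is driven entirely by the corrupted samples. A coordinate-wise median, `np.median(..., axis=0)`, ignores them in this toy example, but it is only a simple stand-in: it is not the robust mean estimator used in the paper and does not carry dimension-independent guarantees.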

Paper Structure (36 sections, 32 theorems, 130 equations, 3 algorithms)

Introduction
Our Results and Contributions
Our Techniques
Outlier-robust nonconvex optimization.
Application to low rank matrix sensing.
SQ lower bound.
Roadmap
Preliminaries
A Randomized Algorithm with Inexact Gradients and Hessians.
Robust Mean Estimation.
General Robust Nonconvex Optimization
Low Rank Matrix Sensing Problems
Main results for Robust Low Rank Matrix Sensing
Global Convergence to an Approximate SOSP
Local Linear Convergence
...and 21 more sections

Key Result

Theorem 1.5

Suppose $f$ satisfies Assumption assump:general_clean_data in a region $\mathcal{B}$ with parameters $\sigma_{g}$ and $\sigma_{H}$. Given an arbitrary initial point $x_{0} \in \mathcal{B}$ and an $\epsilon$-corrupted set of $n = \widetilde{\Omega}(D^{2}/\epsilon)$ functions where $D$ is

Theorems & Definitions (69)

Definition 1.1: Strong Contamination Model
Definition 1.2: $\epsilon$-Corrupted Stochastic Optimization
Definition 1.3: Approximate SOSPs
Theorem 1.5: Finding an Outlier-Robust SOSP, informal
Definition 1.6: Outlier-Robust Matrix Sensing
Theorem 1.7: Our Algorithm for Outlier-Robust Matrix Sensing
Definition 2.1: Lipschitz Continuity
Proposition 2.2: li2023randomized
Proposition 2.3: Robust Mean Estimation
Theorem 3.1
...and 59 more
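
For orientation, the first few definitions in this list can be written out in their standard textbook forms. The block below is a sketch under that assumption: the symbols $\epsilon_g$, $\epsilon_H$, $T$, $A_i$, $M^\ast$ and $\zeta_i$ are illustrative, and the paper's own statements may differ in constants, norms, and parameterization.

```latex
% Strong contamination (cf. Definition 1.1), standard form:
% an adversary inspects n clean samples and replaces up to \epsilon n of them
% with arbitrary points, for some corruption fraction 0 < \epsilon < 1/2.
\[
  T = \{z_1, \dots, z_n\}, \qquad
  \bigl|\{\, i : z_i \text{ was replaced} \,\}\bigr| \le \epsilon n .
\]

% Approximate SOSP (cf. Definition 1.3), standard form:
% x is an (\epsilon_g, \epsilon_H)-approximate SOSP of f if
\[
  \|\nabla f(x)\| \le \epsilon_g
  \quad\text{and}\quad
  \lambda_{\min}\!\bigl(\nabla^2 f(x)\bigr) \ge -\epsilon_H .
\]

% Low rank matrix sensing (cf. Definition 1.6), standard measurement model:
% recover a rank-r matrix M^* from linear measurements with noise \zeta_i
% (\zeta_i = 0 in the noiseless case).
\[
  y_i = \langle A_i, M^\ast \rangle + \zeta_i , \qquad i = 1, \dots, n .
\]
```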
