When Are Nonconvex Problems Not Scary?

Ju Sun, Qing Qu, John Wright

TL;DR

The paper identifies a tractable class of smooth nonconvex optimization problems with two properties: all local minima are global, and every saddle point is "ridable," meaning the objective has a direction of negative curvature there. It describes a second-order trust-region algorithm on Riemannian manifolds that exploits this curvature information and provably converges to a global minimizer without special initialization. The authors illustrate the framework on dictionary learning, generalized phase retrieval, orthogonal tensor decomposition, and phase synchronization, whose objectives are $\mathcal{X}$ functions with ridable saddles. They also discuss practical extensions, open questions, and the potential for saddle-escaping techniques to complement or replace initialization-based methods in broader nonconvex settings, including deep networks.

Abstract

In this note, we focus on smooth nonconvex optimization problems that obey: (1) all local minimizers are also global; and (2) around any saddle point or local maximizer, the objective has a negative directional curvature. Concrete applications such as dictionary learning, generalized phase retrieval, and orthogonal tensor decomposition are known to induce such structures. We describe a second-order trust-region algorithm that provably converges to a global minimizer efficiently, without special initializations. Finally, we highlight alternatives and open problems in this direction.
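The two conditions in the abstract can be made concrete with a toy example. The sketch below is not the paper's trust-region algorithm; it uses a hypothetical two-dimensional function, $f(x, y) = (x^2 - 1)^2 + y^2$, chosen here only because its two local minimizers $(\pm 1, 0)$ are both global and its only other critical point, the origin, is a saddle with a negative-curvature direction. It shows how the Hessian's most negative eigenvector gives a descent direction at a point where the gradient vanishes, after which plain gradient descent (an assumption made for brevity) reaches a global minimizer.

```python
import numpy as np

# Hypothetical toy objective (not from the paper): f(x, y) = (x^2 - 1)^2 + y^2.
# Critical points: (+1, 0) and (-1, 0) are global minimizers; (0, 0) is a saddle.
def f(p):
    x, y = p
    return (x**2 - 1.0) ** 2 + y**2

def grad(p):
    x, y = p
    return np.array([4.0 * x * (x**2 - 1.0), 2.0 * y])

def hess(p):
    x, y = p
    return np.array([[12.0 * x**2 - 4.0, 0.0],
                     [0.0, 2.0]])

def most_negative_curvature(p):
    """Return the smallest Hessian eigenvalue and its eigenvector."""
    eigvals, eigvecs = np.linalg.eigh(hess(p))  # eigenvalues in ascending order
    return eigvals[0], eigvecs[:, 0]

# At the saddle the gradient is zero, so first-order information gives no direction.
saddle = np.array([0.0, 0.0])
lam_min, v = most_negative_curvature(saddle)

# Stepping along the negative-curvature direction strictly decreases f.
stepped = saddle + 0.5 * v

# From there, plain gradient descent (assumed step size 0.1) reaches a minimizer.
minimizer = stepped.copy()
for _ in range(200):
    minimizer = minimizer - 0.1 * grad(minimizer)

print("smallest Hessian eigenvalue at saddle:", lam_min)
print("f at saddle:", f(saddle), "f after curvature step:", f(stepped))
print("final point:", minimizer, "f:", f(minimizer))
```

Which of the two minimizers is reached depends on the sign of the eigenvector returned by the eigensolver; both have the same objective value, which is what condition (1) requires.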

Paper Structure

This paper contains 4 sections, 7 equations, 1 figure.

Figures (1)

  • Figure 1: Illustrations of the tangent space $T_{\boldsymbol q}\mathbb S^{n-1}$ and exponential map $\exp_{\boldsymbol q} \left( \boldsymbol \delta \right)$ defined on the sphere $\mathbb S^{n-1}$.
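The objects named in the Figure 1 caption can be sketched numerically. The code below uses the standard closed form of the exponential map on the unit sphere, $\exp_{\boldsymbol q}(\boldsymbol \delta) = \boldsymbol q \cos\|\boldsymbol \delta\| + \frac{\boldsymbol \delta}{\|\boldsymbol \delta\|} \sin\|\boldsymbol \delta\|$ for $\boldsymbol \delta \in T_{\boldsymbol q}\mathbb S^{n-1}$; the formula is not stated in this excerpt, and the function names `project_to_tangent` and `exp_map` are illustrative choices, not taken from the paper.

```python
import numpy as np

def project_to_tangent(q, v):
    """Project v onto the tangent space at q, i.e. remove its component along q."""
    return v - np.dot(q, v) * q

def exp_map(q, delta):
    """Exponential map on the unit sphere: follow the geodesic from q along delta.

    Assumes ||q|| = 1 and delta is orthogonal to q.
    """
    norm = np.linalg.norm(delta)
    if norm < 1e-12:
        # A zero tangent vector leaves the point unchanged.
        return q.copy()
    return np.cos(norm) * q + np.sin(norm) * (delta / norm)

# Example on the 2-sphere in R^3.
q = np.array([1.0, 0.0, 0.0])
delta = project_to_tangent(q, np.array([0.3, 0.4, -0.2]))
p = exp_map(q, delta)

print("tangent vector:", delta, "q . delta =", np.dot(q, delta))
print("exp_q(delta):", p, "norm:", np.linalg.norm(p))
```

The result stays on the sphere for any tangent step length, which is why manifold trust-region methods can take steps in the tangent space and map them back without a separate renormalization.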

Theorems & Definitions (1)

  • Definition 2.1: ($\alpha, \beta, \gamma, \delta$)-$\mathcal{X}$ functions