Avoidance of non-strict saddle points by blow-up

El Mehdi Achour; Umberto L. Hryniewicz; Michael Westdickenberg

TL;DR

The paper studies when gradient-flow trajectories avoid non-strict saddle points in nonconvex optimization, the degenerate case in which the Hessian has a nontrivial kernel and the classical results are inconclusive. It introduces a blow-up technique: a nonlinear rescaling near the saddle that lifts the local geometry to a blown-up sphere and makes a center-stable manifold analysis possible. By analyzing the blown-up vector field and its spectrum, it derives conditions under which the set of initial data whose trajectories converge to the saddle has measure zero, so that almost all trajectories avoid it. An explicit example illustrates the method, and the blow-up can be iterated to obtain finer tests when a first test is inconclusive. Overall, the approach provides concrete criteria and a geometric understanding of gradient-flow avoidance in degenerate settings, with potential implications for optimization dynamics in high dimensions.

Abstract

It is an old idea to use gradient flows or time-discretized variants thereof as methods for solving minimization problems. In some applications, for example in machine learning contexts, it is important to know that for generic initial data, gradient flow trajectories do not get stuck at saddle points. There are classical results concerned with the nondegenerate situation. But if the Hessian of the objective function has a nontrivial kernel at the critical point, then these results are inconclusive. In this paper, we show how relevant information can be extracted by "blowing up" the objective function around the non-strict saddle point, i.e., by a suitable nonlinear rescaling that makes the higher order geometry visible. Then the center-stable manifold theorem of dynamical systems theory can be applied.
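
To make the idea of a blow-up concrete, here is a minimal sketch on a toy example that is not taken from the paper: the monkey saddle f(x, y) = x^3 - 3xy^2, whose Hessian vanishes at the origin. The choice of f, the plain polar blow-up, and the function names are all assumptions made for illustration; the paper's construction may differ. In polar coordinates f = r^3 cos(3θ), and after dividing the gradient flow by r one obtains the rescaled field r' = -3r cos(3θ), θ' = 3 sin(3θ), which extends to the circle r = 0.

```python
import math

def blown_up_field(r, theta):
    """Rescaled gradient flow of f = r^3 cos(3*theta) (toy example, an assumption).

    The original flow is r_dot = -3 r^2 cos(3 theta), theta_dot = 3 r sin(3 theta);
    dividing by r gives a field that is nontrivial on the circle r = 0.
    """
    return (-3.0 * r * math.cos(3.0 * theta), 3.0 * math.sin(3.0 * theta))

def linearization_at_equilibrium(k):
    """Eigenvalues of the blown-up field at the equilibrium (r, theta) = (0, k*pi/3).

    The Jacobian is diagonal there, so the eigenvalues are the
    radial and angular partial derivatives.
    """
    theta = k * math.pi / 3.0
    radial = -3.0 * math.cos(3.0 * theta)   # d(r')/dr at r = 0
    angular = 9.0 * math.cos(3.0 * theta)   # d(theta')/dtheta
    return radial, angular

def classify():
    """List (k, radial eigenvalue, angular eigenvalue) for the six equilibria."""
    result = []
    for k in range(6):
        radial, angular = linearization_at_equilibrium(k)
        result.append((k, round(radial), round(angular)))
    return result

if __name__ == "__main__":
    for k, radial, angular in classify():
        print(f"theta = {k}*pi/3: radial {radial:+d}, angular {angular:+d}")
```

In this toy case every equilibrium on the circle is hyperbolic. Where the radial eigenvalue is negative (even k), the angular eigenvalue is positive, so only the ray itself flows into the origin; the stable set consists of three rays and has measure zero, even though the Hessian test alone says nothing.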
