Polyak's Heavy Ball Method Achieves Accelerated Local Rate of Convergence under Polyak-Lojasiewicz Inequality

Sebastian Kassing; Simon Weissmann

TL;DR

The paper analyzes Polyak's heavy ball method for nonconvex $C^4$ objectives that satisfy the Polyak-Lojasiewicz (PL) inequality and shows that local acceleration is achievable in both continuous and discrete time. It develops a differential-geometric PL framework that separates the normal and tangential dynamics around the manifold of minimizers, which yields accelerated local rates and optimal parameter choices. In continuous time the rate is governed by $m(\alpha)$; in discrete time it is governed by $m(\gamma,\beta)$, and the optimal parameters achieve $m(\gamma,\beta)=\tfrac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}$. The analysis relies on a normal-bundle chart and a purely geometric argument rather than Lyapunov functions, and it shows that acceleration persists locally even for aggressive momentum choices under which global convergence fails. Numerical experiments corroborate the theoretical predictions and illustrate the trade-off between a faster asymptotic rate and a longer burn-in phase before the iterates enter the local PL regime.
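For orientation, the objects named in the summary can be written in their standard textbook form. This is a sketch under assumed notation, not taken from the paper: the PL constant $\mu$, the minimal value $f^*$, and the reading of $\kappa$ as a condition number $L/\mu$ are assumptions, and the paper's own normalization may differ.

```latex
% PL inequality with constant \mu > 0 (assumed notation)
\|\nabla f(x)\|^2 \;\ge\; 2\mu\,\bigl(f(x) - f^*\bigr)

% Heavy ball in discrete time, step-size \gamma and momentum \beta
x_{k+1} \;=\; x_k - \gamma\,\nabla f(x_k) + \beta\,(x_k - x_{k-1})

% Heavy ball in continuous time, friction parameter \alpha
\ddot{x}_t + \alpha\,\dot{x}_t + \nabla f(x_t) \;=\; 0
```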

Abstract

In this work, we analyze the convergence of Polyak's heavy ball method in both continuous and discrete time for non-convex $C^4$-objective functions satisfying the Polyak-Lojasiewicz inequality. Under this weak assumption, we recover the asymptotic convergence rates originally derived by Polyak in [Polyak, U.S.S.R. Comput. Math. and Math. Phys., 1964] for strongly convex objectives. Our results demonstrate that the heavy ball method exhibits asymptotic local acceleration on this class of functions. In particular, in the discrete time setting, we prove local convergence of the iterates to a minimum once the method enters a sufficiently small neighborhood of the set of minima, for a broad range of hyperparameters, including aggressive choices for the momentum parameter and the step-size for which global convergence is known to fail. Instead of the usually employed Lyapunov-type arguments, our approach leverages a new differential geometric perspective of the Polyak-Lojasiewicz inequality proposed in [Rebjock and Boumal, Math. Program., 2025].
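The accelerated rate can be illustrated on a toy problem. The sketch below is not the paper's experiment: it assumes a degenerate quadratic $f(x)=\tfrac12 x^\top A x$ with $A=\mathrm{diag}(\mu, L, 0)$, which satisfies the PL inequality without being strongly convex, and it uses Polyak's classical parameter choice $\gamma = 4/(\sqrt{L}+\sqrt{\mu})^2$, $\beta = \bigl(\tfrac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}\bigr)^2$ with $\kappa = L/\mu$. The function name `heavy_ball` and all constants are hypothetical.

```python
import numpy as np

def heavy_ball(grad, x0, step, momentum, iters):
    """Run x_{k+1} = x_k - step * grad(x_k) + momentum * (x_k - x_{k-1})."""
    x_prev = x0.copy()
    x = x0.copy()
    history = [x.copy()]
    for _ in range(iters):
        x_next = x - step * grad(x) + momentum * (x - x_prev)
        x_prev, x = x, x_next
        history.append(x.copy())
    return np.array(history)

# Assumed toy objective: f(x) = 0.5 * x^T A x with a zero eigenvalue,
# so the minimizers form the line {x_1 = x_2 = 0} (PL, not strongly convex).
mu, L = 1.0, 100.0
A = np.diag([mu, L, 0.0])
grad = lambda x: A @ x

kappa = L / mu
predicted_rate = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)

# Polyak's classical parameters for a quadratic with spectrum in [mu, L].
step = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
momentum = predicted_rate ** 2

x0 = np.array([1.0, 1.0, 1.0])
iters = 400
hb_traj = heavy_ball(grad, x0, step, momentum, iters)
gd_traj = heavy_ball(grad, x0, 1.0 / L, 0.0, iters)  # plain gradient descent

# Distance to the set of minimizers uses only the two "normal" coordinates.
hb_dist = np.linalg.norm(hb_traj[:, :2], axis=1)
gd_dist = np.linalg.norm(gd_traj[:, :2], axis=1)

print("predicted contraction factor:", predicted_rate)
print("heavy ball distance after", iters, "steps:", hb_dist[-1])
print("gradient descent distance after", iters, "steps:", gd_dist[-1])
```

In this toy setting the third coordinate has zero gradient and never moves, a crude analogue of the tangential direction along the set of minimizers, while the remaining coordinates contract at roughly the predicted factor per step, much faster than gradient descent with step-size $1/L$.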
