High-order Accumulative Regularization for Gradient Minimization in Convex Programming
Yao Ji, Guanghui Lan
TL;DR
This paper addresses a mismatch in high-order convex optimization methods: function values decrease rapidly while gradient norms shrink more slowly. It introduces a unified high-order accumulative regularization (AR) framework that leverages fast function-value progress to accelerate gradient-norm convergence, including parameter-free and inexact variants with restart schemes. The key contributions are a third-order AR method whose gradient-norm rate matches the function-value rate, a general AR framework for structured convex problems that works with various subroutines, extensions to uniformly convex settings with linear, superlinear, or sublinear rates, and parameter-free implementations. Overall, the framework extends fast gradient minimization across smoothness and curvature regimes and recovers first-order results as special cases, for both convex and uniformly convex objectives.
Abstract
This paper develops a unified high-order accumulative regularization (AR) framework for convex and uniformly convex gradient norm minimization. Existing high-order methods often exhibit a gap: the function-value residual decreases fast, while the gradient norm converges much more slowly. To close this gap, we introduce AR, which systematically transforms the fast convergence rate of the function-value residual into a matching fast convergence rate for the gradient norm. Specifically, for composite convex problems, to compute an approximate solution such that the norm of its (sub)gradient does not exceed $\varepsilon$, the proposed AR methods match the best corresponding convergence rate for the function-value residual. We further extend the framework to uniformly convex settings, establishing linear, superlinear, and sublinear convergence of the gradient norm under different lower curvature conditions. Moreover, we design parameter-free algorithms that require no input of problem parameters, e.g., the Lipschitz constant of the $p$-th-order gradient, the initial optimality gap, and the uniform convexity parameter, and that allow an inexact solution for each high-order step. To the best of our knowledge, no existing parameter-free method attains a gradient norm convergence rate that matches that of the function-value residual in the convex case, and no such parameter-free methods exist for uniformly convex problems. These results substantially generalize existing parameter-free and inexact high-order methods and recover first-order algorithms as special cases, providing a unified approach for fast gradient minimization across a broad range of smoothness and curvature regimes.
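For orientation, the setting described in the abstract can be written out in standard notation. This is a sketch only: the symbols $\psi$, $f$, $h$, $L_p$, $\sigma$, $q$, $\bar{x}$, and $v$ are assumed here for illustration and may differ from the notation and normalizations used in the paper itself.

```latex
% Composite convex problem (f smooth and convex, h convex and possibly nonsmooth):
\[
  \min_{x} \; \psi(x) := f(x) + h(x).
\]
% High-order smoothness: the p-th derivative of f is Lipschitz with constant L_p.
\[
  \| D^p f(x) - D^p f(y) \| \le L_p \, \| x - y \| \quad \text{for all } x, y.
\]
% Gradient-minimization goal: find a point with a small (sub)gradient.
\[
  \text{find } \bar{x} \text{ and } v \in \partial \psi(\bar{x})
  \text{ such that } \| v \| \le \varepsilon.
\]
% Uniform convexity of degree q >= 2 with parameter sigma > 0
% (one common normalization; q = 2 gives strong convexity):
\[
  \psi(y) \ge \psi(x) + \langle v, \, y - x \rangle + \frac{\sigma}{q} \, \| y - x \|^{q}
  \quad \text{for all } x, y \text{ and } v \in \partial \psi(x).
\]
```

In this notation, a parameter-free method in the sense of the abstract would need none of $L_p$, $\sigma$, or the initial gap $\psi(x_0) - \min_x \psi(x)$ as input.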
