High-order Accumulative Regularization for Gradient Minimization in Convex Programming

Yao Ji, Guanghui Lan

TL;DR

This paper addresses the mismatch between rapid decreases in function value and slower reductions in gradient norms for high-order convex optimization methods. It introduces a unified high-order accumulative regularization (AR) framework that leverages fast function-value progress to accelerate gradient-norm convergence, including parameter-free and inexact variants with restart schemes. The key contributions include a third-order AR method that matches the function-value rate for gradient minimization, a general AR framework for structured convex problems with various subroutines, and extensions to uniformly convex settings with linear, superlinear, or sublinear rates, plus parameter-free implementations. Overall, the framework broadens fast gradient minimization across smoothness and curvature regimes and recovers first-order results as special cases, offering practical, robust optimization tools for convex and uniformly convex objectives.

Abstract

This paper develops a unified high-order accumulative regularization (AR) framework for convex and uniformly convex gradient norm minimization. Existing high-order methods often exhibit a gap: the function-value residual decreases fast, while the gradient norm converges much more slowly. To close this gap, we introduce AR, which systematically transforms the fast convergence rate of the function-value residual into a matching fast convergence rate for the gradient norm. Specifically, for composite convex problems, to compute an approximate solution such that the norm of its (sub)gradient does not exceed $\varepsilon$, the proposed AR methods match the best corresponding convergence rate for the function-value residual. We further extend the framework to uniformly convex settings, establishing linear, superlinear, and sublinear convergence of the gradient norm under different lower curvature conditions. Moreover, we design parameter-free algorithms that require no input of problem parameters, e.g., the Lipschitz constant of the $p$-th-order gradient, the initial optimality gap, and the uniform convexity parameter, and that allow an inexact solution for each high-order step. To the best of our knowledge, no existing parameter-free method attains a gradient norm convergence rate that matches that of the function-value residual in the convex case, and no such parameter-free methods exist for uniformly convex problems. These results substantially generalize existing parameter-free and inexact high-order methods and recover first-order algorithms as special cases, providing a unified approach for fast gradient minimization across a broad range of smoothness and curvature regimes.
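
The gap described in the abstract can be illustrated in the simplest first-order setting. The derivation below is a sketch under assumptions that are not taken from the paper: $f$ is convex and differentiable on $\mathbb{R}^n$, its gradient is $L$-Lipschitz, a minimizer $x^*$ exists, and some method guarantees $f(x_k) - f(x^*) \le C/k^2$ for a constant $C$ (the symbols $L$, $C$, and $k$ are introduced here only for illustration).

```latex
% One gradient step from x cannot go below the optimal value (descent lemma):
f(x^*) \;\le\; f\!\left(x - \tfrac{1}{L}\nabla f(x)\right)
       \;\le\; f(x) - \tfrac{1}{2L}\,\|\nabla f(x)\|^2 .

% Rearranging bounds the gradient norm by the function-value residual:
\|\nabla f(x)\|^2 \;\le\; 2L\,\bigl(f(x) - f(x^*)\bigr).

% Plugging in the assumed residual rate f(x_k) - f(x^*) \le C / k^2:
\|\nabla f(x_k)\| \;\le\; \sqrt{2LC}\,\cdot\,\frac{1}{k}.
```

Under these assumptions, converting a residual bound into a gradient bound in this direct way halves the exponent: an $O(1/k^2)$ residual yields only an $O(1/k)$ gradient norm. This is the kind of mismatch that the AR framework is designed to remove, so that the gradient norm rate matches the residual rate.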

Paper Structure

This paper contains 12 sections, 15 theorems, 108 equations, 7 algorithms.

Key Result

Lemma 1

Let the sequence $\{x_k\}_{k=1}^{\infty}$ be generated by alg:inner-loop with the parameters $M = 2L_3(f)$, $C = 12L_3(f)/(\sqrt{2}-1)^2$, $a_k = (k+1)(k+2)/2$, and $A_1 = 1$. Then, for any $k \geq 1$, we have where $x^*$ is an optimal solution to the problem eqn:main.

Theorems & Definitions (27)

  • Lemma 1
  • Proposition 2
  • Proof 1
  • Lemma 1
  • Lemma 2
  • Proof 2
  • Lemma 3
  • Proof 3
  • Theorem 4
  • Proof 4
  • ...and 17 more