Table of Contents
Fetching ...

Formalization of Complexity Analysis of the First-order Algorithms for Convex Optimization

Chenyi Li, Ziyu Wang, Wanyi He, Yuxuan Wu, Shengyang Xu, Zaiwen Wen

TL;DR

This work formalizes the convergence analysis of first-order convex optimization methods within Lean4 and mathlib4, introducing formal definitions for the gradient $\nabla f$, subgradient $\partial f$, and proximal operator $\operatorname{prox}$ on Hilbert spaces. It develops Lean implementations for the gradient, subgradient, and proximal gradient methods and proves their convergence rates, including the $O\big(\frac{1}{k}\big)$ rate for convex gradient descent and the $O\big(\frac{1}{k^2}\big)$ rate for Nesterov acceleration, with explicit results for subgradient and proximal variants and a concrete Lasso application. The authors extend convex analysis with first-order conditions, monotonicity, and Moreau-Rockafellar-type frameworks and adopt a modular class-based approach to enable instantiation to concrete problems, illustrating the pathway toward broader numerical-optimization formalization in verification and technical computing. This work provides a blueprint for extending formalization to algorithms such as ADMM and BCD, bridging formal methods with practical convex optimization and sparse recovery problems like Lasso.

Abstract

The convergence rate of various first-order optimization algorithms is a pivotal concern within the numerical optimization community, as it directly reflects the efficiency of these algorithms across different optimization problems. Our goal is making a significant step forward in the formal mathematical representation of optimization techniques using the Lean4 theorem prover. We first formalize the gradient for smooth functions and the subgradient for convex functions on a Hilbert space, laying the groundwork for the accurate formalization of algorithmic structures. Then, we extend our contribution by proving several properties of differentiable convex functions that have not yet been formalized in Mathlib. Finally, a comprehensive formalization of these algorithms is presented. These developments are not only noteworthy on their own but also serve as essential precursors to the formalization of a broader spectrum of numerical algorithms and their applications in machine learning as well as many other areas.

Formalization of Complexity Analysis of the First-order Algorithms for Convex Optimization

TL;DR

This work formalizes the convergence analysis of first-order convex optimization methods within Lean4 and mathlib4, introducing formal definitions for the gradient , subgradient , and proximal operator on Hilbert spaces. It develops Lean implementations for the gradient, subgradient, and proximal gradient methods and proves their convergence rates, including the rate for convex gradient descent and the rate for Nesterov acceleration, with explicit results for subgradient and proximal variants and a concrete Lasso application. The authors extend convex analysis with first-order conditions, monotonicity, and Moreau-Rockafellar-type frameworks and adopt a modular class-based approach to enable instantiation to concrete problems, illustrating the pathway toward broader numerical-optimization formalization in verification and technical computing. This work provides a blueprint for extending formalization to algorithms such as ADMM and BCD, bridging formal methods with practical convex optimization and sparse recovery problems like Lasso.

Abstract

The convergence rate of various first-order optimization algorithms is a pivotal concern within the numerical optimization community, as it directly reflects the efficiency of these algorithms across different optimization problems. Our goal is making a significant step forward in the formal mathematical representation of optimization techniques using the Lean4 theorem prover. We first formalize the gradient for smooth functions and the subgradient for convex functions on a Hilbert space, laying the groundwork for the accurate formalization of algorithmic structures. Then, we extend our contribution by proving several properties of differentiable convex functions that have not yet been formalized in Mathlib. Finally, a comprehensive formalization of these algorithms is presented. These developments are not only noteworthy on their own but also serve as essential precursors to the formalization of a broader spectrum of numerical algorithms and their applications in machine learning as well as many other areas.
Paper Structure (22 sections, 13 theorems, 16 equations)

This paper contains 22 sections, 13 theorems, 16 equations.

Key Result

Theorem 1

Assume $f_1$ and $f_2$ are two convex functions define on $E$, then we have for any $x \in E$

Theorems & Definitions (17)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Lemma 1
  • Theorem 4
  • Lemma 2
  • ...and 7 more