Formalization of Algorithms for Optimization with Block Structures

Chenyi Li, Zichen Wang, Yifan Bai, Yunxi Duan, Yuqing Gao, Pengfei Hao, Zaiwen Wen

TL;DR

The paper presents a Lean4-based formalization of convergence analyses for block-structured optimization algorithms, specifically block coordinate descent (BCD) and the alternating direction method of multipliers (ADMM). It builds a rigorous framework for nonsmooth and nonconvex optimization, formalizing subdifferentials and the KL property, and encodes the update schemes and convergence proofs for both algorithms. Key contributions include a provably coherent formalization of the subdifferential calculus, KL-based convergence, and structured Lean representations of proximal and augmented-Lagrangian updates. The work enables machine-verified convergence analysis of a broad class of block-structured optimization methods and lays a foundation for extending the approach to more general algorithms and problem settings.

Abstract

Block-structured problems are central to advances in numerical optimization and machine learning. This paper provides the formalization of convergence analysis for two pivotal algorithms in such settings: the block coordinate descent (BCD) method and the alternating direction method of multipliers (ADMM). Utilizing the type-theory-based proof assistant Lean4, we develop a rigorous framework to formally represent these algorithms. Essential concepts in nonsmooth and nonconvex optimization are formalized, notably subdifferentials, which extend classical differentiability to handle nonsmooth scenarios, and the Kurdyka-Łojasiewicz (KL) property, which provides essential tools to analyze convergence in nonconvex settings. Such definitions and properties are crucial for the corresponding convergence analyses. We formalize the convergence proofs of these algorithms, demonstrating that our definitions and structures are coherent and robust. These formalizations lay a basis for analyzing the convergence of more general optimization algorithms.

Paper Structure

This paper contains 12 sections, 16 theorems, and 23 equations.

Key Result

Theorem 3

For a given $x \in \operatorname{dom} f$, $u \in \hat{\partial} f(x)$ if and only if for any $\varepsilon > 0$ there is a neighborhood of $x$ such that for every $y$ in that neighborhood, it holds that
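
For reference, the standard characterization of the Fréchet subdifferential in variational analysis takes the following form. This is a sketch under assumed notation (the inner product $\langle \cdot, \cdot \rangle$, the norm $\|\cdot\|$, and the radius $\delta$ are not taken from this page), and the paper's exact statement may be phrased differently:

```latex
% Fréchet subdifferential at x in dom f, liminf form (standard definition)
\hat{\partial} f(x) = \left\{ u \;:\;
  \liminf_{y \to x,\; y \neq x}
  \frac{f(y) - f(x) - \langle u, y - x \rangle}{\|y - x\|} \ge 0 \right\}

% Equivalent epsilon-neighborhood form:
% u \in \hat{\partial} f(x) if and only if
\forall \varepsilon > 0,\ \exists \delta > 0 \ \text{such that}\
\|y - x\| < \delta \implies
f(y) - f(x) - \langle u, y - x \rangle \ge -\varepsilon \, \|y - x\|
```

The second form follows from unfolding the liminf: the quotient is at least $-\varepsilon$ for all $y \neq x$ close enough to $x$, and the inequality holds trivially at $y = x$. A statement quantified over neighborhoods in this way is usually convenient to express with neighborhood filters in a proof assistant.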

Theorems & Definitions (21)

  • Definition 1
  • Definition 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • Theorem 8
  • Theorem 9
  • Definition 10
  • ...and 11 more