H-invariance theory: A complete characterization of minimax optimal fixed-point algorithms

TaeHo Yoon, Ernest K. Ryu, Benjamin Grimmer

TL;DR

This paper provides a complete algebraic characterization of minimax optimal fixed-point algorithms for nonexpansive operators via H-invariants and H-certificates. It proves that the Optimal Halpern Method (OHM) achieves the minimax rate $\|y_{N-1}-T y_{N-1}\|^2 \le \frac{4}{N^2}\|y_0-y_\star\|^2$ and that all exact minimax optimal H-matrix algorithms lie in the intersection of level sets of the invariants $P(N-1,m;H)$ with nonnegative certificates $\lambda_{k,j}^\star(H)$, yielding a complete set of optimal methods. The main contributions include explicit formulas for $\lambda^\star(H)$ in terms of H-entries and $Q$-functions, a proof of OHM’s uniqueness as the only anytime optimal algorithm, and constructive pathways to new optimal algorithms with sparse H-certificates, including self-dual and non-dual examples. The theory extends acceleration concepts in first-order optimization by recasting optimality in terms of invariants and linear-algebraic certificates, and it suggests avenues to generalize to other problem classes such as minimax and nonsmooth optimization.

Abstract

For nonexpansive fixed-point problems, Halpern's method with optimal parameters, its so-called H-dual algorithm, and in fact, an infinite family of algorithms containing them, all exhibit the exactly minimax optimal convergence rates. In this work, we provide a characterization of the complete, exhaustive family of distinct algorithms using predetermined step-sizes, represented as lower triangular H-matrices, which attain the same optimal convergence rate. The characterization is based on polynomials in the entries of the H-matrix that we call H-invariants, whose values stay constant over all optimal H-matrices, together with H-certificates, of which nonnegativity precisely specifies the region of optimality within the common level set of H-invariants. The H-invariance theory we present offers a novel view of optimal acceleration in first-order optimization as a mathematical study of carefully selected invariants, certificates, and structures induced by them.
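The rate quoted in the TL;DR can be checked numerically on a toy problem. The sketch below is illustrative only and is not taken from the paper: it assumes the commonly stated form of the Optimal Halpern Method, $y_{k+1} = \frac{1}{k+2} y_0 + \frac{k+1}{k+2} T y_k$, and uses a planar rotation (encoded as multiplication by a unit complex number) as the nonexpansive operator, whose fixed point is $y_\star = 0$. The helper names `ohm`, `rotation`, and `check_rate` are hypothetical.

```python
import cmath


def ohm(T, y0, num_iters):
    """Halpern iteration with anchoring weights 1/(k+2) (assumed OHM form).

    Returns the list [y_0, y_1, ..., y_{num_iters}].
    """
    y = y0
    iterates = [y0]
    for k in range(num_iters):
        beta = 1.0 / (k + 2)                   # weight on the anchor y_0
        y = beta * y0 + (1.0 - beta) * T(y)    # one evaluation of T per step
        iterates.append(y)
    return iterates


def rotation(theta):
    """Rotation of the plane by angle theta: an isometry, hence nonexpansive.

    Its fixed point is the origin (for theta not a multiple of 2*pi).
    """
    return lambda z: cmath.exp(1j * theta) * z


def check_rate(theta, N, y0=1.0 + 0.0j):
    """Compare ||y_{N-1} - T y_{N-1}||^2 with (4 / N^2) * ||y_0 - y_star||^2."""
    T = rotation(theta)
    y_last = ohm(T, y0, N - 1)[-1]             # N-1 iterations give y_{N-1}
    residual_sq = abs(y_last - T(y_last)) ** 2
    bound = 4.0 / N**2 * abs(y0) ** 2          # y_star = 0 for a rotation
    return residual_sq, bound


if __name__ == "__main__":
    for N in (2, 5, 20, 100):
        residual_sq, bound = check_rate(theta=2.5, N=N)
        print(f"N={N:3d}  residual^2={residual_sq:.3e}  bound={bound:.3e}")
```

A single rotation is not a worst-case instance, so the observed residual may sit well below the bound; the check only illustrates that the bound holds on this example.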

Paper Structure

This paper contains 40 sections, 30 theorems, 224 equations, 1 figure.

Key Result

Theorem 1

Let $d\ge 2(N-1)$, $y_0 \in \mathbb{R}^d$ and $R > 0$. For any deterministic first-order algorithm using $N-1$ operator evaluations, there exists a nonexpansive operator $T\colon \mathbb{R}^d \to \mathbb{R}^d$ with a fixed point $y_\star$ such that $\|y_0 - y_\star\| = R$, and

Figures (1)

  • Figure 1: The optimal H-matrix set $\mathcal{H}_\star$ for $N=4$, parametrized by $(h_{1,1}, h_{2,1}, h_{2,2})$, has six vertices with sparse $\lambda^\star$.

Theorems & Definitions (50)

  • Theorem 1: ParkRyu2022_exact
  • Theorem 2: Kim2021_accelerated, Lieder2021_convergence
  • Theorem 3
  • Definition 1
  • Definition 2
  • Theorem 4
  • proof
  • Lemma 1
  • proof
  • Theorem 5
  • ...and 40 more