Algebraic characterization of equivalence between optimization algorithms

Laurent Lessard; Madeleine Udell

TL;DR

This paper presents a framework for deciding when iterative optimization algorithms are equivalent, modeling each algorithm as a linear dynamical system in feedback with nonlinear oracles. It introduces three notions of increasing generality (oracle equivalence, shift equivalence, and linear fractional transformation (LFT) equivalence), each characterized through transfer-function analysis. The framework unifies and explains known equivalences among classical methods (e.g., gradient methods, ADMM, Douglas–Rachford, Chambolle–Pock, proximal methods) and extends to accelerated, distributed, and operator-splitting schemes, including cases where the oracles are related via Moreau identities. It also gives practical procedures for converting algorithms to transfer functions and checking equivalence (including via multi-shifts and LFTs), providing a toolkit for algorithm design, comparison, and discovery, with potential implications for robust convergence analysis and implementation choices.

Abstract

When are two algorithms the same? How can we be sure a recently proposed algorithm is novel, and not a minor twist on an existing method? In this paper, we present a framework for reasoning about equivalence between a broad class of iterative algorithms, with a focus on algorithms designed for convex optimization. We propose several notions of what it means for two algorithms to be equivalent, and provide computationally tractable means to detect equivalence. Our main definition, oracle equivalence, states that two algorithms are equivalent if they result in the same sequence of calls to the function oracles (for suitable initialization). Borrowing from control theory, we use state-space realizations to represent algorithms and characterize algorithm equivalence via transfer functions. Our framework can also identify and characterize equivalence between algorithms that use different oracles that are related via a linear fractional transformation. Prominent examples include linear transformations and function conjugation.
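To make the transfer-function idea concrete, here is a minimal numerical sketch, not the authors' procedure. It assumes the standard control-theoretic definition $\hat{H}(z) = C(zI - A)^{-1}B + D$ for a realization $(A, B, C, D)$ whose input is the oracle output and whose output is the oracle query. The heavy-ball example, the two ways of writing it, the sign convention on the step size, and the helper names `transfer_function` and `same_transfer_function` are all illustrative assumptions.

```python
import numpy as np


def transfer_function(A, B, C, D, z):
    """Evaluate H(z) = C (zI - A)^{-1} B + D at a complex point z."""
    n = A.shape[0]
    return C @ np.linalg.solve(z * np.eye(n) - A, B) + D


def same_transfer_function(sys1, sys2, tol=1e-9):
    """Numerically compare two realizations (A, B, C, D).

    The two rational functions are sampled on a circle that encloses every
    eigenvalue of both A matrices (so no sample hits a pole). Agreement at
    more points than the sum of the state dimensions indicates, up to
    round-off, that the two transfer functions coincide.
    """
    A1, A2 = sys1[0], sys2[0]
    radius = 1.0 + max(np.abs(np.linalg.eigvals(A1)).max(),
                       np.abs(np.linalg.eigvals(A2)).max())
    m = A1.shape[0] + A2.shape[0] + 1
    points = radius * np.exp(2j * np.pi * (np.arange(m) + 0.5) / m)
    return all(
        np.allclose(transfer_function(*sys1, z),
                    transfer_function(*sys2, z), atol=tol)
        for z in points
    )


alpha, beta = 0.1, 0.5  # hypothetical step size and momentum

# Heavy ball with state (x^k, x^{k-1}):
#   x^{k+1} = x^k - alpha*u^k + beta*(x^k - x^{k-1}),  query y^k = x^k
heavy_ball_a = (
    np.array([[1.0 + beta, -beta], [1.0, 0.0]]),
    np.array([[-alpha], [0.0]]),
    np.array([[1.0, 0.0]]),
    np.array([[0.0]]),
)

# The same method with state (x^k, v^k), where v^k = x^k - x^{k-1}:
#   v^{k+1} = beta*v^k - alpha*u^k,  x^{k+1} = x^k + v^{k+1}
heavy_ball_b = (
    np.array([[1.0, beta], [0.0, beta]]),
    np.array([[-alpha], [-alpha]]),
    np.array([[1.0, 0.0]]),
    np.array([[0.0]]),
)

# Plain gradient descent: x^{k+1} = x^k - alpha*u^k,  query y^k = x^k
gradient_descent = (
    np.array([[1.0]]),
    np.array([[-alpha]]),
    np.array([[1.0]]),
    np.array([[0.0]]),
)

print(same_transfer_function(heavy_ball_a, heavy_ball_b))
print(same_transfer_function(heavy_ball_a, gradient_descent))
```

The two heavy-ball realizations differ only by a change of state coordinates, so the check reports matching transfer functions, while gradient descent with the same step size does not match. Sampling is a numerical shortcut; a symbolic comparison of the rational functions would be the exact alternative.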

Paper Structure (52 sections, 14 theorems, 81 equations, 7 figures, 28 algorithms)

Introduction
Related work
Motivating examples
Preliminaries
Oracle-based iterative algorithms
State-space form
Oracle normalization
Example: Reflected Gradient Method
Example: simplified ADMM
Explicit and implicit implementations
Algorithm representation
From transfer functions to algorithms
Algorithm equivalence
Oracle equivalence
Invariance under linear state transformations
...and 37 more sections

Key Result

Proposition 1

Suppose system $i\in\{1,2\}$ has state-space realization $(A_i,B_i,C_i,D_i)$, initial state $x_i^0$, and associated transfer function $\hat{H}_i$. The following are equivalent.

Figures (7)

Figure 1: Block diagram representation of a generic iterative algorithm.
Figure 2: Block diagram representation of a generic optimization algorithm expressed in terms of its transfer function $\hat{H}$ and the $z$-transforms of its inputs and outputs.
Figure 3: Block diagrams representing the algorithms labeled algo5 (left) and algo6 (right). These algorithms are shift equivalent because, when suitably initialized, they make the same calls to the oracles, albeit with a time shift. The updates are exactly the same for both algorithms, but expressed in transformed variables.
Figure 4: Commutative diagram visualizing that the shift (delay) operation commutes with the application of the oracle $\phi$. The foreground shows the $z$-transformed versions of the signals, where the shift becomes multiplication by a power of $z^{-1}$.
Figure 5: Equivalent block diagram representing shift equivalence. We use the fact that the oracle $\Phi$ commutes with any multi-shift $\hat{\Delta}_m$. However, $\hat{\Delta}_m$ need not commute with $\hat{H}$, which means equivalent algorithms can have different transfer functions.
...and 2 more figures

Theorems & Definitions (34)

Remark 1
Definition 1
Proposition 1
Proposition 2
Definition 2: oracle equivalence
Definition 3: multi-shift
Definition 4: shift equivalence
Lemma 3
proof
Remark 2
...and 24 more
