UAdam: Unified Adam-Type Algorithmic Framework for Non-Convex Stochastic Optimization

Yiming Jiang; Jinlan Liu; Dongpo Xu; Danilo P. Mandic

TL;DR

UAdam is a unified Adam-type framework for non-convex stochastic optimization that combines adaptive learning rates with stochastic momentum updates. Under mild assumptions, it is shown to converge to a neighborhood of stationary points at rate $\mathcal{O}(1/T)$. The analysis places no restriction on the second-order momentum factor $\beta_2$ and requires only that the first-order momentum factor $\beta_1$ be close to 1. The interpolation parameter $\lambda$ lets the convergence guarantees transfer directly to both Adam-type and NAdam-type algorithms, so that Adam, AMSGrad, AdaBound, AdaFom, Adan, and NAdam are covered by a single analysis. The framework offers a principled basis for understanding and extending Adam-type methods in practice, with implications for stable convergence in deep learning applications.

Abstract

Adam-type algorithms have become a preferred choice for optimisation in the deep learning setting; however, despite their success, their convergence is still not well understood. To this end, we introduce a unified framework for Adam-type algorithms (called UAdam). The framework is equipped with a general form of the second-order moment, which makes it possible to include Adam and its variants, such as NAdam, AMSGrad, AdaBound, AdaFom, and Adan, as special cases. This is supported by a rigorous convergence analysis of UAdam in the non-convex stochastic setting, showing that UAdam converges to a neighborhood of stationary points at the rate of $\mathcal{O}(1/T)$. Furthermore, the size of the neighborhood decreases as $\beta$ increases. Importantly, our analysis only requires the first-order momentum factor to be close enough to 1, without any restrictions on the second-order momentum factor. Theoretical results also show that vanilla Adam can converge with appropriately selected hyperparameters, which provides a theoretical guarantee for the analysis, applications, and further developments of the whole class of Adam-type algorithms.
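
To make the roles of the first-order and second-order momentum factors concrete, the following is a minimal sketch of the standard (vanilla) Adam update with bias correction. It is not the paper's UAdam algorithm, whose general second-order moment is not reproduced here. The function names `adam_step` and `run_adam`, the default hyperparameters, and the quadratic test objective are illustrative assumptions.

```python
import numpy as np


def adam_step(x, m, v, grad, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of vanilla Adam (illustrative sketch); t counts from 1."""
    # First-order moment: exponential moving average of gradients,
    # governed by the first-order momentum factor beta1.
    m = beta1 * m + (1.0 - beta1) * grad
    # Second-order moment: exponential moving average of squared gradients,
    # governed by the second-order momentum factor beta2.
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    # Bias correction compensates for the zero initialisation of m and v.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # Adaptive step: the learning rate is scaled coordinate-wise.
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x, m, v


def run_adam(grad_fn, x0, steps=2000, **kwargs):
    """Run adam_step for a fixed number of iterations (hypothetical helper)."""
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    for t in range(1, steps + 1):
        x, m, v = adam_step(x, m, v, grad_fn(x), t, **kwargs)
    return x


if __name__ == "__main__":
    # Assumed toy objective f(x) = 0.5 * ||x||^2, whose gradient is x.
    x_final = run_adam(lambda x: x, [1.0, -2.0], steps=2000, lr=0.01)
    print(np.linalg.norm(x_final))
```

In this sketch, `beta1` is the factor the analysis requires to be close to 1, while `beta2` is the factor the analysis leaves unrestricted.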

Paper Structure (14 sections, 8 theorems, 65 equations, 2 tables, 1 algorithm)

Introduction
Related work
Contributions
Preliminaries
Unified adaptive stochastic momentum algorithms
Stochastic unified momentum algorithms
Adaptive learning rate
UAdam: Unified adaptive stochastic momentum algorithm
Main results
Technical lemmas
Convergence analysis of UAdam for non-convex optimization
Conclusion
Equivalence form of SNAG
Equivalence relationship of SUM

Key Result

Lemma 4.1

Consider a stochastic exponential moving average sequence, $m_t=\beta_t m_{t-1}+\left(1-\beta_t\right)g_t$, where $0\leq\beta_t<1$. Suppose that the smoothness assumption (ass:smooth) and the unbiased-gradient assumption (ass:unbiased) hold. Then,

Theorems & Definitions (24)

Remark 2.1
Remark 3.1
Remark 3.2
Remark 3.3
Remark 3.4
Remark 3.5
Lemma 4.1
proof
Remark 4.1
Lemma 4.2
...and 14 more
