Generalized Exponentiated Gradient Algorithms Using the Euler Two-Parameter Logarithm
Andrzej Cichocki
TL;DR
This work addresses the rigidity of traditional exponentiated gradient methods by introducing a flexible Generalized Exponentiated Gradient (GEG) framework that uses Mirror Descent with the two-parameter Euler $(a,b)$-logarithm as the link function. The paper derives both unnormalized and normalized updates and extends them to bipolar weights. Because the updates are built from deformed logarithms and exponentials, the hyperparameters $(a,b)$, together with the learning rate, let the method adapt to a wide range of data geometries. The approach connects to a broad family of entropies (Amari, Tsallis, Abe, KLS) and provides practical update rules that reduce to standard EG in special cases, with potential benefits for sparsity, robustness, and nonnegative or simplex-constrained optimization. The paper lays out the theoretical building blocks and outlines future work on convergence analysis and hyperparameter optimization for real-world AI applications.
Abstract
In this paper we propose and investigate a new class of Generalized Exponentiated Gradient (GEG) algorithms using Mirror Descent (MD) updates, applying the Bregman divergence with a two-parameter deformation of the logarithm as a link function. This link function (referred to here as the Euler logarithm) is associated with a relatively wide class of trace-form entropies. To derive novel GEG/MD updates, we estimate a deformed exponential function that closely approximates the inverse of the Euler two-parameter deformed logarithm. The characteristic shape and properties of the Euler logarithm and its inverse, the deformed exponential function, are tuned by two hyperparameters. By learning these hyperparameters, we can adapt to the distribution of the training data and adjust them to achieve desired properties of gradient descent algorithms. More than fifty mathematically well-established entropic functionals and associated deformed logarithms now exist in the literature, so it is impossible to investigate all of them in one research paper. Therefore, we focus here on a class of trace-form entropies and the associated two-parameter deformed logarithms.
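The mirror-descent structure described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the text above: the Euler logarithm is taken in the form $\log_{a,b}(x) = (x^a - x^b)/(a-b)$ for $x > 0$; the hyperparameters are restricted to opposite signs (or $a = b = 0$), so that the logarithm is strictly increasing and invertible on the whole real line; the inverse is computed numerically by bisection, standing in for the closed-form approximation of the deformed exponential that the paper derives; and the normalized variant simply rescales the weights to unit sum. The function names `euler_log`, `euler_exp`, and `geg_step` are hypothetical.

```python
import math

def euler_log(x, a, b):
    """Assumed Euler (a,b)-logarithm: (x**a - x**b) / (a - b) for x > 0."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    if a == b:
        # Limit a -> b; for a = b = 0 this is the natural logarithm.
        return (x ** a) * math.log(x)
    return (x ** a - x ** b) / (a - b)

def euler_exp(y, a, b, iters=100):
    """Numerical inverse of euler_log by bisection on t = ln(x).

    This sketch only handles a*b < 0 or a = b = 0, where euler_log is
    strictly increasing and maps (0, inf) onto the whole real line.
    """
    if not (a * b < 0.0 or (a == 0.0 and b == 0.0)):
        raise ValueError("sketch only handles a*b < 0 or a = b = 0")

    def f(t):
        return euler_log(math.exp(t), a, b)

    # Expand the bracket until f(lo) <= y <= f(hi).
    lo, hi = -1.0, 1.0
    while f(lo) > y:
        lo *= 2.0
    while f(hi) < y:
        hi *= 2.0
    # Plain bisection with a fixed iteration budget.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

def geg_step(w, grad, eta, a, b, normalize=True):
    """One mirror-descent step: w <- exp_ab(log_ab(w) - eta * grad)."""
    new_w = [euler_exp(euler_log(wi, a, b) - eta * gi, a, b)
             for wi, gi in zip(w, grad)]
    if normalize:
        # Assumed normalization: rescale positive weights to sum to one.
        total = sum(new_w)
        new_w = [wi / total for wi in new_w]
    return new_w

# Example: one normalized step on two weights with a = 0.5, b = -0.5.
weights = geg_step([0.5, 0.5], [1.0, -1.0], eta=0.1, a=0.5, b=-0.5)
```

With $a = b = 0$ the link function in this sketch is the natural logarithm, so the unnormalized step becomes the multiplicative update $w_i \exp(-\eta g_i)$ of standard EG, consistent with the special-case reduction mentioned in the TL;DR.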
