Generalized Exponentiated Gradient Algorithms Using the Euler Two-Parameter Logarithm

Andrzej Cichocki

TL;DR

This work addresses the rigidity of traditional exponentiated gradient methods by introducing a flexible Generalized Exponentiated Gradient (GEG) framework that uses Mirror Descent with the two-parameter Euler $(a,b)$-logarithm as the link function. By deriving both unnormalized and normalized updates, and extending them to bipolar weights, the method adapts to a wide range of data geometries through the hyperparameters $(a,b)$ and a learning rate, using deformed logarithms and exponentials. The approach connects to a broad family of entropies (Amari, Tsallis, Abe, KLS) and provides practical update rules that reduce to standard EG in special cases, with potential benefits for sparsity, robustness, and nonnegative or simplex-constrained optimization. The paper lays out the theoretical building blocks and outlines future work on convergence analysis and hyperparameter optimization for real-world AI applications.

Abstract

In this paper we propose and investigate a new class of Generalized Exponentiated Gradient (GEG) algorithms using Mirror Descent (MD) updates, applying the Bregman divergence with a two-parameter deformation of the logarithm as a link function. This link function (referred to here as the Euler logarithm) is associated with a relatively wide class of trace-form entropies. To derive novel GEG/MD updates, we estimate a deformed exponential function that closely approximates the inverse of the Euler two-parameter deformed logarithm. The characteristic shape and properties of the Euler logarithm and of its inverse, the deformed exponential function, are tuned by two hyperparameters. By learning these hyperparameters, we can adapt to the distribution of training data and adjust them to achieve desired properties of gradient descent algorithms. The literature now contains more than fifty mathematically well-established entropic functionals and associated deformed logarithms, so it is impossible to investigate all of them in one research paper. Therefore, we focus here on a class of trace-form entropies and the associated two-parameter deformed logarithms.
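
The mirror-descent step described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's derived update: the abstract does not give the formula, so the code assumes the common two-parameter form $\log_{a,b}(x) = (x^a - x^b)/(a-b)$ with $a < 0 < b$, inverts it numerically by bisection instead of using the paper's approximate deformed exponential, and normalizes by simple division by the sum. The function names `euler_log`, `euler_exp`, and `geg_step` are hypothetical.

```python
import math

def euler_log(x, a, b):
    # Assumed form of the Euler (a,b)-logarithm: (x^a - x^b) / (a - b),
    # defined for x > 0 and a != b. It tends to ln(x) as a, b -> 0.
    return (x**a - x**b) / (a - b)

def euler_exp(y, a, b, iters=200):
    # Numerical inverse of euler_log via bisection in t = ln(x).
    # Assumes a < 0 < b, where the logarithm is strictly increasing and
    # maps (0, inf) onto the whole real line. Intended for moderate |y|
    # only (very large |y| would overflow math.exp).
    f = lambda t: (math.exp(a * t) - math.exp(b * t)) / (a - b)
    lo, hi = -1.0, 1.0
    while f(lo) > y:      # widen the bracket to the left
        lo *= 2.0
    while f(hi) < y:      # widen the bracket to the right
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

def geg_step(w, grad, eta, a, b, normalize=True):
    # One mirror-descent step with the deformed logarithm as link function:
    #   w_i <- exp_{a,b}( log_{a,b}(w_i) - eta * grad_i )
    # followed (optionally) by projection onto the simplex via division
    # by the sum. The normalization choice is an assumption of this sketch.
    u = [euler_exp(euler_log(wi, a, b) - eta * gi, a, b)
         for wi, gi in zip(w, grad)]
    if normalize:
        s = sum(u)
        u = [ui / s for ui in u]
    return u

# Example: positive weights on the simplex, one update step.
w_new = geg_step([0.25, 0.25, 0.5], [1.0, 0.0, -1.0], eta=0.1, a=-0.3, b=0.5)
```

Under the assumed form, letting $a$ and $b$ both approach zero recovers the natural logarithm, so the step reduces to the standard multiplicative EG update $w_i \leftarrow w_i \exp(-\eta g_i)$ before normalization.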

Paper Structure

This paper contains 9 sections, 49 equations, 3 figures.

Figures (3)

  • Figure 1: Surface plots of the Euler $(a,b)$-logarithm for various values of hyperparameters $a$ and $b$. These figures illustrate the $(a,b)$-logarithm in terms of $b$ and $x$ for fixed $a=-0.3$ and $a=-1.1$.
  • Figure 2: Surface plots of the Kaniadakis-Scarfone $(\kappa,\lambda)$-logarithm for various values of hyperparameters $\kappa$ and $\lambda$. These figures illustrate the $(\kappa,\lambda)$-logarithm in terms of $\kappa$ and $x$ for fixed $\lambda=0.7$. The black continuous line represents the reference of the standard natural logarithm, which is obtained for $a=b=0$.
  • Figure 3: Surface plots of the Tempesta $(\alpha,\kappa)$-logarithm for various values of hyperparameters $\alpha$ and $\kappa$. These figures illustrate the $(\alpha,\kappa)$-logarithm in terms of $\alpha$ and $x$ for fixed $\kappa=0.7$ and $\kappa=-0.9$. The black continuous line represents the reference of the standard logarithm, which is obtained for $\alpha=1$ and $\kappa=0$.