A Mirror Descent Perspective of Smoothed Sign Descent

Shuyang Wang; Diego Klabjan

TL;DR

This work uses the mirror descent framework to study the dynamics of smoothed sign descent with a stability constant $\varepsilon$ for regression problems, and proposes a mirror map that establishes equivalence to dual dynamics under some assumptions.

Abstract

Recent work by Woodworth et al. (2020) shows that the optimization dynamics of gradient descent for overparameterized problems can be viewed as low-dimensional dual dynamics induced by a mirror map, explaining the implicit regularization phenomenon from the mirror descent perspective. However, the methodology does not apply to algorithms where update directions deviate from true gradients, such as ADAM. We use the mirror descent framework to study the dynamics of smoothed sign descent with a stability constant $\varepsilon$ for regression problems. We propose a mirror map that establishes equivalence to dual dynamics under some assumptions. By studying dual dynamics, we characterize the convergent solution as an approximate KKT point of minimizing a Bregman divergence style function, and show the benefit of tuning the stability constant $\varepsilon$ to reduce the KKT error.
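To make the role of the stability constant concrete, the following is a minimal sketch of a smoothed sign descent iteration on a least-squares regression problem. It assumes a coordinate-wise update of the form $w_i \leftarrow w_i - \eta\, g_i / (|g_i| + \varepsilon)$, which is one common way to smooth the sign function; the exact smoothing analyzed in the paper may differ. The function names, step size, and toy data are hypothetical and chosen only for illustration.

```python
def smoothed_sign_step(w, grad, lr, eps):
    """One coordinate-wise update: w_i <- w_i - lr * g_i / (|g_i| + eps).

    Assumed smoothing form (not taken from the paper): eps -> 0 gives
    plain sign descent, while a large eps behaves like gradient descent
    with step size lr / eps.
    """
    return [wi - lr * gi / (abs(gi) + eps) for wi, gi in zip(w, grad)]


def squared_loss_grad(w, X, y):
    """Gradient of 0.5 * sum_n (x_n . w - y_n)^2 with respect to w."""
    grad = [0.0] * len(w)
    for xn, yn in zip(X, y):
        residual = sum(xi * wi for xi, wi in zip(xn, w)) - yn
        for i, xi in enumerate(xn):
            grad[i] += residual * xi
    return grad


def run(X, y, w0, lr=1e-3, eps=1e-2, steps=5000):
    """Iterate the smoothed sign update from w0 for a fixed number of steps."""
    w = list(w0)
    for _ in range(steps):
        w = smoothed_sign_step(w, squared_loss_grad(w, X, y), lr, eps)
    return w


# Hypothetical overparameterized toy problem: one sample, two weights.
X = [[1.0, 2.0]]
y = [1.0]
w_final = run(X, y, w0=[0.0, 0.0])
```

In this toy problem there are infinitely many interpolating solutions, so which one the iteration reaches depends on the update rule and on $\varepsilon$; that dependence is the kind of implicit regularization the abstract refers to.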
