The Hidden Cost of Approximation in Online Mirror Descent
Ofir Schlisselberg, Uri Sherman, Tomer Koren, Yishay Mansour
TL;DR
The paper analyzes how approximation errors in Online Mirror Descent affect regret, revealing a fundamental link between the regularizer's geometry (smooth versus barrier) and robustness to inexact updates. It derives a tight bound for smooth regularizers and a sharp dichotomy for barrier regularizers on the simplex: negative entropy requires exponentially small errors to avoid linear regret, while the log-barrier and Tsallis regularizers maintain optimal regret with polynomially small errors. In stochastic settings, negative entropy regains robustness on the full simplex but not on every subset. A balance framework is developed to connect loss balance, effective smoothness, and trajectory behavior, and the results are extended to FTRL-like updates, with practical implications for choosing regularizers and tolerances in online optimization under both adversarial and stochastic losses.
Abstract
Online mirror descent (OMD) is a fundamental algorithmic paradigm that underlies many algorithms in optimization, machine learning, and sequential decision-making. The OMD iterates are defined as solutions to optimization subproblems which, oftentimes, can be solved only approximately, leading to an inexact version of the algorithm. Nonetheless, existing OMD analyses typically assume an idealized error-free setting, thereby limiting our understanding of the performance guarantees that should be expected in practice. In this work we initiate a systematic study of inexact OMD, and uncover an intricate relation between regularizer smoothness and robustness to approximation errors. When the regularizer is uniformly smooth, we establish a tight bound on the excess regret due to errors. Then, for barrier regularizers over the simplex and its subsets, we identify a sharp separation: negative entropy requires exponentially small errors to avoid linear regret, whereas log-barrier and Tsallis regularizers remain robust even when the errors are only polynomially small. Finally, we show that when the losses are stochastic and the domain is the simplex, negative entropy regains robustness, but this property does not extend to all subsets, where exponentially small errors are again necessary to avoid suboptimal regret.
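To make the notion of an inexact OMD iterate concrete, the following is a minimal sketch, not the paper's construction. It uses the standard closed-form OMD step for negative entropy on the simplex (the multiplicative-weights update) and then perturbs it. The error model (mixing the exact iterate with a random simplex point, controlled by a hypothetical parameter `eps`), the uniformly random losses, and all function names are assumptions made for illustration only; the separations stated in the abstract concern worst-case errors and are not expected to be reproduced by this random perturbation.

```python
import numpy as np


def exact_step(x, loss, eta):
    """Exact OMD step with negative entropy on the simplex.

    For this regularizer the subproblem has a closed form:
    multiply each coordinate by exp(-eta * loss) and renormalize.
    """
    w = x * np.exp(-eta * loss)
    return w / w.sum()


def inexact_step(x, loss, eta, eps, rng):
    """Hypothetical inexact step (an illustrative error model, not the paper's).

    Returns a convex combination of the exact iterate and a random point
    of the simplex, so the result stays feasible and is within L1
    distance 2 * eps of the exact iterate.
    """
    target = exact_step(x, loss, eta)
    other = rng.dirichlet(np.ones(len(x)))
    return (1.0 - eps) * target + eps * other


def regret(T, d, eta, eps, seed=0):
    """Regret against the best fixed coordinate on i.i.d. uniform losses."""
    rng = np.random.default_rng(seed)
    x = np.full(d, 1.0 / d)      # start at the uniform distribution
    total = 0.0                  # cumulative loss of the learner
    cum = np.zeros(d)            # cumulative loss of each coordinate
    for _ in range(T):
        loss = rng.random(d)     # assumed loss model: uniform on [0, 1]^d
        total += x @ loss
        cum += loss
        x = inexact_step(x, loss, eta, eps, rng)
    return total - cum.min()


if __name__ == "__main__":
    # Compare the exact run (eps = 0) with increasingly inexact runs.
    for eps in (0.0, 1e-3, 1e-1):
        print(eps, regret(T=2000, d=5, eta=0.05, eps=eps))
```

Because `inexact_step` draws from the random generator even when `eps` is zero, runs with the same seed see the same loss sequence, so the printed regrets differ only through the injected error.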
