Parameter-free Dynamic Regret: Time-varying Movement Costs, Delayed Feedback, and Memory
Emmanuel Esposito, Andrew Jacobsen, Hao Qiu, Mengxiao Zhang
TL;DR
This work addresses dynamic regret in unconstrained online convex optimization when movement costs vary over time. It develops a parameter-free, comparator-adaptive algorithm based on Composite Mirror Descent with a time-varying correction whose scale is $\|g_t\| + \text{movement}$ (the gradient norm plus the movement-cost term), and augments it with a meta-learning layer that tunes learning rates without prior knowledge. The authors prove near-optimal dynamic regret bounds that adapt to comparator complexity, gradient magnitudes, and time-varying movement costs, and they give two reductions to the movement-cost framework: OCO with delayed feedback and OCO with time-varying memory. The bounds depend on the movement costs only to first order and hold parameter-free in unbounded domains, with per-round complexity $O(\log T)$. These advances extend dynamic regret analyses to more realistic settings where switching costs fluctuate over time and feedback can be delayed or depend on memory.
Abstract
In this paper, we study dynamic regret in unconstrained online convex optimization (OCO) with movement costs. Specifically, we generalize the standard setting by allowing the movement cost coefficients $\lambda_t$ to vary arbitrarily over time. Our main contribution is a novel algorithm that establishes the first comparator-adaptive dynamic regret bound for this setting, guaranteeing $\widetilde{\mathcal{O}}(\sqrt{(1+P_T)(T+\sum_t \lambda_t)})$ regret, where $P_T$ is the path length of the comparator sequence over $T$ rounds. This recovers the optimal guarantees for both static and dynamic regret in standard OCO as a special case where $\lambda_t=0$ for all rounds. To demonstrate the versatility of our results, we consider two applications: OCO with delayed feedback and OCO with time-varying memory. We show that both problems can be translated into time-varying movement costs, establishing a novel reduction specifically for the delayed feedback setting that is of independent interest. A crucial observation is that the first-order dependence on movement costs in our regret bound plays a key role in enabling optimal comparator-adaptive dynamic regret guarantees in both settings.
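As a quick sanity check on the stated bound, the special cases mentioned above can be written out directly. The comparator notation $u_1,\dots,u_T$ and the path-length formula $P_T=\sum_{t=2}^{T}\|u_t-u_{t-1}\|$ are assumptions here (the usual convention, not quoted from the abstract), and the constant coefficient $\lambda$ in the last line is a hypothetical illustration; logarithmic factors are hidden by $\widetilde{\mathcal{O}}$ throughout.

$$
\begin{aligned}
\lambda_t = 0 \ \text{for all } t:&\quad \widetilde{\mathcal{O}}\Big(\sqrt{(1+P_T)\big(T+\textstyle\sum_t \lambda_t\big)}\Big) = \widetilde{\mathcal{O}}\Big(\sqrt{(1+P_T)\,T}\Big) && \text{(dynamic regret, standard OCO)}\\
\lambda_t = 0,\ P_T = 0:&\quad \widetilde{\mathcal{O}}\big(\sqrt{T}\big) && \text{(static regret, standard OCO)}\\
\lambda_t = \lambda \ \text{for all } t:&\quad \widetilde{\mathcal{O}}\Big(\sqrt{(1+P_T)(1+\lambda)\,T}\Big) && \text{(constant movement cost)}
\end{aligned}
$$

The last line uses only $\sum_{t=1}^{T}\lambda = \lambda T$, so a constant movement cost inflates the bound by a factor of $\sqrt{1+\lambda}$ relative to standard OCO.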
