Parameter-free Dynamic Regret: Time-varying Movement Costs, Delayed Feedback, and Memory

Emmanuel Esposito; Andrew Jacobsen; Hao Qiu; Mengxiao Zhang

TL;DR

This work addresses dynamic regret in unconstrained online convex optimization when movement costs vary over time. It develops a parameter-free, comparator-adaptive algorithm based on Composite Mirror Descent with a time-varying correction that scales with $\eta_t = \|g_t\| + \text{movement}$, and augments it with a meta-learning layer to tune learning rates without prior knowledge. The authors prove near-optimal dynamic regret bounds that adapt to comparator complexity, gradient magnitudes, and time-varying movement costs, and they show two novel reductions: OCO with delayed feedback and OCO with time-varying memory, both reduced to the movement-cost framework. The results yield first-order movement-cost adaptivity and establish parameter-free guarantees in unbounded domains, with per-round complexity $O(\log T)$. These advances broaden the applicability of dynamic regret analyses to more realistic settings where switching costs fluctuate over time and feedback can be delayed or depend on memory.

Abstract

In this paper, we study dynamic regret in unconstrained online convex optimization (OCO) with movement costs. Specifically, we generalize the standard setting by allowing the movement cost coefficients $\lambda_t$ to vary arbitrarily over time. Our main contribution is a novel algorithm that establishes the first comparator-adaptive dynamic regret bound for this setting, guaranteeing $\widetilde{\mathcal{O}}(\sqrt{(1+P_T)(T+\sum_t \lambda_t)})$ regret, where $P_T$ is the path length of the comparator sequence over $T$ rounds. This recovers the optimal guarantees for both static and dynamic regret in standard OCO as a special case where $\lambda_t=0$ for all rounds. To demonstrate the versatility of our results, we consider two applications: OCO with delayed feedback and OCO with time-varying memory. We show that both problems can be translated into time-varying movement costs, establishing a novel reduction specifically for the delayed feedback setting that is of independent interest. A crucial observation is that the first-order dependence on movement costs in our regret bound plays a key role in enabling optimal comparator-adaptive dynamic regret guarantees in both settings.
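To make the scale of the stated bound concrete, the following minimal Python sketch computes the path length $P_T$ of a comparator sequence and the leading term $\sqrt{(1+P_T)(T+\sum_t \lambda_t)}$, ignoring the constants and logarithmic factors hidden by $\widetilde{\mathcal{O}}$. The function names `path_length` and `bound_scale` are hypothetical, and the Euclidean norm is an assumption, since the abstract does not fix a norm.

```python
import math


def path_length(comparators):
    """P_T = sum_{t=2}^T ||u_t - u_{t-1}|| (Euclidean norm is an assumption)."""
    return sum(math.dist(cur, prev)
               for prev, cur in zip(comparators[:-1], comparators[1:]))


def bound_scale(p_t, lambdas):
    """Leading term sqrt((1 + P_T) * (T + sum_t lambda_t)).

    Constants and log factors hidden by the O-tilde notation are ignored,
    so this is only an order-of-magnitude illustration.
    """
    horizon = len(lambdas)  # T: one movement-cost coefficient per round
    return math.sqrt((1.0 + p_t) * (horizon + sum(lambdas)))


# Static comparator (P_T = 0) and no movement costs: the sqrt(T) scale.
static_scale = bound_scale(0.0, [0.0] * 100)

# A comparator that moves once, by distance 5, over three rounds.
moving = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)]
p_moving = path_length(moving)
moving_scale = bound_scale(p_moving, [1.0, 1.0, 1.0])
```

With $\lambda_t = 0$ and a static comparator the expression reduces to $\sqrt{T}$, and with $\lambda_t = 0$ alone it reduces to $\sqrt{(1+P_T)T}$, matching the special case the abstract mentions for standard OCO.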

Paper Structure (22 sections, 19 theorems, 82 equations, 5 algorithms)

Introduction
Contributions
Related works
OCO with movement costs.
Parameter-free online learning.
Dynamic regret.
OCO with delayed feedback.
Problem Setting
Goal.
Other notations.
Parameter-free OCO with Movement Costs
Improved Adaptivity to Movement Costs
Applications
Unconstrained OCO with Delayed Feedback
Unconstrained OCO with Time-varying Memory
...and 7 more sections

Key Result

Theorem 3.1

Assume that $f_1,\dots,f_T$ are $G$-Lipschitz convex functions. For any comparator sequence $(u_1,\ldots,u_T)\in\mathcal{W}^T$, Algorithm alg:pf-mc with any $L \ge G+\lambda_{\max}$ and any $\epsilon>0$ guarantees […], where $\widetilde{P}_T(\epsilon) \triangleq \sum_{t=2}^{T}\|u_t-u_{t-1}\|\log\bigl(\frac{\|u_t-u_{t-1}\|T^2}{\epsilon}+1\bigr)$ and $\widetilde{M}_T(\epsilon) \triangleq M\bigl(\log
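The log-weighted path length $\widetilde{P}_T(\epsilon)$ defined in the theorem can be evaluated directly from a comparator sequence. The sketch below is only an illustration of that definition: the function name `weighted_path_length` is hypothetical, the Euclidean norm is an assumption, and the horizon $T$ is taken to be the length of the sequence.

```python
import math


def weighted_path_length(comparators, eps):
    """Evaluate sum_{t=2}^T ||u_t - u_{t-1}|| * log(||u_t - u_{t-1}|| T^2 / eps + 1).

    Assumes the Euclidean norm and T = len(comparators).
    """
    horizon = len(comparators)
    total = 0.0
    for prev, cur in zip(comparators[:-1], comparators[1:]):
        step = math.dist(cur, prev)  # ||u_t - u_{t-1}||
        total += step * math.log(step * horizon ** 2 / eps + 1.0)
    return total


# A static comparator contributes nothing: every step has length zero.
static_value = weighted_path_length([(0.0,), (0.0,), (0.0,)], eps=1.0)

# One unit-length move with T = 2 and eps = 1 gives 1 * log(4 + 1).
single_move_value = weighted_path_length([(0.0,), (1.0,)], eps=1.0)
```

Each step is weighted by a logarithmic factor that grows with the step size and with $T^2/\epsilon$, so $\widetilde{P}_T(\epsilon)$ is zero for a static comparator and otherwise differs from the plain path length only through these logarithmic weights.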

Theorems & Definitions (33)

Theorem 3.1
Theorem 4.1
Remark 4.2
Lemma 5.0
Theorem 5.1
Proof sketch
Corollary 5.2
Lemma 5.4
Theorem 5.5
Lemma A.0
...and 23 more