Tight Bounds for Online Convex Optimization with Adversarial Constraints

Abhishek Sinha; Rahul Vaze

TL;DR

This work tackles constrained online convex optimization (COCO) against an adaptive adversary, where the goal is to keep regret small while controlling the cumulative constraint violation (CCV). It combines a Lyapunov drift framework with AdaGrad run on surrogate costs that trade off objective loss against constraint penalties, achieving $O(\sqrt{T})$ regret and $\tilde{O}(\sqrt{T})$ CCV without restrictive assumptions. The key ideas are the surrogate cost $\hat{f}_t = V\tilde{f}_t + \Phi'(Q(t))\tilde{g}_t$, an exponential Lyapunov function $\Phi(x)=e^{\lambda x}-1$, and a horizon-free, parameter-free design that yields explicit bounds for both regret and CCV, as well as a small movement cost of $\tilde{O}(\sqrt{T})$. The CCV bound matches the $\Omega(\sqrt{T})$ lower bound for general adversaries up to logarithmic factors, and the updates are efficient, with potential applicability to related constrained online learning problems.
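As a rough illustration of how the surrogate cost and the exponential Lyapunov function could fit together, the following is a minimal one-dimensional sketch, not the authors' implementation. Several details are assumptions made only for this example: $Q(t)$ is taken to be the running sum of clipped violations $\max(g_t(x_t), 0)$, $\tilde{f}_t$ and $\tilde{g}_t$ are taken to be the raw cost and the clipped constraint with no rescaling, $V = 1$, $\lambda = 1/(2\sqrt{T})$, and the adaptive step size is the usual AdaGrad-norm rule. The function name `run_coco_sketch` and the toy instance are hypothetical.

```python
import math

def run_coco_sketch(rounds, lo=-1.0, hi=1.0, V=1.0, lam=None):
    """1-D sketch; rounds is a list of (f_grad, g, g_grad) callables, one per round."""
    T = len(rounds)
    if lam is None:
        lam = 1.0 / (2.0 * math.sqrt(T))   # assumed illustrative choice of lambda
    D = hi - lo                            # diameter of the feasible interval
    x = (lo + hi) / 2.0                    # arbitrary feasible starting point
    Q = 0.0                                # assumed: running sum of clipped violations
    sq_sum = 0.0                           # running sum of squared surrogate gradients
    xs, ccv = [], 0.0
    for f_grad, g, g_grad in rounds:
        xs.append(x)                       # the action is played before f_t, g_t are seen
        violation = max(g(x), 0.0)
        ccv += violation
        Q += violation
        # Gradient of the surrogate V*f_t + Phi'(Q)*max(g_t, 0),
        # where Phi'(q) = lam * exp(lam * q) for Phi(q) = exp(lam * q) - 1.
        g_plus_grad = g_grad(x) if violation > 0.0 else 0.0
        grad = V * f_grad(x) + lam * math.exp(lam * Q) * g_plus_grad
        sq_sum += grad * grad
        if sq_sum > 0.0:
            # AdaGrad-norm step size (assumed form), then projection onto [lo, hi].
            eta = math.sqrt(2.0) * D / (2.0 * math.sqrt(sq_sum))
            x = min(hi, max(lo, x - eta * grad))
    return xs, ccv

# Toy instance: cost (x - 1)^2 and constraint x - 0.5 <= 0, repeated for 200 rounds.
toy_rounds = [(lambda x: 2.0 * (x - 1.0), lambda x: x - 0.5, lambda x: 1.0)] * 200
actions, total_violation = run_coco_sketch(toy_rounds)
```

The point of the sketch is the coupling: as violations accumulate, $\Phi'(Q(t))$ grows exponentially, so the constraint gradient increasingly dominates the surrogate and pushes the iterate back toward the feasible region.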

Abstract

A well-studied generalization of the standard online convex optimization (OCO) problem is constrained online convex optimization (COCO). In COCO, on every round, a convex cost function and a convex constraint function are revealed to the learner after the action for that round is chosen. The objective is to design an online policy that simultaneously achieves a small regret while ensuring small cumulative constraint violation (CCV) against an adaptive adversary. A long-standing open question in COCO is whether an online policy can simultaneously achieve $O(\sqrt{T})$ regret and $O(\sqrt{T})$ CCV without any restrictive assumptions. For the first time, we answer this in the affirmative and show that an online policy can simultaneously achieve $O(\sqrt{T})$ regret and $\tilde{O}(\sqrt{T})$ CCV. We establish this result by effectively combining the adaptive regret bound of the AdaGrad algorithm with Lyapunov optimization, a classic tool from control theory. Surprisingly, the analysis is short and elegant.

Paper Structure (27 sections, 2 theorems, 31 equations, 1 table, 1 algorithm)

Introduction
Related Work
Problem formulation
Note:
Optimal Regret and Constraint Violation Bounds for COCO
Overview of the technique
Preliminaries
Design and Analysis of the Algorithm
Surrogate cost functions
The Regret Decomposition Inequality
Analysis
An Exponential Lyapunov function:
Bounding the Regret:
Bounding the CCV:
Remarks:
...and 12 more sections

Key Result

Theorem 4.1

(Orabona, 2019) The AdaGrad policy, with the above step size sequence, achieves the following regret bound for the standard OCO problem:
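The step size sequence and the bound itself are not shown in this summary. For orientation only, a standard form of this result in the online learning literature uses projected online gradient descent with step sizes $\eta_t = \frac{\sqrt{2}D}{2\sqrt{\sum_{\tau \le t} \|\nabla_\tau\|^2}}$ over a feasible set of diameter $D$, and guarantees regret at most $\sqrt{2}D\sqrt{\sum_{t=1}^{T} \|\nabla_t\|^2}$. Whether Theorem 4.1 uses exactly these constants is an assumption here. The one-dimensional sketch below implements that assumed step-size rule and compares the empirical regret with the assumed bound on a toy sequence of absolute-value losses; the function name `adagrad_ogd` and the toy data are hypothetical.

```python
import math

def adagrad_ogd(grad_fns, lo, hi):
    """Projected OGD on [lo, hi] with the assumed AdaGrad-norm step sizes."""
    D = hi - lo                      # diameter of the feasible interval
    x = (lo + hi) / 2.0              # arbitrary feasible starting point
    sq_sum = 0.0                     # running sum of squared (sub)gradients
    xs, gs = [], []
    for grad_fn in grad_fns:
        xs.append(x)                 # play x_t, then observe the gradient at x_t
        g = grad_fn(x)
        gs.append(g)
        sq_sum += g * g
        if sq_sum > 0.0:             # skip the update while all gradients are zero
            eta = math.sqrt(2.0) * D / (2.0 * math.sqrt(sq_sum))
            x = min(hi, max(lo, x - eta * g))
    return xs, gs

# Toy losses f_t(x) = |x - a_t| with alternating targets (hypothetical data).
targets = [0.8 if t % 2 == 0 else -0.3 for t in range(100)]
losses = [lambda x, a=a: abs(x - a) for a in targets]
grads = [lambda x, a=a: float((x > a) - (x < a)) for a in targets]

xs, gs = adagrad_ogd(grads, -1.0, 1.0)
learner_loss = sum(f(x) for f, x in zip(losses, xs))

# Best fixed action, approximated on a grid over the feasible interval.
grid = [-1.0 + 0.01 * k for k in range(201)]
best_fixed_loss = min(sum(f(u) for f in losses) for u in grid)

regret = learner_loss - best_fixed_loss
bound = math.sqrt(2.0) * 2.0 * math.sqrt(sum(g * g for g in gs))
```

Because the bound depends on the observed gradient norms rather than on a worst-case Lipschitz constant, it adapts to the sequence actually played, which is the property the COCO analysis relies on when the surrogate gradients are scaled by $\Phi'(Q(t))$.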

Theorems & Definitions (2)

Theorem 4.1
Theorem 4.2
