Projection-free Algorithms for Online Convex Optimization with Adversarial Constraints
Dhruv Sarkar, Aprameyo Chakrabartty, Subhamon Supantha, Palash Dey, Abhishek Sinha
TL;DR
This work extends Online Convex Optimization (OCO) to a constrained, adversarial setting in which constraint functions are revealed online, and it removes the expensive projection step by feeding a surrogate cost to an adaptive Online Conditional Gradient (OCG) method. The core strategy combines a Lyapunov-drift analysis with surrogate costs of the form $\hat{f}_t(x)= V f_t(x) + \Phi'(Q(t)) g_t^+(x)$, and relies on a single LP solve per round to move along a feasible direction. The authors establish $\tilde{O}(T^{3/4})$ bounds for both regret and cumulative constraint violation (CCV) in the full-information and bandit settings, improving over prior projection-free methods for constrained OCO. The framework is demonstrated on an online shortest path problem with time-varying constraints, showing favorable performance and practical efficiency, and it lays groundwork for adaptive, projection-free online learning in structured decision spaces.
Abstract
We study a generalization of the Online Convex Optimization (OCO) framework with time-varying adversarial constraints. In each round of this problem, after the learner selects a feasible action from the convex decision set $X$, a convex constraint function is revealed alongside the cost function. Our goal is to design a computationally efficient learning policy that achieves a small regret with respect to the cost functions and a small cumulative constraint violation (CCV) with respect to the constraint functions over a horizon of length $T$. It is well known that the projection step constitutes the major computational bottleneck of standard OCO algorithms. However, for many structured decision sets, linear functions can be efficiently optimized over the decision set. We propose a *projection-free* online policy which makes a single call to a Linear Program (LP) solver per round. Our method outperforms state-of-the-art projection-free online algorithms with adversarial constraints, achieving improved bounds of $\tilde{O}(T^{\frac{3}{4}})$ for both regret and CCV. The proposed algorithm is conceptually simple: it first constructs a surrogate cost function as a non-negative linear combination of the cost and constraint functions. Then, it passes the surrogate costs to a new, adaptive version of the online conditional gradient subroutine, which we propose in this paper.
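To make the two-stage structure concrete, the following is a minimal sketch, not the authors' algorithm. It builds the gradient of a surrogate cost $V f_t(x) + \Phi'(Q(t)) g_t^+(x)$ and takes one generic Frank-Wolfe-style step using a linear minimization oracle in place of a projection. Several pieces are assumptions made purely for illustration: the decision set is the probability simplex (so the LP solve reduces to picking the smallest gradient coordinate), costs and constraints are linear, $Q(t)$ is taken to be the running sum of $g_t^+(x_t)$, $\Phi(q) = q^2$, and the step size is fixed. The adaptive step-size rule of the paper's OCG subroutine is not reproduced here, and all function names are hypothetical.

```python
def lmo_simplex(grad):
    """Linear minimization oracle over the probability simplex.

    Returns the vertex e_i minimizing <grad, v>; this stands in for the
    single LP solve per round (assumed decision set, for illustration).
    """
    i = min(range(len(grad)), key=lambda k: grad[k])
    return [1.0 if k == i else 0.0 for k in range(len(grad))]


def surrogate_grad(grad_f, grad_g, g_val, V, phi_prime_q):
    """(Sub)gradient of V * f_t(x) + Phi'(Q) * max(g_t(x), 0)."""
    # max(g_t, 0) contributes a zero subgradient wherever g_t(x) <= 0.
    w = phi_prime_q if g_val > 0 else 0.0
    return [V * a + w * b for a, b in zip(grad_f, grad_g)]


def conditional_gradient_step(x, grad, step):
    """Move from x toward the LMO vertex.

    The result is a convex combination of feasible points, so it stays
    in the simplex without any projection.
    """
    v = lmo_simplex(grad)
    return [(1.0 - step) * xi + step * vi for xi, vi in zip(x, v)]


def run(costs, constraints, V=1.0, step=0.5):
    """Play len(costs) rounds and return the next iterate and the CCV.

    costs[t] = c_t encodes f_t(x) = <c_t, x>;
    constraints[t] = (a_t, b_t) encodes g_t(x) = <a_t, x> - b_t.
    The fixed step and Phi(q) = q^2 are illustrative assumptions.
    """
    n = len(costs[0])
    x = [1.0 / n] * n  # start at the centre of the simplex
    Q = 0.0            # running cumulative constraint violation
    for c, (a, b) in zip(costs, constraints):
        # g_t is revealed only after x has been played.
        g_val = sum(ai * xi for ai, xi in zip(a, x)) - b
        Q += max(g_val, 0.0)
        grad = surrogate_grad(c, a, g_val, V, phi_prime_q=2.0 * Q)
        x = conditional_gradient_step(x, grad, step)
    return x, Q
```

For example, `run([[1.0, 0.0], [1.0, 0.0]], [([0.0, 1.0], 0.25), ([0.0, 1.0], 0.25)])` plays two rounds in two dimensions. Because each update is a convex combination of feasible points, the iterate remains in the decision set, which is the property that lets a linear oracle replace the projection.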
