Online Optimization for Randomized Network Resource Allocation with Long-Term Constraints
Ahmed Sid-Ali, Ioannis Lambadaris, Yiqiang Q. Zhao, Gennady Shaikhet, Shima Kheradmand
TL;DR
This paper tackles online resource reservation in a two-server network where transferring jobs between servers incurs a transport cost and unmet demand incurs a violation cost, subject to a long-term budget constraint on the cumulative transport and violation costs. It recasts the problem as online constrained optimization over probability distributions on the set of allowable reservations, and introduces a randomized online saddle-point algorithm that uses an approximate Lagrangian with a proximal term and dual updates based on past observations. The authors derive upper bounds for the $K$-benchmark regret and for the cumulative constraint violations, showing sublinear regret when $K=o(\sqrt{T})$, and compare the algorithm numerically against simple deterministic policies. The work provides a principled framework for online resource allocation under long-term constraints in networks where demand is unknown and may vary adversarially.
Abstract
In this paper, we study an optimal online resource reservation problem in a simple communication network. The network is composed of two compute nodes connected by a local communication link. The system operates in discrete time; at each time slot, the administrator reserves resources at the servers before the actual job requests are known, and a cost is incurred for the reservations made. After the client requests are observed, jobs may be transferred from one server to the other to best accommodate the demand, at an additional transport cost. If some job requests cannot be satisfied, a violation cost is paid for each blocked job. The goal is to minimize the overall reservation cost over a finite horizon while keeping the cumulative violation and transport costs within a given budget. To study this problem, we first formalize it as a repeated game against nature in which the reservations are drawn randomly according to a sequence of probability distributions derived from an online optimization problem over the space of allowable reservations. We then propose an online saddle-point algorithm, for which we present an upper bound on the associated $K$-benchmark regret together with an upper bound on the cumulative constraint violations. Finally, we present numerical experiments in which we compare the performance of our algorithm with that of simple deterministic resource allocation policies.
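To make the per-slot cost structure described above concrete, the following is a minimal Python sketch of one time slot in the two-server setting. The function name `slot_costs`, the linear cost model, and the unit costs `c_reserve`, `c_transfer`, and `c_violation` are illustrative assumptions, not taken from the paper. The sketch also assumes that jobs are always moved to spare capacity when possible, which is only sensible if transferring a job costs no more than blocking it.

```python
def slot_costs(reserved, demand, c_reserve=1.0, c_transfer=0.5, c_violation=2.0):
    """Return (reservation, transport, violation) costs for one time slot.

    Illustrative model only: linear costs and greedy transfers are assumptions.
    reserved and demand are pairs (server 1, server 2) of non-negative numbers.
    """
    r1, r2 = reserved
    d1, d2 = demand

    # Reservation cost is paid up front, before the demand is observed.
    reservation = c_reserve * (r1 + r2)

    # Shortfall and spare capacity at each server once demand is revealed.
    short1, short2 = max(d1 - r1, 0), max(d2 - r2, 0)
    spare1, spare2 = max(r1 - d1, 0), max(r2 - d2, 0)

    # Jobs move from an overloaded server to spare capacity on the other one.
    moved = min(short1, spare2) + min(short2, spare1)

    # Jobs that still cannot be served are blocked.
    blocked = short1 + short2 - moved

    return reservation, c_transfer * moved, c_violation * blocked


if __name__ == "__main__":
    # Server 1 is short by 2 jobs; server 2 has 3 spare units, so both jobs move.
    print(slot_costs((3, 5), (5, 2)))
```

Under this hypothetical model, the long-term constraint would require the sum of the second and third components over all slots to stay within the budget, while the objective accumulates the first component.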
