Online Optimization for Randomized Network Resource Allocation with Long-Term Constraints

Ahmed Sid-Ali; Ioannis Lambadaris; Yiqiang Q. Zhao; Gennady Shaikhet; Shima Kheradmand

TL;DR

This paper tackles online resource reservation in a small network where transfers between servers incur costs and unmet demand incurs violations, with a long-run budget constraint on transfer plus violation costs. It recasts the problem as online constrained optimization over distributions of reservations, and introduces a randomized online saddle-point algorithm that uses an approximate Lagrangian with a proximal term and dual updates based on past observations. The authors derive upper bounds for the $K$-benchmark regret and cumulative constraint violations, showing sublinear regret when $K=o(\sqrt{T})$, and validate the approach through numerical experiments against simple deterministic policies. The work provides a principled framework for online resource allocation under long-term constraints and demonstrates practical viability for networks where demand is unknown and adversarially varying.
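To make the saddle-point idea concrete, the following is a minimal sketch of one generic primal-dual update over a finite set of candidate reservations. It is not the authors' algorithm: the entropic (multiplicative-weights) proximal step, the function names `primal_dual_step` and `sample_reservation`, and the default step sizes `alpha` and `mu` are all assumptions chosen for illustration.

```python
import math
import random


def primal_dual_step(probs, lam, reserve_costs, constraint_vals, alpha=0.1, mu=0.05):
    """One generic primal-dual update (illustrative, not the paper's exact rule).

    probs[i]           current probability of picking reservation i
    lam                current Lagrange multiplier (non-negative)
    reserve_costs[i]   reservation cost of option i
    constraint_vals[i] (transfer + violation cost) - budget for option i,
                       evaluated on the most recently observed demand
    """
    # Primal step: entropic proximal update on the Lagrangian f + lam * g.
    # This choice of proximal term is an assumption made for the sketch.
    scores = [f + lam * g for f, g in zip(reserve_costs, constraint_vals)]
    shift = min(scores)  # subtract the minimum for numerical stability
    weights = [p * math.exp(-alpha * (s - shift)) for p, s in zip(probs, scores)]
    total = sum(weights)
    new_probs = [w / total for w in weights]

    # Dual step: projected ascent on the expected constraint value.
    expected_g = sum(p * g for p, g in zip(new_probs, constraint_vals))
    new_lam = max(0.0, lam + mu * expected_g)
    return new_probs, new_lam


def sample_reservation(probs, rng=random):
    """Draw the index of the reservation to play from the current distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

In this sketch a larger `alpha` moves the distribution faster toward low-Lagrangian reservations, while `mu` controls how quickly the multiplier reacts to budget overruns.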

Abstract

In this paper, we study an optimal online resource reservation problem in a simple communication network. The network is composed of two compute nodes linked by a local communication link. The system operates in discrete time; at each time slot, the administrator reserves resources for servers before the actual job requests are known. A cost is incurred for the reservations made. Then, after the client requests are observed, jobs may be transferred from one server to the other to best accommodate the demands by incurring an additional transport cost. If certain job requests cannot be satisfied, there is a violation that engenders a cost to pay for each of the blocked jobs. The goal is to minimize the overall reservation cost over finite horizons while maintaining the cumulative violation and transport costs under a certain budget limit. To study this problem, we first formalize it as a repeated game against nature where the reservations are drawn randomly according to a sequence of probability distributions that are derived from an online optimization problem over the space of allowable reservations. We then propose an online saddle-point algorithm for which we present an upper bound for the associated K-benchmark regret together with an upper bound for the cumulative constraint violations. Finally, we present numerical experiments where we compare the performance of our algorithm with those of simple deterministic resource allocation policies.
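The per-slot cost accounting described in the abstract can be sketched as follows for two servers. The linear prices, the function name `slot_costs`, and the rule that a job is always transferred whenever the other server has spare capacity are assumptions made for illustration; the paper's exact cost functions are not given in this summary.

```python
def slot_costs(reservation, demand, price=(1.0, 1.0),
               transfer_price=0.5, violation_price=2.0):
    """Return (reservation_cost, transfer_plus_violation_cost) for one time slot.

    reservation = (r1, r2): resources reserved before the demand is known.
    demand      = (d1, d2): job requests observed afterwards.
    All prices are hypothetical linear per-unit costs.
    """
    r1, r2 = reservation
    d1, d2 = demand
    reservation_cost = price[0] * r1 + price[1] * r2

    # Spare capacity and unmet demand at each server before any transfer.
    surplus1, surplus2 = max(r1 - d1, 0), max(r2 - d2, 0)
    deficit1, deficit2 = max(d1 - r1, 0), max(d2 - r2, 0)

    # Assumed rule: overflow jobs move to the other server while it has room.
    moved = min(deficit1, surplus2) + min(deficit2, surplus1)
    blocked = deficit1 + deficit2 - moved

    constrained_cost = transfer_price * moved + violation_price * blocked
    return reservation_cost, constrained_cost
```

For example, `slot_costs((3, 1), (1, 2))` moves one job from the second server to the first and blocks none, while `slot_costs((1, 1), (3, 2))` has no spare capacity anywhere and blocks three jobs. The long-term constraint then compares the running sum of the second component against the budget.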

Paper Structure (16 sections, 6 theorems, 73 equations, 5 figures, 4 algorithms)

Introduction
Online resource reservation in communication networks
Online randomized reservations
Online Lagrange multipliers approach
Performance analysis
Violation constraint bound
Regret against $K$ benchmark
Numerical experiments
Lazy bang-bang policy
Naive bang-bang policy
Lagrangian deterministic algorithm
Conclusion
Technical results
Proof of Theorem \ref{Theo-fit}
Regret against $K$ benchmark
...and 1 more sections

Key Result

Theorem 5.1

Let $\lambda_1=0$. Then, for any positive integer $\aleph\in\mathbb{N}$, the Lagrange multiplier $\lambda_t$ given by the update equation (lambd-update) is bounded as $\lambda_t\leq \bar{\lambda}:= \theta \aleph$ for all $t\in\{1,2,\ldots\}$, where $\theta=\max\{\varrho,\chi\}$ with $\varrho=\mu (2\Theta-v)$, $\alpha,\mu$ are the step sizes, and $\Theta,\eta$ are defined as previously…

Figures (5)

Figure 1: The network model for three nodes at time $t$
Figure 2: Time average constraint violations
Figure 3: Time average $1$-Benchmark regret
Figure 4: Time average $T$-Benchmark regret
Figure 5: Effect of $\alpha$ parameter on the distance between successive probability distributions

Theorems & Definitions (9)

Remark 1
Remark 2
Theorem 5.1
Theorem 5.2
Corollary 5.1
Remark 3
Lemma 1
Lemma 2
Lemma 3
