Online Allocation with Replenishable Budgets: Worst Case and Beyond

Jianyi Yang, Pengfei Li, Mohammad Jaminur Islam, Shaolei Ren

TL;DR

This work addresses online resource allocation with replenishable budgets by introducing OACP, which conservatively prices resources via dual mirror descent and opportunistically uses replenishment, achieving an asymptotic competitive ratio matching the fixed-budget benchmark. Extending to frames with minimum replenishment, OACP+ improves the ratio under mild replenishment assumptions. To bridge worst-case guarantees with practical performance, the paper develops LA-OACP, a learning-augmented algorithm that combines ML predictions with competitive decisions while enforcing a reservation utility to preserve robustness; it proves worst-case guarantees and provides an average-utility bound that reflects ML accuracy. Simulation studies on sustainable AI inference with renewable energy validate the theoretical results and demonstrate practical gains of LA-OACP over baselines, highlighting its potential for energy-aware AI systems. The results advance online allocation by addressing budget replenishment and integrating learning without sacrificing worst-case performance.

Abstract

This paper studies online resource allocation with replenishable budgets, where budgets can be replenished on top of the initial budget and an agent sequentially chooses online allocation decisions without violating the available budget constraint at each round. We propose a novel online algorithm, called OACP (Opportunistic Allocation with Conservative Pricing), that conservatively adjusts dual variables while opportunistically utilizing available resources. OACP achieves a bounded asymptotic competitive ratio in adversarial settings as the number of decision rounds $T$ grows large. Importantly, the asymptotic competitive ratio of OACP is optimal in the absence of additional assumptions on budget replenishment. To further improve the competitive ratio, we make a mild assumption that there is budget replenishment every $T^* \geq 1$ decision rounds and propose OACP+ to dynamically adjust the total budget assignment for online allocation. Next, we move beyond the worst case and propose LA-OACP (Learning-Augmented OACP/OACP+), a novel learning-augmented algorithm for online allocation with replenishable budgets. We prove that LA-OACP can improve the average utility compared to OACP/OACP+ when the ML predictor is properly trained, while still offering worst-case utility guarantees when the ML predictions are arbitrarily wrong. Finally, we run simulation studies of sustainable AI inference powered by renewables, validating our analysis and demonstrating the empirical benefits of LA-OACP.
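To make the idea of conservative pricing with opportunistic allocation concrete, the following is a minimal Python sketch, not the authors' OACP. It assumes a single resource, a linear per-round utility, a Euclidean reference function (so the mirror-descent step reduces to a projected subgradient step), and a price target based only on the initial budget. The names `Request`, `allocate_online`, `x_max`, and `eta` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    value: float   # utility per unit allocated (assumed linear utility)
    cost: float    # resource consumed per unit allocated
    refill: float  # budget replenished after this round


def allocate_online(requests, initial_budget, x_max=1.0, eta=0.1):
    """Illustrative price-based allocation with a replenishable budget."""
    T = len(requests)
    # Conservative target: price against the initial budget only,
    # ignoring any future replenishment (an assumption of this sketch).
    rho = initial_budget / T
    mu = 0.0  # dual variable, i.e. the current resource price
    budget = initial_budget
    total_utility = 0.0
    decisions = []
    for r in requests:
        # Allocate fully if the utility beats the priced cost, else nothing.
        x = x_max if r.value > mu * r.cost else 0.0
        # Opportunistic but feasible: never spend more than what is available,
        # where "available" includes earlier replenishment.
        if r.cost > 0:
            x = min(x, budget / r.cost)
        spent = r.cost * x
        budget -= spent
        total_utility += r.value * x
        decisions.append(x)
        # Dual update: projected subgradient step, which is mirror descent
        # with the reference function h(mu) = mu**2 / 2.
        mu = max(0.0, mu - eta * (rho - spent))
        # Replenishment arrives after the decision of this round.
        budget += r.refill
    return decisions, total_utility, budget
```

With no replenishment, the sketch stops allocating once the initial budget is spent; with a refill in every round, it can keep allocating even though the price was computed against the initial budget alone.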

Paper Structure

This paper contains 26 sections, 6 theorems, 13 equations, 2 figures, 2 tables, 3 algorithms.

Key Result

Theorem 3.1

For any episode $y\in\mathcal{Y}$ and $\eta>0$, the utility of OACP (Algorithm \ref{alg:expert}) satisfies a bound in which $\alpha= \max_{m\in[M]} \frac{\bar{x}_m}{\rho_m}$, $\bar{\rho}=\max_{m\in[M]}\rho_m$ is the maximum per-round average budget initially assigned to the agent at round $t=1$, $\bar{x}$ is the maximum per-round resource allocation constraint, and $V_h(\mu,\mu_1)$ is the Bregman divergence between …
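For reference, the Bregman divergence is conventionally defined as below. That $V_h$ in the theorem takes this standard form, with $h$ a differentiable and strongly convex reference function, is an assumption here, because the excerpt ends before stating it.

```latex
V_h(x, y) = h(x) - h(y) - \nabla h(y)^{\top} (x - y)
% Example: h(x) = \tfrac{1}{2}\|x\|_2^2 gives V_h(x, y) = \tfrac{1}{2}\|x - y\|_2^2 .
```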

Figures (2)

  • Figure 1: (a) Average utility of LA-OACP with varying $\lambda\in[0,1]$; (b) Empirical competitive ratio of LA-OACP with varying $\lambda\in[0,1]$ (dotted lines represent the theoretical competitive ratio bounds); (c) Probability that the pure ML predictor violates the utility constraint \ref{eqn:constraint_1_learning}.
  • Figure 2: An example of budget assignment with $T=7T^*$. Colored rectangles indicate the remaining budget, and white rectangles indicate unused storage space. Dark blue rectangles indicate the permanent budget $2^{i-1}T^*\rho$ for the current frame. Light blue rectangles indicate the permanent budget $(T-(2^{i}-1)T^*)\rho$ reserved for future frames. Green rectangles indicate the budget accumulation $\min\{ B_{T_{i-1}+1}-(T-(2^{i-1}-1)T^*)\rho, 2^{i-2}T^*\rho_{\max}\odot\beta\}$.
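The frame layout described in the Figure 2 caption can be sketched in Python as follows. This is an illustration derived from the caption only, not the paper's OACP+ procedure: it assumes frame $i$ spans $2^{i-1}T^*$ rounds, treats $\rho$ as a scalar, and truncates the last frame when $T$ is not of the form $(2^K-1)T^*$. The function name `frame_schedule` and its dictionary keys are hypothetical, and the green accumulation term is left out because $\rho_{\max}$ and $\beta$ are not defined in this excerpt.

```python
def frame_schedule(T, T_star, rho):
    """List doubling frames with their permanent budgets (illustrative only)."""
    frames = []
    start = 1
    i = 1
    while start <= T:
        # Frame i nominally spans 2^(i-1) * T_star rounds; the final frame is
        # truncated at round T (an assumption of this sketch).
        length = min(2 ** (i - 1) * T_star, T - start + 1)
        end = start + length - 1
        frames.append({
            "frame": i,
            "rounds": (start, end),
            # Permanent budget for the current frame: 2^(i-1) * T_star * rho.
            "current_budget": length * rho,
            # Permanent budget kept for future frames:
            # (T - (2^i - 1) * T_star) * rho.
            "future_budget": (T - end) * rho,
        })
        start = end + 1
        i += 1
    return frames
```

For $T=7T^*$ this produces three frames of lengths $T^*$, $2T^*$, and $4T^*$, matching the example in the caption.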

Theorems & Definitions (9)

  • Definition 1: Asymptotic competitive ratio [balseiro2019learning, borodin2005online]
  • Definition 2: Average utility
  • Theorem 3.1
  • Definition 3: Minimum budget replenishment
  • Theorem 3.2
  • Theorem 4.1
  • Theorem 4.2
  • Lemma A.1 [OnlineAllocation_DualMirroDescent_Google_OperationalResearch_2022_doi:10.1287/opre.2021.2242, tutorial_online_learning_orabona2019modern]
  • Lemma B.1