Infrequent Resolving Algorithm for Online Linear Programming

Guokai Li; Zizhuo Wang; Jingwei Zhang

TL;DR

This work tackles online linear programming with unknown finite-support arrivals by introducing the Argmax with Infrequent Resolving (AIR) policy, which re-solves the fluid LP at only a small number of scheduled time points and performs first-order updates in between. With a resolving schedule that uses $\mathcal{O}(\log\log T)$ LP solves, AIR achieves constant regret, $\mathcal{O}(1)$; when restricted to $M$ solves, it achieves $\mathcal{O}\left(T^{(1/2+\epsilon)^{M-1}}\right)$ regret. A variant for known arrival probabilities, AIR-KP, attains $\mathcal{O}(1)$ regret with $\mathcal{O}(\log\log T)$ solves and $\mathcal{O}\left(T^{(1/2+\epsilon)^M}\right)$ regret with $M$ solves. Empirical results corroborate the theoretical gains, showing AIR’s strong performance and substantial computational savings relative to fully LP-based or LP-free baselines, even under nonstationary or Markov-modulated arrivals.
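As a back-of-the-envelope consistency check (not the paper's proof, which relies on its own resolving schedule), the finite-resolving bound and the $\mathcal{O}(\log\log T)$ solve count can be related as follows. The sketch assumes $\epsilon$ is dropped and the constants hidden in $\mathcal{O}(\cdot)$ are ignored; the threshold $2$ is an arbitrary illustrative choice.

$$
\left(\tfrac{1}{2}\right)^{M-1} \le \frac{1}{\log_2 T}
\;\Longleftrightarrow\;
M \ge 1 + \log_2\log_2 T
\quad\Longrightarrow\quad
T^{(1/2)^{M-1}} \le T^{1/\log_2 T} = 2 .
$$

Under these assumptions, the exponent of the $M$-solve bound shrinks geometrically in $M$, so roughly $\log_2\log_2 T$ solves already bring $T^{(1/2)^{M-1}}$ down to a constant. With $\epsilon > 0$ the same reasoning goes through with $\log_2$ replaced by the logarithm to base $1/(1/2+\epsilon)$, which still gives $M = \mathcal{O}(\log\log T)$.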

Abstract

Online linear programming (OLP) has gained significant attention from both researchers and practitioners due to its extensive applications, such as online auctions, network revenue management, order fulfillment, and advertising. Existing OLP algorithms fall into two categories: LP-based algorithms and LP-free algorithms. The former typically guarantee better performance but require solving a large number of LPs, which can be computationally expensive. In contrast, LP-free algorithms require only first-order computations but perform worse. In this work, we bridge the gap between these two extremes by proposing a well-performing algorithm that solves LPs at a few selected time points and conducts first-order computations at all other time points. Specifically, for the case where the inputs are drawn from an unknown finite-support distribution, the proposed algorithm achieves a constant regret (even for the hard "degenerate" case) while solving LPs only $\mathcal{O}(\log\log T)$ times over the time horizon $T$. Moreover, when we are allowed to solve LPs only $M$ times, we design the corresponding schedule such that the proposed algorithm guarantees a nearly $\mathcal{O}\left(T^{(1/2)^{M-1}}\right)$ regret. Our work highlights the value of resolving both at the beginning and the end of the selling horizon, and provides a novel framework to prove the performance guarantee of the proposed policy under different infrequent resolving schedules. Numerical experiments are conducted to demonstrate the efficiency of the proposed algorithms.
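To give a feel for how quickly the stated regret rates shrink with the number of LP solves, the following minimal sketch tabulates the rate $T^{(1/2+\epsilon)^{M-1}}$ (unknown distribution) and $T^{(1/2+\epsilon)^{M}}$ (known probabilities, as in the TL;DR). The function names, the default $\epsilon = 0$, and the target value `2.0` are illustrative assumptions, not part of the paper; the constants hidden in $\mathcal{O}(\cdot)$ are ignored, so the numbers are growth rates rather than actual regret values.

```python
def regret_exponent(num_solves, eps=0.0, known_probabilities=False):
    """Exponent of T in the stated M-solve regret rate.

    Unknown distribution: (1/2 + eps)^(M - 1).
    Known probabilities:  (1/2 + eps)^M.
    Hypothetical helper for illustration only.
    """
    if num_solves < 1:
        raise ValueError("num_solves must be at least 1")
    if not 0.0 <= eps < 0.5:
        raise ValueError("eps must lie in [0, 0.5)")
    power = num_solves if known_probabilities else num_solves - 1
    return (0.5 + eps) ** power


def min_solves_for_constant(horizon, eps=0.0, known_probabilities=False, target=2.0):
    """Smallest M for which horizon ** exponent drops to `target` or below.

    `target` is an arbitrary constant (> 1) standing in for "O(1)".
    """
    if target <= 1.0:
        raise ValueError("target must exceed 1")
    num_solves = 1
    while horizon ** regret_exponent(num_solves, eps, known_probabilities) > target:
        num_solves += 1
    return num_solves


if __name__ == "__main__":
    T = 50_000  # horizon used in the paper's Figure 4 caption
    print("M  unknown-dist rate  known-prob rate")
    for m in range(1, 7):
        unknown = T ** regret_exponent(m)
        known = T ** regret_exponent(m, known_probabilities=True)
        print(f"{m}  {unknown:17.2f}  {known:15.2f}")
    print("solves needed (unknown):", min_solves_for_constant(T))
    print("solves needed (known):  ", min_solves_for_constant(T, known_probabilities=True))
```

Because the exponent halves with each additional solve when $\epsilon = 0$, the number of solves needed to reach a constant grows only doubly logarithmically in $T$, which is consistent with the $\mathcal{O}(\log\log T)$ solve count quoted above.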


Paper Structure (41 sections, 18 theorems, 48 equations, 6 figures, 8 tables, 1 algorithm)

Introduction
Literature Review
Main Results
Argmax with Infrequent Resolving (AIR) Policy
Resolving Schedule
Proof of Regret Bound
Finite Resolving
Known Arrival Probabilities
Finite Resolving
Discussion on Resolving Schedules
Discussion on Arrival Processes
Numerical Experiments
OLP Policy Comparison
Finite Resolving
Known-Probability Case
...and 26 more sections

Key Result

Theorem 1

Given the resolving schedule $\mathcal{T}$ with $\alpha\in (0, 1)$ and $\beta\in (\frac{1}{2}, 1)$, the regret of the AIR policy is $\mathcal{O}(1)$.

Figures (6)

Figure 1: Illustration of resolving time set $\mathcal{T}=\mathcal{T}_L\cup\mathcal{T}_A$.
Figure 2: Illustration of finite-resolving time set $\mathcal{T}^{F}(M)=\mathcal{T}^{F}_L(M)\cup\mathcal{T}^{F}_A(M)$.
Figure 3: Illustration of resolving schedule $\mathcal{T}^{\mathcal{K}}$ for known-probability case.
Figure 4: Regret under different policies as functions of $\rho$ when $m=1$, $n=2$, $r_1=2$, $r_2=1$, $p_1=p_2=0.5$, $T=50{,}000$ and $\alpha=\beta=0.7$.
Figure 5: Regret under different policies as functions of $T$ when $m=10$ and $n=2$.
...and 1 more figure

Theorems & Definitions (22)

Theorem 1: Regret Bound
Remark 1: Comparison with Literature
Lemma 1
Lemma 2: Upper Bound
Remark 2: Alternative Benchmarks
Proposition 1
Proposition 2: Demand Approximation Error
Proposition 3: Surrogate LP
Proposition 4
Remark 3: Proof Challenges under Infrequent Resolving
...and 12 more
