Demand Balancing in Primal-Dual Optimization for Blind Network Revenue Management

Sentao Miao; Yining Wang


TL;DR

This work addresses blind network revenue management with unknown nonparametric demand by introducing PD-NRM, a primal-dual gradient-based algorithm that updates dual variables infrequently and uses demand balancing to control primal feasibility. The method uses a two-phase gradient-estimation procedure to learn demand statistics and a balanced price to offset inventory slack, all within an epoch-based framework that reduces computational overhead. The authors prove a nearly optimal regret bound of $\tilde{O}(N^{3.25}\sqrt{T})$ with no additional $o(\sqrt{T})$ terms and report improved performance over benchmark algorithms in numerical experiments. The approach offers scalable, first-order optimization for nonparametric demand learning under inventory constraints, with potential extensions to broader online resource-constrained optimization problems.

Abstract

This paper proposes a practically efficient algorithm with optimal theoretical regret that solves the classical network revenue management (NRM) problem with unknown, nonparametric demand. Over a time horizon of length $T$, in each time period the retailer decides the prices of $N$ types of products, which are produced from $M$ types of resources with unreplenishable initial inventory. When demand is nonparametric and satisfies some mild assumptions, Miao and Wang (2021) were the first to propose an algorithm with $O(\text{poly}(N,M,\ln(T))\sqrt{T})$-type regret (in particular, $\tilde O(N^{3.5}\sqrt{T})$ plus additional high-order terms that are $o(\sqrt{T})$ for sufficiently large $T\gg N$). In this paper, we improve on that result by proposing a primal-dual optimization algorithm that is not only more practical but also achieves an improved regret of $\tilde O(N^{3.25}\sqrt{T})$, free from additional high-order terms. A key technical contribution of the proposed algorithm is the so-called demand balancing, which pairs the primal solution (i.e., the price) in each time period with another price to offset the violation of complementary slackness on resource inventory constraints. Numerical experiments comparing against several benchmark algorithms further illustrate the effectiveness of our algorithm.
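The idea of pairing one price with another so that inventory is consumed at the intended rate can be illustrated with a toy calculation. The sketch below is not the paper's procedure: it assumes a single resource, two prices whose expected per-period consumption rates are already known, and a simple time-mixing rule. The function name `balancing_fraction` and all numbers are hypothetical.

```python
def balancing_fraction(rate_primal, rate_partner, capacity_rate):
    """Return the fraction of periods to spend at the primal price so that
    the time-averaged resource consumption equals capacity_rate.

    Toy assumption: consumption rates under both prices are known exactly
    and capacity_rate lies between them.
    """
    if rate_primal == rate_partner:
        raise ValueError("the two prices must induce different consumption rates")
    # Solve theta * rate_primal + (1 - theta) * rate_partner = capacity_rate.
    theta = (capacity_rate - rate_partner) / (rate_primal - rate_partner)
    if not 0.0 <= theta <= 1.0:
        raise ValueError("capacity rate is not bracketed by the two consumption rates")
    return theta


# Hypothetical numbers: the primal price over-consumes (1.2 units per period),
# the partner price under-consumes (0.6), and inventory allows 1.0 per period.
theta = balancing_fraction(rate_primal=1.2, rate_partner=0.6, capacity_rate=1.0)
average = theta * 1.2 + (1 - theta) * 0.6
print(f"fraction at primal price: {theta:.4f}, average consumption: {average:.4f}")
```

In this toy setting the mixture consumes inventory exactly at the capacity rate, which is the sense in which a second price can offset slack or overuse left by the first.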


Paper Structure (30 sections, 10 theorems, 97 equations, 3 figures, 3 tables, 3 algorithms)

Introduction
Infeasible primal-dual methods with frequent dual updates
Infeasible primal-dual methods with infrequent dual updates
Summary of our contributions
Other related works
Problem Formulation and Assumptions
Admissible policy and rewards
Assumptions
Fluid Approximation and the Dual Problem
Fluid approximation
Dual problem
Additional technical assumptions
Algorithm Design and Theoretical Results
Gradient estimation and demand balancing
Primal optimization
...and 15 more sections

Key Result

Proposition 1

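Let $d^*$ be the optimal solution to the fluid approximation problem, Eq. (eq:fluid-d). Then $T\phi(d^*)\geq \mathrm{Rwd}(\pi^*)$, where $\pi^*=\arg\max_{\pi\in\Pi}\mathrm{Rwd}(\pi)$ is the optimal admissible policy. In other words, the fluid objective scaled by the horizon upper-bounds the expected reward of every admissible policy.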
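To make the quantity $T\phi(d^*)$ concrete, here is a minimal sketch for a toy instance that is assumed for illustration only: one product, one resource consumed one unit per sale, and a known linear demand curve $d(p) = a - bp$, so that the per-period fluid revenue is $\phi(d) = d(a-d)/b$. The function name, the demand model, and the numbers are hypothetical; the paper's setting has $N$ products, $M$ resources, and unknown demand.

```python
def fluid_upper_bound(a, b, capacity, horizon):
    """Fluid optimum for a toy single-product, single-resource instance.

    Assumes linear demand d(p) = a - b * p, hence inverse demand
    p(d) = (a - d) / b and per-period revenue phi(d) = d * (a - d) / b.
    The inventory constraint in the fluid problem is d <= capacity / horizon.
    """
    unconstrained = a / 2.0                      # maximizer of the concave phi(d)
    d_star = min(unconstrained, capacity / horizon)
    phi = d_star * (a - d_star) / b
    return d_star, horizon * phi


# Hypothetical numbers: inventory of 300 units over 100 periods.
d_star, bound = fluid_upper_bound(a=10.0, b=2.0, capacity=300.0, horizon=100)
print(f"d* = {d_star}, T * phi(d*) = {bound}")
```

Here the inventory constraint binds, so the fluid solution sells at the capacity rate rather than at the unconstrained revenue maximizer. The returned bound is the benchmark against which regret is typically measured in this literature.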

Figures (3)

Figure 1: Structure of Algorithm PD-NRM.
Figure 2: Graphical illustration of the demand balancing procedure.
Figure 3: Percentage loss of all algorithms.

Theorems & Definitions (13)

Definition 1: admissible policy
Proposition 1: Gallego and van Ryzin (1994)
Lemma 1: strong duality
Lemma 2
Remark 1
Lemma 3
Lemma 4
Lemma 5
Lemma 6
Lemma 7
...and 3 more
