A Re-solving Heuristic for Dynamic Assortment Optimization with Knapsack Constraints

Xi Chen; Mo Liu; Yining Wang; Yuan Zhou

TL;DR

This work tackles dynamic assortment optimization under knapsack constraints with a multinomial-logit demand model. It introduces an epoch-based re-solving algorithm that reformulates the non-linear fluid objective into a sequence of linear programs via a denominator-to-constraint transformation and uses Birkhoff-von-Neumann decomposition-based sampling to realize feasible assortments. The authors prove that the cumulative regret scales logarithmically with the time horizon $T$ and problem parameters under standard assumptions, and they validate the approach with numerical experiments showing superior performance over baselines. The method offers a scalable, theoretically grounded approach for revenue management problems with inventory constraints and non-linear choice effects, with potential extensions to learning unknown models and more complex choice structures.

Abstract

In this paper, we consider a multi-stage dynamic assortment optimization problem with multinomial logit (MNL) choice modeling under resource knapsack constraints. Given the current resource inventory levels, the retailer makes an assortment decision at each period, and the goal of the retailer is to maximize the total profit from purchases. Because the exact optimal dynamic assortment solution is computationally intractable, a practical strategy is to adopt the re-solving technique, which periodically re-optimizes deterministic linear programs (LPs) arising from the fluid approximation. However, the fractional structure of MNL makes the fluid approximation in assortment optimization non-linear, which brings new technical challenges. To address them, we propose a new epoch-based re-solving algorithm that effectively moves the denominator of the objective into the constraints, so that the re-solving technique is applied to a linear program with additional slack variables that is amenable to practical computation and theoretical analysis. Theoretically, we prove that the regret (i.e., the gap between the re-solving policy and the optimal objective of the fluid approximation) scales logarithmically with the length of the time horizon and the resource capacities.
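
To illustrate how a fractional MNL objective can be turned into a linear program, the following is a minimal sketch using the standard Charnes-Cooper substitution. It is not necessarily the authors' exact formulation, and it omits their slack variables. All symbols are assumptions introduced for illustration: $r_i$ and $v_i > 0$ are the revenue and preference weight of product $i$, the no-purchase weight is normalized to 1, $A_{ji}$ is the consumption of resource $j$ per sale of product $i$, $\gamma_j$ is the per-period budget of resource $j$, and $K$ is the capacity on the assortment size.

```latex
% Fractional (fluid) problem over x in [0,1]^N, under the assumed notation:
\begin{align*}
\max_{x \in [0,1]^N} \quad
  & \frac{\sum_{i=1}^N r_i v_i x_i}{1 + \sum_{k=1}^N v_k x_k} \\
\text{s.t.} \quad
  & \frac{\sum_{i=1}^N A_{ji} v_i x_i}{1 + \sum_{k=1}^N v_k x_k} \le \gamma_j
    \quad \forall j, \qquad \sum_{i=1}^N x_i \le K .
\end{align*}
% Substitute y_0 = 1 / (1 + \sum_k v_k x_k) and y_i = v_i x_i y_0.
% The denominator becomes the normalization constraint y_0 + \sum_i y_i = 1:
\begin{align*}
\max_{y_0,\, y} \quad
  & \sum_{i=1}^N r_i y_i \\
\text{s.t.} \quad
  & y_0 + \sum_{i=1}^N y_i = 1, \\
  & \sum_{i=1}^N A_{ji} y_i \le \gamma_j \quad \forall j, \\
  & \sum_{i=1}^N \frac{y_i}{v_i} \le K y_0, \\
  & 0 \le y_i \le v_i y_0 \quad \forall i .
\end{align*}
% Recover the fractional assortment via x_i = y_i / (v_i y_0).
```

Every constraint in the second program is linear in $(y_0, y)$. Moreover, $y_0 \geq 1/(1+\sum_i v_i) > 0$ follows from the normalization and the bounds $y_i \leq v_i y_0$, so the recovery step $x_i = y_i/(v_i y_0)$ is well defined.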

Paper Structure (23 sections, 12 theorems, 47 equations, 2 figures, 4 algorithms)


Introduction
Problem Formulation and Preliminaries
Fluid approximation and fractional relaxations
Assumptions
Re-solving Policy
Solving the fluid approximation problem
Sampling from the fluid solution
An epoch-based algorithm with re-solving
Regret Analysis
Recursion of $\gamma^\tau$ and $s^\tau$
Analysis of stopping time
Stability analysis of $\Psi$
Completing the proof
Numerical results
Conclusions and Future Directions
...and 8 more sections

Key Result

Lemma 1

For any admissible policy $\pi$ over $T$ time periods, it holds that $\mathbb E^\pi[\sum_{t=1}^T R(S_t)] \leq T R(x^F)$.
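
To make the per-period revenue $R(\cdot)$ in Lemma 1 concrete, here is a minimal Python sketch, assuming the standard MNL form with the no-purchase weight normalized to 1. The revenues `r`, preference weights `v`, capacity `K`, and all function names are hypothetical. The sketch ignores resource constraints, so it does not compute $x^F$ itself. It only evaluates $R$ on integral and fractional assortments and finds the best static assortment by brute force.

```python
from itertools import combinations


def mnl_revenue(x, r, v):
    """Expected one-period revenue under MNL for a (possibly fractional) x.

    Assumes the no-purchase option has weight 1 (hypothetical normalization).
    """
    denom = 1.0 + sum(vi * xi for vi, xi in zip(v, x))
    return sum(ri * vi * xi for ri, vi, xi in zip(r, v, x)) / denom


def indicator(S, n):
    """0/1 vector of length n encoding the assortment S."""
    return [1.0 if i in S else 0.0 for i in range(n)]


def best_assortment(r, v, K):
    """Brute-force the revenue-maximizing assortment with at most K products."""
    n = len(r)
    best_S, best_R = (), 0.0
    for k in range(1, K + 1):
        for S in combinations(range(n), k):
            R = mnl_revenue(indicator(S, n), r, v)
            if R > best_R:
                best_S, best_R = S, R
    return best_S, best_R


# Hypothetical instance: 4 products, at most K = 2 offered per period.
r = [10.0, 8.0, 6.0, 4.0]
v = [0.5, 1.0, 1.5, 2.0]
K = 2

best_S, best_R = best_assortment(r, v, K)
frac_R = mnl_revenue([0.5, 0.5, 0.0, 0.0], r, v)  # R at a fractional point
print(best_S, best_R, frac_R)
```

On this instance the best assortment is products 0 and 1, with expected revenue $13/2.5 = 5.2$ per period. Every integral assortment is also a point of the fractional relaxation, which is why the best integral assortment cannot earn more per period than the optimum of a relaxation over the same constraints.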

Figures (2)

Figure 1: Average revenue (top panel) and regret (bottom) performances for $N=10$ products and $K=3$ capacity constraint, with the number of resources $M$ ranging from 5 to 15 and the number of time periods $T$ ranging from $2^5=32$ to $2^{15}=32768$.
Figure 2: Average revenue (top panel) and regret (bottom) performances for $N=20$ products and $K=5$ capacity constraint, with the number of resources $M$ ranging from 10 to 30 and the number of time periods $T$ ranging from $2^5=32$ to $2^{15}=32768$.

Theorems & Definitions (27)

Definition 1: Fluid approximation
Lemma 1
Remark 1
Lemma 2
Lemma 3
Proposition 1
Remark 2
Theorem 1
Remark 3
Lemma 4
...and 17 more
