
A Re-solving Heuristic for Dynamic Assortment Optimization with Knapsack Constraints

Xi Chen, Mo Liu, Yining Wang, Yuan Zhou

TL;DR

This work tackles dynamic assortment optimization under knapsack constraints with a multinomial-logit demand model. It introduces an epoch-based re-solving algorithm that reformulates the non-linear fluid objective into a sequence of linear programs via a denominator-to-constraint transformation and uses Birkhoff-von-Neumann decomposition-based sampling to realize feasible assortments. The authors prove that the cumulative regret scales logarithmically with the time horizon $T$ and problem parameters under standard assumptions, and they validate the approach with numerical experiments showing superior performance over baselines. The method offers a scalable, theoretically grounded approach for revenue management problems with inventory constraints and non-linear choice effects, with potential extensions to learning unknown models and more complex choice structures.

Abstract

In this paper, we consider a multi-stage dynamic assortment optimization problem with multinomial logit (MNL) choice modeling under resource knapsack constraints. Given the current resource inventory levels, the retailer makes an assortment decision at each period, and the goal of the retailer is to maximize the total profit from purchases. Because the exact optimal dynamic assortment solution is computationally intractable, a practical strategy is to adopt the re-solving technique, which periodically re-optimizes deterministic linear programs (LPs) arising from the fluid approximation. However, the fractional structure of MNL makes the fluid approximation in assortment optimization non-linear, which brings new technical challenges. To address these challenges, we propose a new epoch-based re-solving algorithm that effectively moves the denominator of the objective into the constraints, so that the re-solving technique is applied to a linear program with additional slack variables that is amenable to practical computation and theoretical analysis. Theoretically, we prove that the regret (i.e., the gap between the re-solving policy and the optimal objective of the fluid approximation) scales logarithmically with the length of the time horizon and the resource capacities.
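To make the "fractional structure of MNL" concrete, the following minimal Python sketch computes purchase probabilities and expected per-period revenue for a fixed assortment. It assumes the textbook MNL parameterization, in which each product has a positive attraction weight and the no-purchase option has weight 1. The names `v`, `r`, `mnl_choice_probabilities`, and `expected_revenue` are illustrative and are not taken from the paper.

```python
from typing import Dict, Iterable


def mnl_choice_probabilities(assortment: Iterable[int],
                             v: Dict[int, float]) -> Dict[int, float]:
    """Purchase probability of each offered product under a standard MNL model.

    Assumption (not from the paper): v[i] > 0 is the attraction weight of
    product i, and the no-purchase option is normalized to weight 1.
    """
    offered = set(assortment)
    # The offered set enters through this denominator, which is what makes
    # expected revenue a ratio (hence non-linear) in the assortment decision.
    denom = 1.0 + sum(v[i] for i in offered)
    return {i: v[i] / denom for i in offered}


def expected_revenue(assortment: Iterable[int],
                     v: Dict[int, float],
                     r: Dict[int, float]) -> float:
    """Expected one-period revenue of an assortment; r[i] is product i's revenue."""
    probs = mnl_choice_probabilities(assortment, v)
    return sum(r[i] * p for i, p in probs.items())


# Hypothetical three-product instance used only for illustration.
v_example = {0: 1.0, 1: 2.0, 2: 1.0}
r_example = {0: 4.0, 1: 2.0, 2: 8.0}
revenue_small = expected_revenue([0, 1], v_example, r_example)
revenue_full = expected_revenue([0, 1, 2], v_example, r_example)
```

Adding product 2 in this toy instance lowers the purchase probabilities of products 0 and 1, because every offered product shares the same denominator. This coupling is the non-linearity that the abstract says must be handled before a linear program can be re-solved.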

Paper Structure (23 sections, 12 theorems, 47 equations, 2 figures, 4 algorithms)

Key Result

Lemma 1

For any admissible policy $\pi$ over $T$ time periods, it holds that $\mathbb E^\pi[\sum_{t=1}^T R(S_t)] \leq T R(x^F)$.
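Lemma 1 says that the fluid value $T R(x^F)$ upper-bounds the expected revenue of every admissible policy, which is what allows it to serve as the benchmark in the regret definition given in the abstract. Combining the two, the regret of a policy $\pi$ can be written as follows, where $\mathrm{Regret}(\pi)$ is shorthand introduced here and not necessarily the paper's notation:

$$
\mathrm{Regret}(\pi) \;=\; T R(x^F) \;-\; \mathbb E^\pi\Big[\sum_{t=1}^T R(S_t)\Big] \;\geq\; 0 .
$$

The non-negativity is exactly the statement of Lemma 1. A logarithmic regret bound therefore means that the re-solving policy loses at most a logarithmic amount relative to a benchmark that no admissible policy can exceed.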

Figures (2)

  • Figure 1: Average revenue (top panel) and regret (bottom panel) for $N=10$ products and capacity constraint $K=3$, with the number of resources $M$ ranging from 5 to 15 and the number of time periods $T$ ranging from $2^5=32$ to $2^{15}=32768$.
  • Figure 2: Average revenue (top panel) and regret (bottom panel) for $N=20$ products and capacity constraint $K=5$, with the number of resources $M$ ranging from 10 to 30 and the number of time periods $T$ ranging from $2^5=32$ to $2^{15}=32768$.

Theorems & Definitions (27)

  • Definition 1: Fluid approximation
  • Lemma 1
  • Remark 1
  • Lemma 2
  • Lemma 3
  • Proposition 1
  • Remark 2
  • Theorem 1
  • Remark 3
  • Lemma 4
  • ...and 17 more