Combinatorial Optimization and Machine Learning for Dynamic Inventory Routing

Toni Greif, Louis Bouvier, Christoph M. Flath, Axel Parmentier, Sonja U. K. Rohmer, Thibaut Vidal

TL;DR

A combinatorial optimization-enriched machine learning pipeline and a novel learning paradigm to solve inventory routing problems with stochastic demand and dynamic inventory updates. After each inventory update, replenishment and routing decisions are reduced to a capacitated prize-collecting traveling salesman problem, for which well-established algorithms exist.

Abstract

We introduce a combinatorial optimization-enriched machine learning pipeline and a novel learning paradigm to solve inventory routing problems with stochastic demand and dynamic inventory updates. After each inventory update, our approach reduces replenishment and routing decisions to an optimal solution of a capacitated prize-collecting traveling salesman problem for which well-established algorithms exist. Discovering good prize parametrizations is non-trivial; therefore, we have developed a machine learning approach. We evaluate the performance of our pipeline in settings with steady-state and more complex demand patterns. Compared to previous works, the policy generated by our algorithm leads to significant cost savings, achieves lower inference time, and can even leverage contextual information.

Paper Structure (27 sections, 1 theorem, 27 equations, 9 figures, 5 tables, 2 algorithms)


Key Result

Proposition 1

For every state $x^t$ there exists a prize vector $\theta \in \mathbb{R}^{|\mathcal{V}_c|}$ such that any optimal solution of the CPCTSP corresponds to an optimal decision in terms of the expected total (negative) reward of the MDP.
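
To illustrate how a prize vector $\theta$ can steer the decision, the following is a minimal brute-force sketch of a capacitated prize-collecting TSP on a toy instance. It is not the authors' solver or formulation: the function name `solve_cpctsp`, the fixed per-customer delivery quantities, the distance matrix, and the sign convention (minimize travel cost minus collected prizes) are all assumptions made for illustration.

```python
from itertools import combinations, permutations


def solve_cpctsp(dist, prizes, quantities, capacity):
    """Brute-force a tiny capacitated prize-collecting TSP (hypothetical helper).

    dist:       square distance matrix, index 0 is the depot
    prizes:     dict customer -> prize (plays the role of theta)
    quantities: dict customer -> delivery quantity (assumed fixed here)
    capacity:   vehicle capacity
    Returns (objective, route), where the objective is travel cost minus prizes.
    """
    customers = sorted(prizes)
    # Visiting nobody costs nothing and collects nothing.
    best_value, best_route = 0.0, ()
    for k in range(1, len(customers) + 1):
        for subset in combinations(customers, k):
            # Skip subsets whose total delivery exceeds the vehicle capacity.
            if sum(quantities[c] for c in subset) > capacity:
                continue
            for order in permutations(subset):
                stops = (0, *order, 0)
                travel = sum(dist[a][b] for a, b in zip(stops, stops[1:]))
                value = travel - sum(prizes[c] for c in subset)
                if value < best_value:
                    best_value, best_route = value, order
    return best_value, best_route


# Toy instance (assumed data): depot 0 and three customers.
dist = [
    [0, 2, 4, 3],
    [2, 0, 3, 4],
    [4, 3, 0, 2],
    [3, 4, 2, 0],
]
quantities = {1: 2, 2: 2, 3: 2}
capacity = 4

low_prizes = {1: 1, 2: 1, 3: 1}    # no visit pays for its detour
high_prizes = {1: 5, 2: 1, 3: 7}   # customers 1 and 3 become attractive

print(solve_cpctsp(dist, low_prizes, quantities, capacity))
print(solve_cpctsp(dist, high_prizes, quantities, capacity))
```

With low prizes the vehicle stays at the depot, while raising the prizes of customers 1 and 3 makes a route through both of them optimal. This is the lever the proposition refers to: the routing and replenishment decision changes only through the choice of $\theta$. Exhaustive enumeration is used purely for readability and does not scale beyond a handful of customers.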

Figures (9)

  • Figure 1: Process of making decisions, revealing demand, and updating inventories over time.
  • Figure 2: Our CO-enriched ML pipeline.
  • Figure 3: Layers of our statistical model.
  • Figure 4: Performance and inference time per instance [horizontal line highlights the mean performance].
  • Figure 5: End-of-horizon effect using the learning paradigm of Baty et al. (2023).
  • ...and 4 more figures

Theorems & Definitions (2)

  • Proposition 1
  • proof