Combinatorial Optimization and Machine Learning for Dynamic Inventory Routing

Toni Greif; Louis Bouvier; Christoph M. Flath; Axel Parmentier; Sonja U. K. Rohmer; Thibaut Vidal

TL;DR

A combinatorial optimization-enriched machine learning pipeline and a novel learning paradigm for inventory routing problems with stochastic demand and dynamic inventory updates. After each inventory update, replenishment and routing decisions reduce to solving a capacitated prize-collecting traveling salesman problem, for which well-established algorithms exist.

Abstract

We introduce a combinatorial optimization-enriched machine learning pipeline and a novel learning paradigm to solve inventory routing problems with stochastic demand and dynamic inventory updates. After each inventory update, our approach reduces replenishment and routing decisions to an optimal solution of a capacitated prize-collecting traveling salesman problem for which well-established algorithms exist. Discovering good prize parametrizations is non-trivial; therefore, we have developed a machine learning approach. We evaluate the performance of our pipeline in settings with steady-state and more complex demand patterns. Compared to previous works, the policy generated by our algorithm leads to significant cost savings, achieves lower inference time, and can even leverage contextual information.
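The two-layer idea in the abstract can be illustrated with a toy sketch: a statistical model maps customer features to prizes, and a combinatorial layer then solves a capacitated prize-collecting TSP with those prizes. Everything below is an assumption made for illustration, not the authors' implementation: the linear model in `predict_prizes`, the feature layout, the fixed delivery quantities, and the brute-force `solve_cpctsp`, which stands in for the well-established algorithms the abstract refers to and only scales to a handful of customers.

```python
from itertools import combinations, permutations


def predict_prizes(weights, features):
    """Hypothetical ML layer: a linear model mapping customer features to prizes."""
    return {c: sum(w * f for w, f in zip(weights, x)) for c, x in features.items()}


def route_cost(route, dist):
    """Cost of the closed tour depot (node 0) -> customers in order -> depot."""
    stops = [0, *route, 0]
    return sum(dist[a][b] for a, b in zip(stops, stops[1:]))


def solve_cpctsp(prizes, quantities, dist, capacity):
    """Brute-force CO layer: maximize collected prizes minus travel cost.

    Only subsets whose total delivery quantity fits the vehicle capacity
    are considered. Staying at the depot (value 0, empty route) is always
    feasible, so customers with unattractive prizes are simply skipped.
    """
    best_value, best_route = 0.0, ()
    customers = sorted(prizes)
    for k in range(1, len(customers) + 1):
        for subset in combinations(customers, k):
            if sum(quantities[c] for c in subset) > capacity:
                continue  # violates vehicle capacity
            collected = sum(prizes[c] for c in subset)
            for route in permutations(subset):
                value = collected - route_cost(route, dist)
                if value > best_value:
                    best_value, best_route = value, route
    return best_value, best_route


# Toy instance (all numbers are made up): node 0 is the depot.
dist = [
    [0, 1, 2, 3],
    [1, 0, 1, 4],
    [2, 1, 0, 5],
    [3, 4, 5, 0],
]
features = {1: (3, 1), 2: (0, 1), 3: (4, 1)}  # e.g. (shortfall, bias), hypothetical
quantities = {1: 2, 2: 2, 3: 3}               # assumed fixed delivery quantities
prizes = predict_prizes((2, -1), features)
value, route = solve_cpctsp(prizes, quantities, dist, capacity=5)
print(value, route)
```

In this sketch the prizes are the only learned quantity: changing the weights changes which customers are worth visiting, while feasibility (capacity, tour structure) is always enforced by the combinatorial layer.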

Paper Structure (27 sections, 1 theorem, 27 equations, 9 figures, 5 tables, 2 algorithms)

Introduction
Related Work
Inventory Routing
Decision-Making with Machine Learning
Problem Description
Markov Decision Process
Policy Encoded as Machine Learning Pipeline
Combinatorial Optimization Layer
Machine Learning Layer
Learning Algorithm
Setting
Imitated Policy
Learning Problem
Fenchel-Young Loss
Dataset Generation using DAgger Algorithm
...and 12 more sections

Key Result

Proposition 1

For every state $x^t$ there exists a prize vector $\theta \in \mathbb{R}^{|\mathcal{V}_c|}$ such that any optimal solution of the CPCTSP (eq:CPCTSP) corresponds to an optimal decision in terms of the expected total (negative) reward of the MDP (eq:MDP_policy).
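To make the role of the prize vector concrete, a generic capacitated prize-collecting TSP can be written as below. This is a sketch under assumed notation, not the paper's own formulation, which is not reproduced on this page: $y_i$ indicates whether customer $i$ is visited, $z_{ij}$ whether arc $(i,j)$ is traversed, $c_{ij}$ is a travel cost, $q_i$ a delivery quantity, and $Q$ the vehicle capacity. The sign convention (maximizing prizes minus travel cost) is also an assumption.

```latex
\begin{aligned}
\max_{y,\,z} \quad & \sum_{i \in \mathcal{V}_c} \theta_i \, y_i \;-\; \sum_{(i,j)} c_{ij} \, z_{ij} \\
\text{s.t.} \quad & \sum_{i \in \mathcal{V}_c} q_i \, y_i \le Q, \\
& z \text{ forms a single tour through the depot and exactly the customers with } y_i = 1, \\
& y_i \in \{0,1\}, \quad z_{ij} \in \{0,1\}.
\end{aligned}
```

Read this way, Proposition 1 says that the state-dependent part of the decision problem can be absorbed entirely into $\theta$: for a suitable prize vector, solving this deterministic problem yields a decision that is optimal for the MDP.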

Figures (9)

Figure 1: Process of making decisions, revealing demand, and updating inventories over time.
Figure 2: Our CO-enriched ML pipeline.
Figure 3: Layers of our statistical model.
Figure 4: Performance and inference time per instance [horizontal line highlights the mean performance].
Figure 5: End-of-horizon effect using the learning paradigm of Baty et al. (2023).
...and 4 more figures

Theorems & Definitions (2)

Proposition 1
proof
