Inverse Optimization for Routing Problems

Pedro Zattoni Scroccaro, Piet van Beek, Peyman Mohajerin Esfahani, Bilge Atasoy

TL;DR

This paper learns decision-makers' routing preferences from expert routes by formulating the task as an Inverse Optimization (IO) problem. It introduces an affine hypothesis class with nonnegative edge weights, a tailored loss function that yields a convex inner problem for binary routing decisions, and a specialized first-order algorithm with reshuffled updates for efficiency on large datasets. The methodology is demonstrated on CVRP, VRPTW, and TSP settings and is applied to the Amazon Last Mile Routing Research Challenge, where the tailored IO approach achieves a score of 0.0302, ranking 2nd among the 48 models that qualified for the final round. The work shows how combining descriptive patterns with data-driven learning yields practical, scalable models that replicate expert driving behavior, with potential for real-time learning and deployment in routing tools.

Abstract

We propose a method for learning decision-makers' behavior in routing problems using Inverse Optimization (IO). The IO framework falls into the supervised learning category and builds on the premise that the target behavior is an optimizer of an unknown cost function. This cost function is to be learned through historical data, and in the context of routing problems, can be interpreted as the routing preferences of the decision-makers. In this view, the main contributions of this study are to propose an IO methodology with a hypothesis function, loss function, and stochastic first-order algorithm tailored to routing problems. We further test our IO approach in the Amazon Last Mile Routing Research Challenge, where the goal is to learn models that replicate the routing preferences of human drivers, using thousands of real-world routing examples. Our final IO-learned routing model achieves a score that ranks 2nd compared with the 48 models that qualified for the final round of the challenge. Our examples and results showcase the flexibility and real-world potential of the proposed IO methodology to learn from decision-makers' decisions in routing problems.
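To make the abstract's learning loop concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes a toy symmetric TSP solved by brute force, a plain suboptimality-style loss, and a projected subgradient step, and every name in it (`tour_edges`, `solve_tsp`, `learn_weights`, `step`, `epochs`) is hypothetical.

```python
from itertools import permutations


def tour_edges(tour):
    """Undirected edges of a closed tour, as a frozenset of sorted node pairs."""
    n = len(tour)
    return frozenset(
        tuple(sorted((tour[i], tour[(i + 1) % n]))) for i in range(n)
    )


def solve_tsp(weights, n):
    """Forward problem (brute force): cheapest closed tour that starts at node 0."""
    best, best_cost = None, float("inf")
    for perm in permutations(range(1, n)):
        edges = tour_edges((0,) + perm)
        cost = sum(weights[e] for e in edges)
        if cost < best_cost:  # ties keep the first tour enumerated
            best, best_cost = edges, cost
    return best


def learn_weights(expert_tour, n, step=0.5, epochs=20):
    """Projected subgradient steps on a suboptimality-style loss (assumed here).

    Edges used only by the expert get cheaper, edges used only by the
    currently optimal tour get more expensive, and weights stay nonnegative.
    """
    weights = {(i, j): 1.0 for i in range(n) for j in range(i + 1, n)}
    expert = tour_edges(expert_tour)
    for _ in range(epochs):
        current = solve_tsp(weights, n)
        if current == expert:  # the learned weights reproduce the expert route
            break
        for e in expert - current:
            weights[e] = max(weights[e] - step, 0.0)  # projection onto >= 0
        for e in current - expert:
            weights[e] += step
    return weights


# Usage: a 4-node graph where the expert visits 0 -> 2 -> 1 -> 3 -> 0.
learned = learn_weights((0, 2, 1, 3), n=4)
recovered = solve_tsp(learned, 4)
```

In this toy instance the loop stops as soon as the cheapest tour under the learned weights uses the same edges as the expert tour. A realistic setting would replace the brute-force search with a proper routing solver and iterate over many expert examples.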

Paper Structure (20 sections, 1 theorem, 19 equations, 14 figures, 2 tables, 1 algorithm)

Key Result

Proposition 2.1

Assume $\mathbb{X} \subseteq \{0, 1\}^p$, that is, the decision variables of the FOP are binary. Then the loss function \ref{eq:loss_function} is equivalent to the ASL $\ell^{\text{ASL}}_\theta(\hat{s},\hat{x}) = \langle \theta,\phi(\hat{s},\hat{x}) \rangle - \min_{x \in \mathbb{X}(\hat{s})} \{ \langle \ldots$

Figures (14)

  • Figure 1: Optimal SCVRP tour and representation of graph weights.
  • Figure 2: Two iterations of Algorithm 1. The first column shows the learned weights, the second column shows optimal SCVRP routes for the weights in the first column, and the third column shows the subgradient from line 6 of Algorithm 1, where red (green) edges represent weights that should be increased (decreased).
  • Figure 3: Results for the VRPTW scenario.
  • Figure 4: Illustration of signal and expert response for an R-TSP.
  • Figure 5: High-level description of data fields provided in the Amazon Challenge data set \cite{merchan20222021}.
  • ...and 9 more figures
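
The color convention in the Figure 2 caption can be read off a single subgradient step. As a hedged sketch, assume the simplest suboptimality-style loss (not necessarily the exact loss used in line 6 of the paper's algorithm), with $\hat{x}$ the expert route, $x^\star$ a route that is optimal for the current weights $\theta$, and $\eta > 0$ a hypothetical step size:

$$
\ell_\theta(\hat{x}) = \langle \theta, \hat{x} \rangle - \min_{x \in \mathbb{X}} \langle \theta, x \rangle,
\qquad
g = \hat{x} - x^\star \in \partial_\theta \ell_\theta(\hat{x}),
\qquad
\theta^{+} = \max\{\theta - \eta g,\ 0\}.
$$

For binary $\hat{x}$ and $x^\star$, $g_e = -1$ on edges used only by $x^\star$, so their weights increase (red); $g_e = +1$ on edges used only by the expert, so their weights decrease (green); and $g_e = 0$ on all other edges, whose weights are unchanged. The maximum is taken componentwise and keeps the weights nonnegative.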

Theorems & Definitions (3)

  • Proposition 2.1: Connection between the ASL and \ref{eq:loss_function}
  • proof
  • Remark 2.2: Approximate A-FOP