An End-to-End Learning Approach for Solving Capacitated Location-Routing Problems

Changhao Miao; Yuntian Zhang; Tongyu Wu; Fang Deng; Chen Chen

TL;DR

This work addresses capacitated location-routing problems (CLRPs), covering both the standard CLRP and the open CLRP (OCLRP), by introducing DRLHQ, an end-to-end deep reinforcement learning (DRL) framework. DRLHQ follows an encoder-decoder structure, reformulates CLRPs as a Markov decision process (MDP), and employs heterogeneous querying attention with a dynamic masking policy. The method integrates location and routing decisions within a single MDP, using POMO-based training and a GRU-enhanced location query to capture their interdependencies, while instance augmentation and simulation-based beam search improve inference. Experimental results on synthetic and Prins benchmark datasets show DRLHQ achieving superior solution quality and generalization compared with exact solvers, classical heuristics, and prior DRL approaches, with ablations confirming the value of dynamic masking and heterogeneous queries. The approach offers a practical, scalable, end-to-end solution for CLRPs, with potential impact in supply-chain management, emergency planning, and disaster relief, where joint facility and routing decisions are critical. Mathematical formulations, such as the CLRP objective $\min \; \sum_{i \in I} O_i y_i + \sum_{i \in V} \sum_{j \in V} \sum_{k \in K} c_{ij} x_{ijk} + \sum_{i \in I} \sum_{j \in J} \sum_{k \in K} F x_{ijk}$, and the MDP components are embedded within the learning framework to guide policy optimization and feasible solution construction.
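To make the objective above concrete, the following minimal Python sketch evaluates it for a given solution. It assumes the usual CLRP reading of the symbols, which this summary does not spell out: $O_i$ is the opening cost of depot $i$, $c_{ij}$ is the travel cost between nodes $i$ and $j$, and $F$ is a fixed cost charged once per vehicle leaving a depot. The data layout (`routes`, `open_cost`, `dist`), the function name `clrp_cost`, and the `open_routes` flag (which drops the return leg, the common convention for open routing problems) are all hypothetical choices for illustration, not the authors' implementation.

```python
import math


def clrp_cost(routes, open_cost, dist, vehicle_cost, open_routes=False):
    """Evaluate the CLRP objective for a given solution.

    routes:       {depot: [[customer, ...], ...]}, one inner list per vehicle
                  (assumed layout)
    open_cost:    {depot: O_i}
    dist:         {(i, j): c_ij}
    vehicle_cost: F, charged once per non-empty route
    open_routes:  if True, skip the return leg to the depot
                  (assumed OCLRP convention)
    """
    total = 0.0
    for depot, depot_routes in routes.items():
        used = [r for r in depot_routes if r]
        if not used:
            continue  # depot stays closed, so y_i = 0
        total += open_cost[depot]  # first term: O_i * y_i
        for route in used:
            total += vehicle_cost  # third term: F per vehicle leaving the depot
            path = [depot] + route
            if not open_routes:
                path.append(depot)
            # second term: travel cost along consecutive nodes of the route
            total += sum(dist[a, b] for a, b in zip(path, path[1:]))
    return total


# Toy instance: two candidate depots, two customers (all values made up).
coords = {"A": (0, 0), "B": (10, 10), 1: (3, 4), 2: (3, 0)}
dist = {(a, b): math.dist(coords[a], coords[b]) for a in coords for b in coords}
open_cost = {"A": 10, "B": 7}
solution = {"A": [[1, 2]], "B": []}  # depot B is never opened

closed_total = clrp_cost(solution, open_cost, dist, vehicle_cost=2)
open_total = clrp_cost(solution, open_cost, dist, vehicle_cost=2, open_routes=True)
```

In this toy instance the single route A, 1, 2, A has length 5 + 4 + 3, so the closed-route total is 10 + 2 + 12 and the open-route total omits the final leg of length 3.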

Abstract

Capacitated location-routing problems (CLRPs) are classical combinatorial optimization problems that require making location and routing decisions simultaneously. Their complex constraints and the intricate relationships between the various decisions make them challenging to solve. Deep reinforcement learning (DRL) has been extensively applied to the vehicle routing problem and its variants, but its application to CLRPs remains largely unexplored. In this paper, we propose DRL with heterogeneous query (DRLHQ) to solve both the CLRP and the open CLRP (OCLRP). We are the first to propose an end-to-end learning approach for CLRPs, following the encoder-decoder structure. In particular, we reformulate CLRPs as a Markov decision process tailored to the various decisions, yielding a general modeling framework that can be adapted to other DRL-based methods. To better handle the interdependency between location and routing decisions, we also introduce a novel heterogeneous querying attention mechanism designed to adapt dynamically to the different decision-making stages. Experimental results on both synthetic and benchmark datasets demonstrate superior solution quality and better generalization performance of our proposed approach over representative traditional and DRL-based baselines in solving both CLRP and OCLRP.
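As a rough illustration of what a Markov decision process with feasibility masking can look like for the routing stage, the sketch below tracks a small state and masks out customers that can no longer be served. The abstract does not specify the state variables or masking rules, so everything here is an assumption: the `State` fields, the rule that a customer is selectable only if it is unvisited and its demand fits both the remaining vehicle capacity and the remaining depot capacity, and the names `customer_mask` and `step`.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    depot: int            # index of the depot currently dispatching vehicles
    vehicle_left: float   # remaining capacity of the current vehicle
    depot_left: float     # remaining capacity of the current depot
    visited: frozenset = field(default_factory=frozenset)


def customer_mask(state, demand):
    """Return True for each customer that is still selectable (assumed rules)."""
    return [
        j not in state.visited
        and d <= state.vehicle_left
        and d <= state.depot_left
        for j, d in enumerate(demand)
    ]


def step(state, customer, demand):
    """Apply one routing action (serve `customer`) and return the next state."""
    d = demand[customer]
    return State(
        depot=state.depot,
        vehicle_left=state.vehicle_left - d,
        depot_left=state.depot_left - d,
        visited=state.visited | {customer},
    )


# Toy rollout with made-up demands and capacities.
demand = [4, 3, 6]
s0 = State(depot=0, vehicle_left=8, depot_left=10)
m0 = customer_mask(s0, demand)  # every customer fits at the start
s1 = step(s0, 0, demand)        # serve customer 0 (demand 4)
m1 = customer_mask(s1, demand)  # customer 0 is visited, customer 2 no longer fits
```

When every entry of the mask is False, a full environment would have to switch to a different kind of decision, such as returning to the depot or selecting another depot. That stage-dependent behavior is only hinted at here and is not modeled in the sketch.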
