Multi-Task Learning for Routing Problem with Cross-Problem Zero-Shot Generalization

Fei Liu, Xi Lin, Zhenkun Wang, Qingfu Zhang, Xialiang Tong, Mingxuan Yuan

TL;DR

The paper tackles cross-problem generalization in neural combinatorial optimization for vehicle routing problems (VRPs) by framing VRP variants as combinations of shared attributes (e.g., Time Windows, Open Routes, Backhauls, Duration Limits) and solving them with a single unified attention-based model. Using an attribute composition block and multi-task reinforcement learning, the model generalizes zero-shot to attribute combinations not seen during training, which reduces the need to train a separate model per VRP variant. Experiments on eleven VRPs, CVRPLib benchmarks, and an industry logistics dataset show the average gap falling from over 20% to around 5%, with results competitive with or superior to specialized single-task methods. For industry use, the approach offers a single flexible solver that covers diverse routing problems without extensive retraining. Code is available at the repository linked in the abstract.

Abstract

Vehicle routing problems (VRPs), which can be found in numerous real-world applications, have been an important research topic for several decades. Recently, the neural combinatorial optimization (NCO) approach that leverages a learning-based model to solve VRPs without manual algorithm design has gained substantial attention. However, current NCO methods typically require building one model for each routing problem, which significantly hinders their practical application for real-world industry problems with diverse attributes. In this work, we make the first attempt to tackle the crucial challenge of cross-problem generalization. In particular, we formulate VRPs as different combinations of a set of shared underlying attributes and solve them simultaneously via a single model through attribute composition. In this way, our proposed model can successfully solve VRPs with unseen attribute combinations in a zero-shot generalization manner. Extensive experiments are conducted on eleven VRP variants, benchmark datasets, and industry logistic scenarios. The results show that the unified model demonstrates superior performance in the eleven VRPs, reducing the average gap to around 5% from over 20% in the existing approach and achieving a significant performance boost on benchmark datasets as well as a real-world logistics application. The source code is included in https://github.com/FeiLiu36/MTNCO.
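The idea of treating VRP variants as combinations of shared attributes can be illustrated with a small sketch. It is not taken from the paper's code: the names `ATTRIBUTES`, `variant_name`, `all_variants`, and `attribute_mask` are hypothetical, and the binary indicator vector is only one assumed way of telling a single model which attributes are active. The sketch enumerates every subset of the four optional attributes named in the summary (16 in total, of which the paper evaluates eleven) and builds the conventional variant name for each.

```python
from itertools import combinations

# Optional attributes named in the summary; capacity (C) is always present.
# O = open routes, B = backhauls, L = duration limits, TW = time windows.
ATTRIBUTES = ("O", "B", "L", "TW")


def variant_name(attrs):
    """Build a conventional name for an attribute set, e.g. {O, TW} -> 'OVRPTW'."""
    attrs = frozenset(attrs)
    if not attrs:
        return "CVRP"  # base problem with no extra attributes
    prefix = "O" if "O" in attrs else ""
    suffix = "".join(a for a in ("B", "L", "TW") if a in attrs)
    return f"{prefix}VRP{suffix}"


def all_variants():
    """Enumerate every subset of ATTRIBUTES (2**4 = 16 attribute combinations)."""
    return [
        frozenset(combo)
        for k in range(len(ATTRIBUTES) + 1)
        for combo in combinations(ATTRIBUTES, k)
    ]


def attribute_mask(attrs):
    """Assumed encoding: a 0/1 indicator per attribute, in ATTRIBUTES order."""
    return [1 if a in attrs else 0 for a in ATTRIBUTES]


if __name__ == "__main__":
    for attrs in all_variants():
        print(variant_name(attrs), attribute_mask(attrs))
```

Under this view, a model trained on a few attribute sets can be queried with any other mask over the same attributes, which is the sense in which unseen combinations are handled zero-shot.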

Paper Structure (29 sections, 14 equations, 11 figures, 10 tables)

Figures (11)

  • Figure 1: VRP variants as combinations of attribute blocks. The basic version is known as the Capacitated Vehicle Routing Problem (CVRP). VRP variants can be regarded as extensions of CVRP, encompassing additional attributes. For example, VRPTW extends CVRP by incorporating time windows, while OVRPTW adds an open routes attribute alongside time windows.
  • Figure 2: Unified model extended from the attention model. The model is trained on multiple VRPs with diverse attributes. It can then be used to solve numerous unseen VRPs formed by any combination of the attributes involved in training.
  • Figure 3: A comparison of gaps on eleven VRPs (Left: box plot, Right: radar plot). ST represents the unified model trained with single-task learning on CVRP, ST_all represents the unified model with single-task learning on OVRPBLTW, and MT represents our approach, i.e., the unified model with multi-task learning on five VRPs. ST_FT and MT_FT are the fine-tuning models.
  • Figure 4: A comparison of the distributions of different VRPs in a two-dimensional reduction space of the decoder hidden layer.
  • Figure 5: Hausdorff distance in the reduction space, comparing CVRP and OVRPBLTW (All) with other problems.
  • ...and 6 more figures
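
The Hausdorff distance named in the Figure 5 caption measures how far apart two point sets are. A minimal sketch follows, assuming the symmetric form over Euclidean distance on 2-D points; the caption does not say which form or metric the paper uses, and the function names are hypothetical.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    # Toy 2-D point sets standing in for two problems' reduced embeddings.
    set_a = [(0.0, 0.0), (1.0, 0.0)]
    set_b = [(0.0, 0.0), (4.0, 0.0)]
    print(hausdorff(set_a, set_b))
```

A small value means every point of each set lies close to some point of the other, so two problems with a small distance occupy similar regions of the reduced space.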