Prompt Learning for Generalized Vehicle Routing

Fei Liu; Xi Lin; Weiduo Liao; Zhenkun Wang; Qingfu Zhang; Xialiang Tong; Mingxuan Yuan

TL;DR

The paper tackles cross-distribution vehicle routing, where NCO models trained on a single distribution tend to degrade on out-of-distribution instances. It introduces a prompt-learning framework that freezes a pre-trained attention model and learns a pool of key-prompt pairs. For each instance, the key nearest to the instance's extracted feature is selected automatically, and its associated prompt adjusts the frozen model in a zero-shot manner. Prompts are decomposed into per-layer subprompts and trained with REINFORCE. The authors report better results than existing generalized models on both in-distribution instances and zero-shot generalization to unseen distributions, with lower training cost than meta-learning approaches. The practical appeal is that a small, learnable prompt set adapts a single frozen solver to varied distributions. Key concepts include the prompt set $\{P_1,\dots,P_M\}$, matched to input features via the nearest key $K_i$ under Euclidean distance; the decomposition $P = \{P^{(1)},\dots,P^{(L)}\}$ into $L$ subprompts of $D$ tokens each; and REINFORCE optimization with a shared baseline, which updates the prompts from trajectory rewards $R(\tau)$.
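
The three concepts listed above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the array sizes, the flat layout of each prompt, the token width `E`, and the choice of the mean reward as the shared baseline are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def select_prompt(feature, keys, prompts):
    """Return the index and prompt whose key is nearest to `feature` (Euclidean)."""
    distances = np.linalg.norm(keys - feature, axis=1)  # one distance per key, shape (M,)
    best = int(np.argmin(distances))
    return best, prompts[best]

def split_into_subprompts(prompt, num_layers, tokens_per_layer):
    """Reshape a flat prompt into L subprompts of D tokens each (assumed layout)."""
    return prompt.reshape(num_layers, tokens_per_layer, -1)

def shared_baseline_advantages(rewards):
    """Subtract one shared baseline from every trajectory reward R(tau).

    Assumption: the baseline is the mean reward over the sampled trajectories.
    """
    rewards = np.asarray(rewards, dtype=float)
    return rewards - rewards.mean()

# Toy sizes, all hypothetical: M prompts, L layers, D tokens, token width E, feature width F.
M, L, D, E, F = 4, 3, 2, 8, 16
rng = np.random.default_rng(0)
keys = rng.normal(size=(M, F))
prompts = rng.normal(size=(M, L * D * E))

# A feature that lies very close to the third key.
feature = keys[2] + 0.01 * rng.normal(size=F)

idx, prompt = select_prompt(feature, keys, prompts)
subprompts = split_into_subprompts(prompt, L, D)        # shape (L, D, E)
adv = shared_baseline_advantages([-10.0, -12.0, -11.0])  # rewards as negative tour lengths
```

In a REINFORCE update, each advantage would weight the log-probability gradient of its trajectory, and only the keys and prompts would receive gradients because the pre-trained model is frozen.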

Abstract

Neural combinatorial optimization (NCO) is a promising learning-based approach to solving various vehicle routing problems without much manual algorithm design. However, current NCO methods mainly focus on in-distribution performance, while real-world problem instances usually come from different distributions. Costly fine-tuning or retraining a generalized model from scratch may be needed to tackle out-of-distribution instances. Unlike existing methods, this work investigates an efficient prompt learning approach in NCO for cross-distribution adaptation. Concretely, we propose a novel prompt learning method to facilitate fast zero-shot adaptation of a pre-trained model to solve routing problem instances from different distributions. The proposed model learns a set of prompts among various distributions and then selects the best-matched one to prompt a pre-trained attention model for each problem instance. Extensive experiments show that the proposed prompt learning approach facilitates the fast adaptation of pre-trained routing models. It also outperforms existing generalized models on both in-distribution prediction and zero-shot generalization to a diverse set of new tasks. Our code implementation is available online at https://github.com/FeiLiu36/PromptVRP.

Paper Structure (44 sections, 17 equations, 6 figures, 9 tables)

Introduction
Related Works
Neural Combinatorial Optimization (NCO)
NCO for Cross-distribution Routing Problem
Prompt Learning
Prompt Learning for Routing
Problem Formulation
Main Idea and Basic Framework
Feature Extractor
Prompt Engineering
Prompted Model
Encoder
Prompted Encoder
Decoder
Training with Reinforcement Learning
...and 29 more sections

Figures (6)

Figure 1: Three approaches for cross-distribution neural combinatorial optimization. a) Single-distribution Learning: Single-distribution learning focuses on solving problem instances from the same distribution, and hence its performance usually significantly deteriorates for out-of-distribution cases. b) Meta Learning: Meta learning builds a single model to handle problem instances from diverse distributions. It requires a complicated and time-consuming training strategy, while the learning capacity might be limited by the static model structure. c) Prompt Learning (Ours): The proposed prompt learning incorporates a trainable key-prompt pool into a frozen NCO model to tackle different problem instances across diverse distributions. For inference, it can automatically select the most suitable prompt for a given instance, and adjust the prompt-based attention in a zero-shot manner to obtain better performance.
Figure 2: Results with different numbers of top-k prompts.
Figure 3: Selection frequencies of prompts on three different test sets. Blue: Set P, Orange: Set X, Grey: Set XML.
Figure 4: Model structure of our proposed prompt learning method, which consists of three main parts. 1) Feature Extractor: We use a pre-trained encoder to extract the feature of a given input instance, which is defined as the concatenation of the MHA outputs of different layers. 2) Prompt Engineering: The key that best matches the extracted feature of the input instance is selected, and its associated prompt is then used to adjust the pre-trained NCO model in a zero-shot manner. 3) Prompted Neural Solver: The prompt embedding is decomposed into $L$ subprompts, each consisting of $D$ tokens. Each subprompt is concatenated into its corresponding layer of the pre-trained encoder. In this way, the pre-trained NCO model is quickly adjusted to better tackle the input problem instance.
Figure 5: Illustration of six geometrical distributions used in testing and the uniform distribution.
...and 1 more figure
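
The Figure 4 caption describes concatenating one subprompt of $D$ tokens into each of the $L$ encoder layers. The sketch below shows one way this could look in NumPy. It is an illustration under stated assumptions, not the paper's architecture: `toy_attention_layer` is a hypothetical stand-in for a frozen pre-trained layer, the prompt tokens are assumed to be prepended along the token axis, and dropping the prompt rows after each layer is an assumption that the caption does not confirm.

```python
import numpy as np

def toy_attention_layer(tokens, weight):
    """Hypothetical stand-in for a frozen encoder layer: single-head self-attention."""
    projected = tokens @ weight
    scores = projected @ projected.T / np.sqrt(tokens.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return tokens + attn @ projected              # residual connection

def prompted_encode(nodes, subprompts, weights):
    """Run node embeddings through L layers, prepending one subprompt per layer."""
    hidden = nodes
    for subprompt, weight in zip(subprompts, weights):
        tokens = np.concatenate([subprompt, hidden], axis=0)  # shape (D + N, E)
        out = toy_attention_layer(tokens, weight)
        hidden = out[subprompt.shape[0]:]  # assumption: keep only the N node rows
    return hidden

# Toy sizes, all hypothetical: N nodes, L layers, D prompt tokens, embedding width E.
N, L, D, E = 5, 3, 2, 8
rng = np.random.default_rng(1)
nodes = rng.normal(size=(N, E))
subprompts = rng.normal(size=(L, D, E))
weights = 0.1 * rng.normal(size=(L, E, E))   # frozen layer weights, never updated

encoded = prompted_encode(nodes, subprompts, weights)  # shape (N, E)
```

Because the prompt tokens take part in attention, changing the subprompts changes the node embeddings even though the layer weights stay fixed, which is the mechanism the caption relies on for zero-shot adjustment.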
