Prompt Learning for Generalized Vehicle Routing

Fei Liu, Xi Lin, Weiduo Liao, Zhenkun Wang, Qingfu Zhang, Xialiang Tong, Mingxuan Yuan

TL;DR

The paper addresses cross-distribution vehicle routing, where neural combinatorial optimization (NCO) models trained on a single distribution degrade on out-of-distribution instances. It introduces a prompt-learning framework that freezes a pre-trained attention model and learns a pool of key-prompt pairs; at inference, the key closest to the instance's features is selected automatically, and its associated prompt adjusts the frozen model in a zero-shot manner. Prompts are decomposed into per-layer subprompts and trained with REINFORCE. The authors report that the method outperforms existing generalized models both on the training distributions and on unseen distributions, while reducing training cost relative to meta-learning approaches. The practical result is a VRP solver that adapts across distributions through a small set of learnable prompts rather than through retraining. Key concepts include the prompt set $\{P_1, \ldots, P_M\}$, matched to the input feature via the nearest key $K_i$ under Euclidean distance; the decomposition $P = \{P^{(1)}, \ldots, P^{(L)}\}$ into $L$ per-layer subprompts of $D$ tokens each; and REINFORCE optimization with a shared baseline, which updates the prompts based on trajectory rewards $R(\tau)$.
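
The two mechanisms named above, nearest-key prompt selection and a shared REINFORCE baseline, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `select_prompt` and `shared_baseline_advantages`, the toy key pool, and the choice of the mean reward as the shared baseline are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def select_prompt(feature: Sequence[float], keys: Sequence[Sequence[float]]) -> int:
    """Return the index i of the key K_i nearest to the feature (Euclidean distance)."""
    if not keys:
        raise ValueError("key pool is empty")
    distances = [math.dist(feature, key) for key in keys]
    return min(range(len(keys)), key=distances.__getitem__)


def shared_baseline_advantages(rewards: Sequence[float]) -> List[float]:
    """Subtract one baseline, shared by all trajectories, from each reward R(tau).

    Assumption: the shared baseline is the mean reward over the sampled
    trajectories of the same instance; the source only says "shared baseline".
    """
    baseline = sum(rewards) / len(rewards)
    return [r - baseline for r in rewards]


# Hypothetical toy pool with M = 3 keys in a 2-dimensional feature space.
keys = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
feature = [0.9, 1.2]
best = select_prompt(feature, keys)  # index of the prompt P_i to use

# Hypothetical rewards (negative tour lengths) of three sampled trajectories.
advantages = shared_baseline_advantages([-10.0, -12.0, -11.0])
```

In a REINFORCE update, each advantage would weight the log-probability gradient of its trajectory; in the setting described here, only the key-prompt pool would receive gradients, since the pre-trained model is frozen.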

Abstract

Neural combinatorial optimization (NCO) is a promising learning-based approach to solving various vehicle routing problems without much manual algorithm design. However, current NCO methods mainly focus on in-distribution performance, while real-world problem instances usually come from different distributions. Costly fine-tuning, or retraining a generalized model from scratch, may be needed to tackle out-of-distribution instances. Unlike the existing methods, this work investigates an efficient prompt learning approach in NCO for cross-distribution adaptation. Concretely, we propose a novel prompt learning method to facilitate fast zero-shot adaptation of a pre-trained model to solve routing problem instances from different distributions. The proposed model learns a set of prompts across various distributions and then selects the best-matched one to prompt a pre-trained attention model for each problem instance. Extensive experiments show that the proposed prompt learning approach facilitates the fast adaptation of pre-trained routing models. It also outperforms existing generalized models on both in-distribution prediction and zero-shot generalization to a diverse set of new tasks. Our code implementation is available online at https://github.com/FeiLiu36/PromptVRP.

Paper Structure (44 sections, 17 equations, 6 figures, 9 tables)


Figures (6)

  • Figure 1: Three approaches for cross-distribution neural combinatorial optimization. a) Single-distribution Learning: Single-distribution learning focuses on solving problem instances from the same distribution, and hence its performance usually significantly deteriorates for out-of-distribution cases. b) Meta Learning: Meta learning builds a single model to handle problem instances from diverse distributions. It requires a complicated and time-consuming training strategy, while the learning capacity might be limited by the static model structure. c) Prompt Learning (Ours): The proposed prompt learning incorporates a trainable key-prompt pool into a frozen NCO model to tackle different problem instances across diverse distributions. For inference, it can automatically select the most suitable prompt for a given instance, and adjust the prompt-based attention in a zero-shot manner to obtain better performance.
  • Figure 2: Results with different numbers of top-k prompts.
  • Figure 3: Selection frequencies of prompts on three different test sets. Blue: Set P, Orange: Set X, Grey: Set XML.
  • Figure 4: Model structure of our proposed prompt learning method, which consists of three main parts. 1) Feature Extractor: We use a pre-trained encoder to extract the feature of a given input instance, which is defined as the concatenation of the MHA outputs of different layers. 2) Prompt Engineering: The key that best matches the extracted feature of the input instance is selected, and its associated prompt is then used to adjust the pre-trained NCO model in a zero-shot manner. 3) Prompted Neural Solver: The prompt embedding is decomposed into $L$ subprompts, each consisting of $D$ tokens. Each subprompt is concatenated into the corresponding layer of the pre-trained encoder. In this way, the pre-trained NCO model is quickly adjusted to better tackle the input problem instance.
  • Figure 5: Illustration of six geometrical distributions used in testing and the uniform distribution.
  • ...and 1 more figure
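
The prompt decomposition described in the Figure 4 caption (one prompt split into $L$ subprompts of $D$ tokens, one subprompt per encoder layer) can be sketched as follows. This is a hedged illustration under stated assumptions, not the paper's code: the helper names `split_prompt` and `prepend_subprompt`, the flat-list layout of the prompt, the token dimension `dim`, and the choice to place the prompt tokens before the node embeddings are all hypothetical.

```python
from typing import List

Token = List[float]


def split_prompt(prompt: List[float], num_layers: int,
                 tokens_per_layer: int, dim: int) -> List[List[Token]]:
    """Split a flat prompt vector into L subprompts of D tokens, each of size dim."""
    expected = num_layers * tokens_per_layer * dim
    if len(prompt) != expected:
        raise ValueError(f"prompt has {len(prompt)} values, expected {expected}")
    subprompts: List[List[Token]] = []
    for layer in range(num_layers):
        tokens: List[Token] = []
        for t in range(tokens_per_layer):
            start = (layer * tokens_per_layer + t) * dim
            tokens.append(prompt[start:start + dim])
        subprompts.append(tokens)
    return subprompts


def prepend_subprompt(subprompt: List[Token], node_embeddings: List[Token]) -> List[Token]:
    """Concatenate a layer's D prompt tokens with the node embeddings.

    Assumption: prompt tokens are placed before the node tokens; the source
    only states that each subprompt is concatenated into its layer.
    """
    return subprompt + node_embeddings


# Hypothetical sizes: L = 2 layers, D = 2 tokens per layer, embedding dim = 3.
prompt = [float(i) for i in range(2 * 2 * 3)]
subprompts = split_prompt(prompt, num_layers=2, tokens_per_layer=2, dim=3)

# Four toy node embeddings entering the first encoder layer.
nodes = [[0.1, 0.2, 0.3] for _ in range(4)]
layer_input = prepend_subprompt(subprompts[0], nodes)
```

With these toy sizes, the first layer's attention would see six tokens (two prompt tokens plus four nodes), so the prompt can influence attention while the pre-trained weights stay fixed.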