Prompt Learning for Generalized Vehicle Routing
Fei Liu, Xi Lin, Weiduo Liao, Zhenkun Wang, Qingfu Zhang, Xialiang Tong, Mingxuan Yuan
TL;DR
The paper tackles cross-distribution vehicle routing, where NCO models trained on a single distribution generalize poorly. It introduces a prompt-learning framework that freezes a pre-trained attention model and learns a pool of key–prompt pairs, enabling fast zero-shot adaptation to diverse routing instances: for each instance, the best-matched key is selected automatically and its prompt is fed to the frozen model. Prompts are decomposed into per-layer subprompts and trained with REINFORCE. The method is reported to generalize better than existing generalized models both within the training distributions and on unseen distributions, while costing less to train than meta-learning approaches. In practice, this means a small set of learnable prompts can adapt one pre-trained VRP solver to instances from varied distributions. Key concepts include the prompt set $\{P_1,...,P_M\}$, where an instance's input features are matched to the nearest key $K_i$ by Euclidean distance; the decomposition $P = \{P^{(1)},...,P^{(L)}\}$ into $D$-token subprompts, one per layer; and REINFORCE optimization with a shared baseline, which updates the prompts from trajectory rewards $R(\tau)$.
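The key-matching step described above can be illustrated with a minimal sketch. It assumes each instance has already been summarized as a fixed-length feature vector and that every prompt is stored as an array of $L \cdot D$ tokens. The function names, array shapes, and toy values below are hypothetical choices for illustration and are not taken from the paper's implementation.

```python
import numpy as np

def select_prompt(feature, keys, prompts):
    """Pick the prompt whose key is nearest to `feature` (Euclidean distance).

    feature: (F,) instance feature vector (assumed to be given)
    keys:    (M, F) learned keys K_1..K_M
    prompts: (M, L*D, E) learned prompts P_1..P_M
    """
    dists = np.linalg.norm(keys - feature, axis=1)  # distance to each key
    idx = int(np.argmin(dists))                     # index of nearest key
    return idx, prompts[idx]

def split_per_layer(prompt, num_layers, tokens_per_layer):
    """Reshape a (L*D, E) prompt into L subprompts of D tokens each."""
    return prompt.reshape(num_layers, tokens_per_layer, -1)

# Toy sizes (hypothetical): M prompts, L layers, D tokens, E embed dim, F feature dim
M, L, D, E, F = 4, 3, 2, 8, 5
keys = np.eye(M, F)                                  # well-separated toy keys
prompts = np.arange(M * L * D * E, dtype=float).reshape(M, L * D, E)

feature = keys[2] + 0.01                             # instance lying close to key 2
idx, prompt = select_prompt(feature, keys, prompts)
subprompts = split_per_layer(prompt, L, D)           # shape (L, D, E)
```

In this sketch only `keys` and `prompts` would be trainable; how each subprompt is injected into the corresponding layer of the frozen attention model is left out, since the summary does not specify it.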
Abstract
Neural combinatorial optimization (NCO) is a promising learning-based approach to solving various vehicle routing problems without much manual algorithm design. However, current NCO methods mainly focus on in-distribution performance, while real-world problem instances usually come from different distributions. Costly fine-tuning or retraining a generalized model from scratch may be needed to handle out-of-distribution instances. Unlike existing methods, this work investigates an efficient prompt learning approach in NCO for cross-distribution adaptation. Concretely, we propose a novel prompt learning method to facilitate fast zero-shot adaptation of a pre-trained model to solve routing problem instances from different distributions. The proposed model learns a set of prompts across various distributions and then selects the best-matched one to prompt a pre-trained attention model for each problem instance. Extensive experiments show that the proposed prompt learning approach facilitates the fast adaptation of pre-trained routing models. It also outperforms existing generalized models on both in-distribution prediction and zero-shot generalization to a diverse set of new tasks. Our code implementation is available online at https://github.com/FeiLiu36/PromptVRP.
