MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts

Jianan Zhou, Zhiguang Cao, Yaoxin Wu, Wen Song, Yining Ma, Jie Zhang, Chi Xu

TL;DR

This work tackles the challenge of solving multiple VRP variants with a single neural model. It introduces MVMoE, a mixture-of-experts architecture that places MoE layers in both the encoder and decoder, coupled with a hierarchical gating mechanism to balance accuracy and computational cost. The model is trained on randomly sampled VRP variants, enabling strong zero-shot generalization to unseen configurations and robust few-shot performance. Empirical results show substantial improvements over baselines on unseen VRPs and real-world benchmarks, with the hierarchical gating offering better out-of-distribution generalization and efficiency. This approach advances the practical applicability of neural VRP solvers by providing a foundation model-like, multi-task capability for combinatorial optimization problems.

Abstract

Learning to solve vehicle routing problems (VRPs) has garnered much attention. However, most neural solvers are only structured and trained independently on a specific problem, making them less generic and practical. In this paper, we aim to develop a unified neural solver that can cope with a range of VRP variants simultaneously. Specifically, we propose a multi-task vehicle routing solver with mixture-of-experts (MVMoE), which greatly enhances the model capacity without a proportional increase in computation. We further develop a hierarchical gating mechanism for the MVMoE, delivering a good trade-off between empirical performance and computational complexity. Experimentally, our method significantly promotes zero-shot generalization performance on 10 unseen VRP variants, and showcases decent results on the few-shot setting and real-world benchmark instances. We further conduct extensive studies on the effect of MoE configurations in solving VRPs, and observe the superiority of hierarchical gating when facing out-of-distribution data. The source code is available at: https://github.com/RoyalSkye/Routing-MVMoE.

Paper Structure (24 sections, 19 equations, 9 figures, 8 tables)

Figures (9)

  • Figure 1: Illustrations of sub-tours with various constraints: open route (O), backhaul (B), duration limit (L), and time window (TW).
  • Figure 2: The model structure of MVMoE. [Green part]: Given an input instance, the encoder and decoder output node embeddings and probabilities of nodes to be selected, respectively. The gray nodes are masked to satisfy problem-specific constraints for feasibility. Nodes with a deeper color denote later node embeddings. [Yellow part]: In an MoE layer, where we take the (node-level) input-choice Top2 gating as an example, the input $x$ (i.e., node) is routed to the two experts that receive the two largest probabilities from the gating network $G$.
  • Figure 3: An illustration of the score matrix and gating algorithm. Left panel: Input-choice gating. Right panel: Expert-choice gating. The selected experts or nodes are in color. The arrow marks the dimension, along which the Top$K$ experts or nodes are selected.
  • Figure 4: A base gating (i.e., the input-choice gating with $K=2$) and its hierarchical gating counterpart. In the latter, the gating network $G_1$ routes inputs to the sparse layer ($\{G_2, E_1, E_2, E_3, E_4\}$) or the dense layer $D$. If the sparse layer is chosen, the gating network $G_2$ routes nodes to experts according to the base gating.
  • Figure 5: Few-shot generalization on unseen VRPs.
  • ...and 4 more figures
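
The difference between the two gating algorithms in the Figure 3 caption can be illustrated with a minimal sketch. It assumes the score matrix is already given as a nodes-by-experts list of lists; the function names (`top_k_indices`, `input_choice_gating`, `expert_choice_gating`), the tie-breaking rule, and the toy scores are hypothetical and are not taken from the paper or its repository. Input-choice gating selects the Top$K$ experts along each row (one row per node), whereas expert-choice gating selects the Top$K$ nodes along each column (one column per expert).

```python
def top_k_indices(values, k):
    """Return the indices of the k largest values (ties go to the lower index)."""
    order = sorted(range(len(values)), key=lambda i: (-values[i], i))
    return order[:k]


def input_choice_gating(scores, k):
    """Each node (row) selects its top-k experts (columns)."""
    return {node: top_k_indices(row, k) for node, row in enumerate(scores)}


def expert_choice_gating(scores, k):
    """Each expert (column) selects its top-k nodes (rows)."""
    num_experts = len(scores[0])
    return {
        expert: top_k_indices([row[expert] for row in scores], k)
        for expert in range(num_experts)
    }


# Toy score matrix (illustrative values only): 3 nodes x 4 experts.
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.1, 0.3, 0.1],
    [0.2, 0.2, 0.1, 0.5],
]

node_to_experts = input_choice_gating(scores, k=2)
expert_to_nodes = expert_choice_gating(scores, k=2)
print(node_to_experts)   # experts chosen by each node
print(expert_to_nodes)   # nodes chosen by each expert
```

In this sketch, input-choice gating guarantees that every node is processed by exactly $K$ experts but leaves the per-expert load uneven, while expert-choice gating fixes the load of each expert at $K$ nodes but may leave some nodes selected by more or fewer experts than others.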