GAMA: A Neural Neighborhood Search Method with Graph-aware Multi-modal Attention for Vehicle Routing Problem
Xiangling Chen, Yi Mei, Mengjie Zhang
TL;DR
The paper tackles the Capacitated Vehicle Routing Problem (CVRP) within a Learning-to-Improve framework by introducing GAMA, a graph-aware multi-modal attention encoder that jointly represents the problem instance and the evolving solution. The method uses dual GCN streams to encode each modality, followed by stacked self- and cross-attention with a gated fusion mechanism to form a rich state for PPO-based operator selection. Empirical results on synthetic and benchmark CVRP instances show that GAMA outperforms strong neural baselines and generalizes well to large-scale, out-of-distribution instances without retraining. This work advances neural VRP solvers by giving the policy a deeper structural view of the instance and solution and adaptive control over the search, which yields higher-quality solutions and more robust performance in complex routing scenarios.
Abstract
Recent advances in neural neighborhood search methods have shown potential in tackling Vehicle Routing Problems (VRPs). However, most existing approaches rely on simplistic state representations and fuse heterogeneous information via naive concatenation, limiting their ability to capture rich structural and semantic context. To address these limitations, we propose GAMA, a neural neighborhood search method with a Graph-aware Multi-modal Attention model for VRPs. GAMA encodes the problem instance and its evolving solution as distinct modalities using graph neural networks, and models their intra- and inter-modal interactions through stacked self- and cross-attention layers. A gated fusion mechanism further integrates the multi-modal representations into a structured state, enabling the policy to make informed and generalizable operator selection decisions. Extensive experiments across various synthetic and benchmark instances demonstrate that GAMA significantly outperforms recent neural baselines. Further ablation studies confirm that both the multi-modal attention mechanism and the gated fusion design play a key role in achieving the observed performance gains.
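To make the contrast with naive concatenation concrete, the following is a minimal sketch of cross-attention between two modalities followed by a gated fusion step. It is not the authors' implementation: the abstract does not give the exact formulation, so the single-head attention without learned projections, the sigmoid gate over the concatenated embeddings, and all names (`cross_attention`, `gated_fusion`, `h_problem`, `h_solution`, `W`, `b`) are assumptions chosen for illustration, and random arrays stand in for the GNN outputs.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys_values):
    """Single-head scaled dot-product attention (assumed form, no learned projections).

    Each row of `queries` attends over the rows of `keys_values`.
    """
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_q, n_kv)
    return softmax(scores, axis=-1) @ keys_values   # (n_q, d)


def gated_fusion(h_a, h_b, W, b):
    """Assumed gate: g = sigmoid([h_a; h_b] W + b), output = g * h_a + (1 - g) * h_b."""
    z = np.concatenate([h_a, h_b], axis=-1)         # (n, 2d)
    gate = 1.0 / (1.0 + np.exp(-(z @ W + b)))       # (n, d), each entry in (0, 1)
    return gate * h_a + (1.0 - gate) * h_b


rng = np.random.default_rng(0)
n, d = 5, 8                                         # 5 nodes, 8-dim embeddings (hypothetical sizes)
h_problem = rng.normal(size=(n, d))                 # stand-in for the problem-instance embedding
h_solution = rng.normal(size=(n, d))                # stand-in for the current-solution embedding

# Inter-modal interaction in both directions.
p2s = cross_attention(h_problem, h_solution)
s2p = cross_attention(h_solution, h_problem)

# Placeholder gate parameters; in a trained model these would be learned.
W = rng.normal(scale=0.1, size=(2 * d, d))
b = np.zeros(d)
state = gated_fusion(p2s, s2p, W, b)
print(state.shape)
```

Unlike concatenation, the gate produces a per-dimension convex combination of the two modality embeddings, so the fused state keeps the embedding width and the relative weight of each modality can vary from node to node.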
