CaDA: Cross-Problem Routing Solver with Constraint-Aware Dual-Attention
Han Li, Fei Liu, Zhi Zheng, Yu Zhang, Zhenkun Wang
TL;DR
This work tackles cross-problem vehicle routing with diverse operational constraints by introducing CaDA, a constraint-aware dual-attention neural solver. CaDA augments encoder learning with a constraint prompt and two attention branches (global and Top-k sparse) to balance global context with selective neighbor focus, enabling robust generalization across 16 VRP variants. Empirical results show CaDA achieves state-of-the-art performance and strong scalability, with ablations confirming the importance of the constraint prompt and sparse attention. The approach promises practical impact for real-world routing systems that must adapt to varying constraints without training a separate model for each variant.
Abstract
Vehicle Routing Problems (VRPs) are Combinatorial Optimization (CO) problems of substantial practical importance. Recently, Neural Combinatorial Optimization (NCO), which trains deep learning models on extensive data to learn vehicle routing heuristics, has emerged as a promising approach due to its efficiency and the reduced need for manual algorithm design. However, applying NCO across diverse real-world scenarios with various constraints requires cross-problem capabilities. Current multi-task NCO methods for VRPs typically employ a unified, constraint-unaware model that lacks constraint-specific structure, which limits their cross-problem performance. Furthermore, they rely solely on global connectivity, which fails to focus on key nodes and leads to inefficient representation learning. This paper introduces a Constraint-Aware Dual-Attention Model (CaDA), designed to address these limitations. CaDA incorporates a constraint prompt that efficiently represents different problem variants. Additionally, it features a dual-attention mechanism with a global branch that captures graph-wide information and a sparse branch that selectively focuses on the most relevant nodes. We comprehensively evaluate our model on 16 different VRPs and compare its performance against existing cross-problem VRP solvers. CaDA achieves state-of-the-art results across all 16 VRPs. Our ablation study further confirms that each component of CaDA contributes positively to its cross-problem learning performance.
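To make the two ideas above concrete, the following is a minimal sketch, not the authors' implementation. It assumes the constraint prompt can be pictured as a multi-hot vector over a fixed list of constraints, and that the sparse branch keeps only the k largest attention scores per query before the softmax. The constraint names, their ordering, and every function name here are hypothetical, chosen for illustration. Learned projections, multiple heads, and the way the two branches are combined are all omitted.

```python
import math

# Hypothetical constraint names and ordering, chosen for illustration only.
CONSTRAINTS = ("capacity", "open_route", "backhaul", "duration_limit", "time_window")


def constraint_prompt(active):
    """Return a multi-hot vector marking which constraints a variant uses."""
    unknown = set(active) - set(CONSTRAINTS)
    if unknown:
        raise ValueError(f"unknown constraints: {sorted(unknown)}")
    return [1.0 if name in active else 0.0 for name in CONSTRAINTS]


def softmax(scores):
    """Numerically stable softmax; entries equal to -inf get weight 0."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_scores(queries, keys):
    """Scaled dot-product scores between every query and every key."""
    scale = math.sqrt(len(keys[0]))
    return [[sum(a * b for a, b in zip(q, k)) / scale for k in keys]
            for q in queries]


def mix(weights, values):
    """Weighted sum of value vectors for each row of attention weights."""
    dim = len(values[0])
    return [[sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
            for row in weights]


def global_attention(queries, keys, values):
    """Global branch: every node attends to every other node."""
    weights = [softmax(row) for row in scaled_scores(queries, keys)]
    return weights, mix(weights, values)


def topk_sparse_attention(queries, keys, values, k):
    """Sparse branch (assumed form): keep the k highest scores per query,
    mask the rest to -inf, then apply softmax over the survivors."""
    if not 1 <= k <= len(keys):
        raise ValueError("k must be between 1 and the number of keys")
    weights = []
    for row in scaled_scores(queries, keys):
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        keep = set(order[:k])
        masked = [s if j in keep else -math.inf for j, s in enumerate(row)]
        weights.append(softmax(masked))
    return weights, mix(weights, values)


# Toy usage: four 2-dimensional node embeddings used as queries, keys, values.
nodes = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
prompt = constraint_prompt({"capacity", "time_window"})
global_w, global_out = global_attention(nodes, nodes, nodes)
sparse_w, sparse_out = topk_sparse_attention(nodes, nodes, nodes, k=2)
```

In this toy setup the global branch assigns a nonzero weight to every node, while the sparse branch concentrates all weight on the two highest-scoring nodes per query. This is the contrast between broad context and selective focus that the abstract describes.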
