Towards Generalizable Neural Solvers for Vehicle Routing Problems via Ensemble with Transferrable Local Policy
Chengrui Gao, Haopu Shang, Ke Xue, Dong Li, Chao Qian
TL;DR
The paper tackles omni-generalization in neural Vehicle Routing Problem solvers by proposing ELG, an ensemble of a global construction policy and a transferrable local policy. The local policy restricts attention to a local neighborhood of size $K$ and uses polar coordinates with a lightweight attention network, while the global policy (POMO) is augmented with a normalized distance penalty; both are trained jointly via REINFORCE with a shared baseline. Empirical results on TSPLIB and CVRPLIB show that ELG significantly improves cross-distribution and cross-scale generalization and performs well on large real-world CVRP instances, with ablations confirming the local policy's key role. While ELG achieves strong generalization and practical speedups, it incurs higher latency than some divide-and-conquer methods on extremely large instances, motivating future work on latency reduction and broader CO applications.
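The local policy's input can be illustrated with a minimal sketch: for the current node, keep the $K$ nearest candidates and express each one in polar coordinates relative to that node. The function name `local_polar_features` and the normalization of distances by the largest distance in the neighborhood are assumptions made for illustration; this summary does not specify the exact feature scaling used in ELG.

```python
import math


def local_polar_features(current, candidates, k):
    """Return (index, rho, theta) for the k candidates nearest to `current`.

    rho is the Euclidean distance divided by the largest distance inside the
    neighborhood (an assumed normalization); theta is the angle in radians.
    """
    cx, cy = current
    # Distance and angle of every candidate relative to the current node.
    polar = []
    for idx, (x, y) in enumerate(candidates):
        dx, dy = x - cx, y - cy
        polar.append((idx, math.hypot(dx, dy), math.atan2(dy, dx)))
    # Keep only the k nearest candidates (the local neighborhood).
    neighbors = sorted(polar, key=lambda t: t[1])[:k]
    # Scale distances to [0, 1] so the features do not depend on instance scale.
    max_rho = max((rho for _, rho, _ in neighbors), default=0.0) or 1.0
    return [(idx, rho / max_rho, theta) for idx, rho, theta in neighbors]


if __name__ == "__main__":
    nodes = [(1.0, 0.0), (0.0, 2.0), (5.0, 5.0), (-4.0, 0.0)]
    print(local_polar_features((0.0, 0.0), nodes, k=2))
```

Because the features depend only on relative positions inside the neighborhood, the same small network could in principle be applied to instances of any size or node distribution, which is the intuition behind calling the local policy transferrable.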
Abstract
Machine learning has been adapted to help solve NP-hard combinatorial optimization problems. One prevalent approach is learning to construct solutions with deep neural networks, which has received growing attention because of its high efficiency and low reliance on expert knowledge. However, many neural construction methods for Vehicle Routing Problems (VRPs) focus on synthetic problem instances with specified node distributions and limited scales, leading to poor performance on real-world problems, which usually involve complex and unknown node distributions together with large scales. To make neural VRP solvers more practical, we design an auxiliary policy that learns from local transferable topological features, named the local policy, and integrate it with a typical construction policy (which learns from the global information of VRP instances) to form an ensemble policy. With joint training, the aggregated policies perform cooperatively and complementarily to boost generalization. Experimental results on two well-known benchmarks, TSPLIB for the travelling salesman problem and CVRPLIB for the capacitated VRP, show that the ensemble policy significantly improves both cross-distribution and cross-scale generalization performance, and even performs well on real-world problems with several thousand nodes.
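One way to picture the ensemble and its joint training is sketched below. It assumes the two policies are combined by adding the local policy's logits to the global policy's logits for the nodes in the local neighborhood, followed by a softmax; this combination rule is an assumption, since the text here does not state it. The second function shows a shared baseline in the POMO style, taken here to be the mean reward over several rollouts of the same instance. The names `ensemble_probs` and `shared_baseline_advantages` are hypothetical.

```python
import math


def ensemble_probs(global_logits, local_logits, neighbor_ids):
    """Combine global and local logits into one action distribution.

    Assumption: the local logits are added to the global logits of the
    neighborhood nodes; nodes outside the neighborhood keep the global logit.
    """
    combined = list(global_logits)
    for idx, logit in zip(neighbor_ids, local_logits):
        combined[idx] += logit
    # Numerically stable softmax over all candidate nodes.
    peak = max(combined)
    exps = [math.exp(v - peak) for v in combined]
    total = sum(exps)
    return [e / total for e in exps]


def shared_baseline_advantages(rewards):
    """REINFORCE advantages using the mean rollout reward as the baseline."""
    baseline = sum(rewards) / len(rewards)
    return [r - baseline for r in rewards]


if __name__ == "__main__":
    # Three candidate nodes; the local policy only scores node 1.
    print(ensemble_probs([0.0, 0.0, 0.0], [math.log(2.0)], [1]))
    # Rewards (e.g. negative tour lengths) of three rollouts on one instance.
    print(shared_baseline_advantages([-10.0, -12.0, -11.0]))
```

Under this sketch, a single advantage signal weights the log-probability of the combined distribution, so gradients reach both policies at once, which is one plausible reading of "joint training".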
