Bridging Synthetic and Real Routing Problems via LLM-Guided Instance Generation and Progressive Adaptation
Jianghan Zhu, Yaoxin Wu, Zhuoyi Lin, Zhengyuan Zhang, Haiyan Yin, Zhiguang Cao, Senthilnath Jayavelu, Xiaoli Li
TL;DR
EvoReal tackles the generalization gap that neural VRP solvers face when moving from synthetic, uniformly distributed data to real-world VRP benchmarks (TSPLib and CVRPLib). It introduces an LLM-guided evolutionary framework that designs and evolves data generators to produce structurally realistic VRP instances, followed by a two-phase progressive fine-tuning of pre-trained neural solvers to align them with real distributions and scales. The approach yields state-of-the-art generalization across problem sizes, significantly reducing the gap to optimal solutions on TSPLib ($1.05\%$) and CVRPLib ($2.71\%$), without changing model architectures. This data-centric method demonstrates the power of leveraging LLMs for distributional alignment and offers a practical pathway to deploying neural VRP solvers in real-world settings.
Abstract
Recent advances in Neural Combinatorial Optimization (NCO) methods have significantly improved the capability of neural solvers to handle synthetic routing instances. Nonetheless, existing neural solvers typically struggle to generalize effectively from synthetic, uniformly distributed training data to real-world VRP scenarios, including widely recognized benchmark instances from TSPLib and CVRPLib. To bridge this generalization gap, we present Evolutionary Realistic Instance Synthesis (EvoReal), which leverages an evolutionary module guided by large language models (LLMs) to generate synthetic instances characterized by diverse and realistic structural patterns. Specifically, the evolutionary module produces synthetic instances whose structural attributes statistically mimic those observed in authentic real-world instances. Subsequently, pre-trained NCO models are progressively refined, first by aligning them with these structurally enriched synthetic distributions and then by further adapting them through direct fine-tuning on actual benchmark instances. Extensive experimental evaluations demonstrate that EvoReal markedly improves the generalization capabilities of state-of-the-art neural solvers, yielding a notably reduced performance gap compared to the optimal solutions on the TSPLib (1.05%) and CVRPLib (2.71%) benchmarks across a broad spectrum of problem scales.
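For readers unfamiliar with the reported metric, the sketch below shows one common way a performance gap relative to optimal solutions is computed: the relative excess of a solver's tour cost over the optimal (or best-known) cost, averaged over instances. This is an assumption about the metric, not a definition taken from the text above; the function names `optimality_gap` and `mean_gap` and the numbers in the example are hypothetical.

```python
from typing import Iterable, Tuple


def optimality_gap(cost: float, optimal: float) -> float:
    """Relative gap in percent between a solver's cost and the reference cost.

    Assumed definition: 100 * (cost - optimal) / optimal.
    """
    if optimal <= 0:
        raise ValueError("reference cost must be positive")
    return 100.0 * (cost - optimal) / optimal


def mean_gap(pairs: Iterable[Tuple[float, float]]) -> float:
    """Average the per-instance gaps over (solver_cost, optimal_cost) pairs."""
    gaps = [optimality_gap(cost, opt) for cost, opt in pairs]
    if not gaps:
        raise ValueError("need at least one instance")
    return sum(gaps) / len(gaps)


# Hypothetical costs for two instances (illustrative only, not benchmark data).
example = [(101.0, 100.0), (206.0, 200.0)]
print(f"mean gap: {mean_gap(example):.2f}%")
```

Under this definition, a gap of 1.05% would mean that the solver's tours are on average 1.05% longer than the reference solutions.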
