Learning-Based TSP-Solvers Tend to Be Overly Greedy
Xiayang Li, Shihua Zhang
TL;DR
The paper investigates why learning-based TSP solvers often rely on greedy heuristics by introducing the nearest-neighbor density ρ_n to quantify the extent to which optimal tours align with nearest-neighbor relations. It shows that across random uniform, random normal, and real-world TSPLib instances, ρ_n remains high, suggesting a persistent greedy bias, and it provides an asymptotic lower bound ρ_n ≥ (27−32β)/7 with β ≈ 0.7124. To combat this bias, the authors propose distribution-shift and perturbation-based data augmentation, including scale-free network and drilling-pattern instances, and demonstrate that fine-tuning with augmented data improves generalization on diverse tests. However, they prove fundamental limits: there is no efficient complete generator based on ρ_n unless NP = coNP, and no efficient algorithmic coverage by a polynomially sized solver ensemble unless NP = P, implying that universal neural solvers for TSP are unlikely. The work emphasizes the need for realistic benchmarks and interpretable features to advance AI-powered combinatorial optimization, rather than pursuing universal solvers via data augmentation alone.
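As a quick sanity check on the magnitude of the stated bound, the expression (27−32β)/7 can be evaluated numerically. The snippet below is only an illustrative calculation; the function name `rho_lower_bound` is hypothetical and not taken from the paper.

```python
# Evaluate the asymptotic lower bound (27 - 32*beta) / 7 quoted in the TL;DR.
# BETA is the value beta ≈ 0.7124 given in the text; the function name is
# a hypothetical helper introduced only for this illustration.
BETA = 0.7124


def rho_lower_bound(beta: float = BETA) -> float:
    """Return (27 - 32*beta) / 7 for the given beta."""
    return (27 - 32 * beta) / 7


bound = rho_lower_bound()
print(round(bound, 4))  # → 0.6005
```

With β ≈ 0.7124 the bound comes out at roughly 0.60.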
Abstract
Deep learning has shown significant potential in solving combinatorial optimization problems such as the Euclidean traveling salesman problem (TSP). However, most training and test instances for existing TSP algorithms are generated randomly from specific distributions such as the uniform distribution. As a result, the performance of deep learning algorithms in out-of-distribution (OOD) generalization scenarios, which is closely related to worst-case performance in combinatorial optimization, remains poorly analyzed and understood. For data-driven algorithms, the statistical properties of randomly generated datasets are critical. This study constructs a statistical measure called nearest-neighbor density to verify the asymptotic properties of randomly generated datasets and to reveal the greedy behavior of learning-based solvers, i.e., always choosing the nearest-neighbor nodes to construct the solution path. Based on this statistical measure, we develop interpretable data augmentation methods that rely on distribution shifts or instance perturbations and validate that the performance of learning-based solvers degrades substantially on such augmented data. Moreover, fine-tuning learning-based solvers with augmented data further enhances their generalization abilities. In short, we show that learning-based TSP solvers tend to be overly greedy, a limitation that may have profound implications for AI-empowered combinatorial optimization solvers.
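To make the idea of nearest-neighbor density concrete, here is a minimal sketch in Python. The abstract does not give the formal definition, so this sketch assumes one plausible reading: the fraction of tour edges (u, v) for which v is the nearest neighbor of u or u is the nearest neighbor of v. The function names, the tie-breaking rule (lowest index wins), and the toy instance are all hypothetical, and the paper's actual definition may differ.

```python
import math


def nearest_neighbor(points, i):
    """Index of the point closest to points[i], excluding i itself.

    Ties are broken by lowest index (an arbitrary choice for this sketch).
    """
    others = (j for j in range(len(points)) if j != i)
    return min(others, key=lambda j: math.dist(points[i], points[j]))


def nearest_neighbor_density(points, tour):
    """Fraction of tour edges that join a node to its nearest neighbor.

    ASSUMED definition: an edge (u, v) counts if v is u's nearest neighbor
    or u is v's nearest neighbor. `tour` is a permutation of node indices
    and is treated as a closed cycle.
    """
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    n = len(tour)
    hits = 0
    for k in range(n):
        u, v = tour[k], tour[(k + 1) % n]
        if nn[u] == v or nn[v] == u:
            hits += 1
    return hits / n


# Toy instance: five points and one valid tour (not claimed to be optimal).
points = [(0, 0), (1, 0), (4, 0), (4, 1.5), (0, 3)]
tour = [0, 1, 2, 3, 4]
density = nearest_neighbor_density(points, tour)
print(density)  # → 0.6
```

In this toy instance, three of the five tour edges connect a node to its nearest neighbor. A solver that always extends the path to the nearest unvisited node would tend to do well on instances where this fraction is high and poorly where it is low, which is the intuition behind using the measure to design harder augmented data.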
