Looking Ahead to Avoid Being Late: Solving Hard-Constrained Traveling Salesman Problem

Jingxiao Chen, Ziqin Gong, Minghuan Liu, Jun Wang, Yong Yu, Weinan Zhang

TL;DR

This work tackles hard-constrained TSPTW by introducing MUSLA, a supervised-learning framework that leverages one-step and multi-step look-ahead information as dynamic features to improve solution legality. By augmenting expert datasets with OSLA and MUSLA look-ahead data, and by training a policy that balances feasibility with near-optimality, the approach achieves strong performance on newly constructed Medium and Hard datasets, while offering significant GPU-based speedups over traditional heuristics. The results demonstrate superior balance between legality and optimality compared with RL baselines and establish a practical, data-driven pathway for solving constrained routing problems, though the need for high-quality expert data remains a limitation. The authors also propose MUSLA-adapt to tune legality during inference and present extensive ablations, underscoring the value of dynamic information and dataset diversity for hard-constrained sequence decision problems.

Abstract

Many real-world problems can be formulated as a constrained Traveling Salesman Problem (TSP). However, the constraints are often complex and numerous, making such TSPs challenging to solve. As the number of complicated constraints grows, traditional heuristic algorithms spend increasing time avoiding illegal outcomes. Learning-based methods offer an alternative that handles constraints in a soft manner and supports GPU acceleration for fast solution generation. Nevertheless, this soft treatment makes hard-constrained problems difficult for learning algorithms, and the conflict between legality and optimality may substantially degrade solution quality. To address this, we propose MUSLA, a novel learning-based method that uses look-ahead information as a feature to improve the legality of solutions to the TSP with Time Windows (TSPTW). In addition, we construct TSPTW datasets with hard constraints to accurately evaluate and benchmark the statistical performance of various approaches, which can serve the community in future research. In comprehensive experiments on diverse datasets, MUSLA outperforms existing baselines and shows potential to generalize.

Paper Structure (35 sections, 7 equations, 4 figures, 6 tables)

Figures (4)

  • Figure 1: Method pipeline of MUSLA. We labeled expert datasets with LKH3 solutions in order to train a faster learning-based solver. With one-step look-ahead augmented datasets, we trained OSLA policy $\pi^{+1}$. OSLA policy directed further multi-step look-ahead data augmentation, resulting in MUSLA policy $\pi^{+m}$.
  • Figure 2: Illustration of the multi-step look-ahead mechanism for $m=1$. Subfigure (a) shows the route construction at step $i$. Subfigures (b) and (c) illustrate the process of information gathering. Orange nodes have been determined to be in the current route $X_{0:i}$. Blue nodes are temporarily added to the route during the search. Future information is gathered from green points.
  • Figure 3: Weighted score of different models. The dotted lines highlight reasonable ranges of $\gamma$.
  • Figure 4: Network structure of our policy.
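
The look-ahead idea described in the Figure 2 caption (tentatively extend the current route with a candidate node, then gather information about what remains reachable) can be illustrated with a minimal one-step sketch. This is not the authors' feature definition or implementation: the function names, the use of Euclidean distance as travel time, and the specific feature (a count of nodes that would become late) are all assumptions chosen for illustration.

```python
import math

def travel_time(a, b):
    """Euclidean distance, used here as travel time (an assumption)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def arrive(t_now, src, dst, coords, windows):
    """Service start time at dst when leaving src at t_now.

    Waiting until the window opens is allowed. Returns None if the
    arrival would be after the window closes (a hard-constraint violation).
    """
    t = t_now + travel_time(coords[src], coords[dst])
    open_t, close_t = windows[dst]
    if t > close_t:
        return None
    return max(t, open_t)

def one_step_look_ahead(route, t_now, coords, windows):
    """Hypothetical one-step look-ahead features for each candidate node.

    Each unvisited candidate is tentatively appended to the route; we then
    count how many of the other unvisited nodes could no longer be reached
    on time if visited directly after that candidate.
    """
    unvisited = [n for n in range(len(coords)) if n not in route]
    features = {}
    for cand in unvisited:
        t_cand = arrive(t_now, route[-1], cand, coords, windows)
        if t_cand is None:
            # The candidate itself is already too late.
            features[cand] = {"feasible": False, "late_next": None}
            continue
        late = [
            n for n in unvisited
            if n != cand and arrive(t_cand, cand, n, coords, windows) is None
        ]
        features[cand] = {"feasible": True, "late_next": len(late)}
    return features

# Toy instance: node 0 is the depot; node 2 has a tight time window.
coords = [(0, 0), (1, 0), (2, 0), (10, 0)]
windows = [(0, 100), (0, 100), (0, 3), (0, 100)]
feats = one_step_look_ahead(route=[0], t_now=0.0, coords=coords, windows=windows)
for node, f in feats.items():
    print(node, f)
```

In this toy instance, going to the distant node 3 first would make node 2 miss its window, so its `late_next` count is nonzero, while the nearby candidates keep every remaining node reachable. A learned policy could consume signals of this kind as dynamic input features; extending the same tentative-append step recursively would give a multi-step variant.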