Unsupervised Learning for Solving the Travelling Salesman Problem

Yimeng Min; Yiwei Bai; Carla P. Gomes

TL;DR

UTSP tackles the NP-hard Travelling Salesman Problem with an unsupervised framework that builds a heat map over edges using a Scattering Attention Graph Neural Network. A differentiable surrogate loss couples a short-path objective with a Hamiltonian Cycle constraint, enabling end-to-end learning without labeled solutions, followed by a non-autoregressive heat-map–guided local search. Empirically, UTSP matches or surpasses existing data-driven TSP heuristics, especially on large instances (up to $n=1000$), while using far fewer training samples and parameters, and significantly reducing the search space via a non-smooth heat map. The approach demonstrates that expressive GNNs and a carefully designed surrogate can yield strong, data-efficient combinatorial optimization heuristics with practical impact for scalable TSP solving.

Abstract

We propose UTSP, an unsupervised learning (UL) framework for solving the Travelling Salesman Problem (TSP). We train a Graph Neural Network (GNN) using a surrogate loss. The GNN outputs a heat map representing the probability for each edge to be part of the optimal path. We then apply local search to generate our final prediction based on the heat map. Our loss function consists of two parts: one pushes the model to find the shortest path and the other serves as a surrogate for the constraint that the route should form a Hamiltonian Cycle. Experimental results show that UTSP outperforms the existing data-driven TSP heuristics. Our approach is parameter efficient as well as data efficient: the model takes $\sim$ 10\% of the number of parameters and $\sim$ 0.2\% of training samples compared with reinforcement learning or supervised learning methods.
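The two-part loss described above can be illustrated with a small sketch. This is not the paper's exact loss, which is not reproduced in this summary. It assumes that the heat map is built by pairing each column of a soft matrix `T` with the next column cyclically, that the Hamiltonian Cycle surrogate is a row-sum penalty plus a self-loop penalty, and that the weights `lam_row` and `lam_diag` are hypothetical placeholders.

```python
import math


def heat_map(T):
    """Assumed T -> H map: H[a][b] = sum_t T[a][t] * T[b][(t + 1) % n]."""
    n = len(T)
    return [
        [sum(T[a][t] * T[b][(t + 1) % n] for t in range(n)) for b in range(n)]
        for a in range(n)
    ]


def surrogate_loss(T, D, lam_row=10.0, lam_diag=0.1):
    """Illustrative loss: constraint surrogate plus expected tour length.

    lam_row and lam_diag are hypothetical weights, not values from the paper.
    """
    n = len(T)
    H = heat_map(T)
    # Constraint surrogate (assumed form): each row of T should sum to 1,
    # and the heat map should put no mass on self-loops.
    row_penalty = sum((sum(T[i]) - 1.0) ** 2 for i in range(n))
    self_loops = sum(H[i][i] for i in range(n))
    # Short-path term: distances weighted by the heat map.
    length = sum(D[i][j] * H[i][j] for i in range(n) for j in range(n))
    return lam_row * row_penalty + lam_diag * self_loops + length


# Four cities on the unit square (hypothetical data).
cities = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
D = [[math.dist(p, q) for q in cities] for p in cities]

# A hard permutation matrix: the penalties vanish and the loss is the tour length.
T_tour = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
loss_tour = surrogate_loss(T_tour, D)
```

With a hard permutation matrix both penalty terms are zero, so the value reduces to the length of the tour visiting the cities in order 0, 1, 2, 3, 0. In training, `T` would be a soft GNN output and the loss would be minimized by gradient descent.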

Paper Structure (25 sections, 4 theorems, 11 equations, 5 figures, 4 tables)

Introduction
Our Model
Methodologies
Graph Neural Network
Building the Heat Map using the soft indicator matrix
$\mathbb{T} \rightarrow \mathcal{H}$ transformation
Unsupervised Loss
Edge Elimination
Local Search
Heat Map Guided Best-first Local Search
Updating the Heat Map
Leveraging Randomness
Experiments
Dataset
Results
...and 10 more sections

Key Result

Lemma D.1

Let $q_i \in \{1,2,\dots,n\}$ denote the row index of the non-zero element in the $i$-th column of $\mathbb{T}$, that is, $\mathbb{T}_{q_i,i}=1$. If every row and every column of $\mathbb{T}$ contains exactly one entry equal to 1 (True) and $n-1$ entries equal to 0 (False), then $q_i = q_j$ if and only if $i = j$.
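The lemma can be checked on a small instance. The sketch below uses a hypothetical 4x4 matrix and 0-based indices (the lemma itself is 1-based); the helper names `is_permutation_matrix` and `nonzero_rows` are illustrative and do not come from the paper.

```python
def is_permutation_matrix(T):
    """True when every row and every column holds exactly one 1 and n-1 zeros."""
    n = len(T)
    expected = [0] * (n - 1) + [1]
    rows_ok = all(sorted(row) == expected for row in T)
    cols_ok = all(sorted(T[r][c] for r in range(n)) == expected for c in range(n))
    return rows_ok and cols_ok


def nonzero_rows(T):
    """Return q, where q[i] is the row index of the single 1 in column i."""
    n = len(T)
    q = []
    for i in range(n):
        rows = [r for r in range(n) if T[r][i] == 1]
        if len(rows) != 1:
            raise ValueError(f"column {i} does not contain exactly one 1")
        q.append(rows[0])
    return q


# Hypothetical example matrix.
T = [
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]

q = nonzero_rows(T)                      # [2, 0, 3, 1]
all_distinct = len(set(q)) == len(q)     # q_i = q_j only when i = j
```

Because each row holds a single 1, no two columns can share a row index, which is exactly the claim that `q` has no repeated entries.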

Figures (5)

Figure 1: We use a Scattering Attention GNN (SAG) to generate a non-smooth soft indicator matrix $\mathbb{T}$. The SAG model takes the city coordinates and the weighted adjacency matrix as input. We then build the heat map $\mathcal{H}$ from $\mathbb{T}$ using the transformation in Equation \ref{eq:TH}.
Figure 2: TSP 100 training curve using the unsupervised surrogate loss. We compare two GNN models: GCN \cite{kipf2016semi} and SAG \cite{min2022can}, where GCN is a low-pass model and SAG is a low-pass + band-pass model.
Figure 3: Left: The heat map $\mathcal{H}$ generated using GCN on TSP 100. Right: The heat map $\mathcal{H}$ generated using SAG on TSP 100. In both panels the diagonal elements are set to 0, and the $x$-axis and $y$-axis are the city indices.
Figure 4: Left: Average edge overlap coefficient $\eta$ versus training epochs using SAG and GCN on TSP 100 ($M=10$). Right: Number of fully covered instances versus training epochs using SAG and GCN on TSP 100. The validation set consists of 1,000 samples ($M=10$).
Figure 5: Illustration of building a cycle from the transition matrix $\mathbb{T}$. Here $q_{k-1} = 3$, $q_k = n-2$, $q_{k+1} = 2$, $q_{n-1} = 1$, $q_n = n$ and $q_1 = 4$. This means $\mathcal{H}$ contains the following four directed edges: $3 \rightarrow n-2$, $n-2 \rightarrow 2$, $1 \rightarrow n$ and $n \rightarrow 4$.
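The construction in Figure 5 can be sketched in code. The exact $\mathbb{T} \rightarrow \mathcal{H}$ equation is not reproduced in this summary, so the sketch assumes what the caption's edges suggest: the non-zero row of each column is linked to the non-zero row of the next column, with the last column wrapping around to the first. Indices are 0-based, and the names `heat_map_from_T` and `tour_from_H` are hypothetical.

```python
def heat_map_from_T(T):
    """Assumed transformation: H[a][b] = sum_t T[a][t] * T[b][(t + 1) % n]."""
    n = len(T)
    H = [[0.0] * n for _ in range(n)]
    for t in range(n):
        nxt = (t + 1) % n  # the last column wraps around to the first
        for a in range(n):
            for b in range(n):
                H[a][b] += T[a][t] * T[b][nxt]
    return H


def tour_from_H(H, start=0):
    """Follow the strongest outgoing edge from each city, starting at `start`."""
    n = len(H)
    tour = [start]
    cur = start
    for _ in range(n - 1):
        cur = max(range(n), key=lambda b: H[cur][b])
        tour.append(cur)
    return tour


# Hypothetical permutation matrix whose column-wise non-zero rows are q = [2, 0, 3, 1].
T = [
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]

H = heat_map_from_T(T)
# Directed edges with weight 1: 2 -> 0, 0 -> 3, 3 -> 1, 1 -> 2.
edges = sorted((a, b) for a in range(4) for b in range(4) if H[a][b] == 1.0)
tour = tour_from_H(H)   # [0, 3, 1, 2]
```

When `T` is a hard permutation matrix, every city has exactly one outgoing and one incoming edge in `H`, and following those edges visits all cities once before returning to the start.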

Theorems & Definitions (8)

Lemma D.1
proof
Lemma D.2
proof
Lemma D.3
proof
Corollary D.4
proof
