TAXI: Traveling Salesman Problem Accelerator with X-bar-based Ising Macros Powered by SOT-MRAMs and Hierarchical Clustering
Sangmin Yoo, Amod Holla, Sourav Sanyal, Dong Eun Kim, Francesca Iacopi, Dwaipayan Biswas, James Myers, Kaushik Roy
TL;DR
TAXI addresses the scalability gap in Ising-based TSP solvers by combining in-memory crossbar (Xbar) Ising macros, which use SOT-MRAM devices as random number generators, with a hierarchical clustering strategy that decomposes a large TSP into sub-problems solved in parallel. Each macro minimizes the energy of the Ising Hamiltonian through multiply-accumulate (MAC) operations and applies a stochastic decision mechanism, so annealing runs quickly and energy-efficiently inside the memory array. Key contributions include a novel W_D distance mapping; a dedicated Ising macro that combines superposition, distance calculation, stochastic vectors, and an annealing schedule; and a hierarchical clustering framework with fixed inter-cluster routes and aggressive parallelism mapped onto Xbar hardware. In evaluation, TAXI is 8× faster on average than a state-of-the-art clustering-based Ising solver across 20 TSPLIB benchmarks (up to 85,900 cities), and its tours are 22% and 20% longer than Concorde's on the 33,810- and 85,900-city instances, respectively, which supports hardware-algorithm co-design as a practical approach to large-scale combinatorial optimization.
Abstract
Ising solvers with hierarchical clustering have shown promise for large-scale Traveling Salesman Problems (TSPs) in terms of latency and energy. However, most of these methods still suffer unacceptable quality degradation as the problem size grows. Additionally, their hardware-agnostic design limits their ability to fully exploit available hardware resources. In this work, we introduce TAXI, an in-memory computing-based TSP accelerator with crossbar (Xbar)-based Ising macros. Each macro independently solves a TSP sub-problem, obtained by hierarchical clustering, without any off-macro data movement, which enables massive parallelism. Within each macro, Spin-Orbit-Torque (SOT) devices serve as compact, energy-efficient random number generators that enable rapid "natural annealing". By leveraging hardware-algorithm co-design, TAXI offers improvements in solution quality, speed, and energy efficiency on TSPs of up to 85,900 cities (the largest TSPLIB instance). TAXI produces solutions that are only 22% and 20% longer than the Concorde solver's exact solutions on the 33,810- and 85,900-city TSPs, respectively. TAXI outperforms a current state-of-the-art clustering-based Ising solver, being 8× faster on average across 20 benchmark problems from TSPLIB.
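
To make the decomposition idea concrete, the following is a minimal software sketch of a cluster-then-anneal flow. It is not the TAXI implementation: plain k-means stands in for the hierarchical clustering, and a swap-based simulated annealing routine with Python's pseudo-random generator stands in for the Xbar Ising macro and its SOT-MRAM random number generators. Every function name and parameter here (`kmeans`, `anneal_order`, `solve_hierarchical`, `steps`, `t_start`, `t_end`) is hypothetical and chosen only for illustration.

```python
import math
import random


def tour_length(points, order):
    """Length of the closed tour that visits points in the given index order."""
    n = len(order)
    return sum(math.dist(points[order[i]], points[order[(i + 1) % n]])
               for i in range(n))


def anneal_order(points, order, rng, steps=2000, t_start=1.0, t_end=0.01):
    """Swap-based simulated annealing on one sub-problem.

    Software stand-in (assumption) for a single Ising macro; the paper's
    macro uses MAC-based energy evaluation, which is not modeled here.
    """
    order = list(order)
    if len(order) < 3:
        return order
    cur_len = tour_length(points, order)
    best, best_len = list(order), cur_len
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        new_len = tour_length(points, order)
        # Always accept improvements; accept worse moves stochastically.
        if new_len <= cur_len or rng.random() < math.exp((cur_len - new_len) / temp):
            cur_len = new_len
            if cur_len < best_len:
                best, best_len = list(order), cur_len
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
    return best


def kmeans(points, k, rng, iters=10):
    """Plain k-means; returns the non-empty clusters as lists of city indices."""
    centers = rng.sample(points, k)
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(idx)
        centers = [centroid(points, g) if g else centers[c]
                   for c, g in enumerate(groups)]
    return [g for g in groups if g]


def centroid(points, group):
    """Mean position of the cities in one cluster."""
    return (sum(points[i][0] for i in group) / len(group),
            sum(points[i][1] for i in group) / len(group))


def solve_hierarchical(points, k, seed=0):
    """Cluster the cities, fix the cluster visiting order, then solve each cluster."""
    rng = random.Random(seed)
    groups = kmeans(points, k, rng)
    centroids = [centroid(points, g) for g in groups]
    # Top level: order the clusters by annealing over their centroids.
    cluster_order = anneal_order(centroids, range(len(groups)), rng)
    tour = []
    for c in cluster_order:
        # Each sub-problem is independent, so these calls could run in parallel.
        tour.extend(anneal_order(points, groups[c], rng))
    return tour
```

A usage example under the same assumptions: `solve_hierarchical(cities, k=4)` returns a permutation of the city indices, and `tour_length(cities, tour)` gives its length. The sketch simply concatenates the per-cluster tours in cluster order, so it ignores how the endpoints of adjacent clusters are joined; that simplification is one reason its tour quality says nothing about TAXI's reported results.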
