Tensorized Ant Colony Optimization for GPU Acceleration
Luming Yang, Tao Jiang, Ran Cheng
TL;DR
This work tackles the CPU-limited scalability of Ant Colony Optimization (ACO) on large-scale TSP by introducing TensorACO, a GPU-accelerated framework that tensorizes both the ant system and the ant path. Key components include a preprocessing step that computes a probability transition matrix $\mathbf{M}_p$, tensorized representations of ant movement and path updates with index-based aggregation, and AdaIR for parallel, adaptive city selection. The authors report speedups of up to $1921\times$ over CPU ACO and, with AdaIR, up to $80\%$ faster convergence and $2\%$ better solution quality, validated on TSPLIB instances within a unified EvoX environment. These results highlight TensorACO's potential to enable scalable, high-performance ACO for very large TSP instances and motivate applying tensorized GPU approaches to other combinatorial optimization problems.
Abstract
Ant Colony Optimization (ACO) is renowned for its effectiveness in solving Traveling Salesman Problems, yet it faces computational challenges in CPU-based environments, particularly on large-scale instances. In response, we introduce Tensorized Ant Colony Optimization (TensorACO), which exploits GPU acceleration. At its core, TensorACO fully transforms the ant system and the ant path into tensor forms, a process we refer to as tensorization. For the tensorization of the ant system, we propose a preprocessing method that reduces computational overhead by calculating the probability transition matrix in advance. For the tensorization of the ant path, we propose an index mapping method that accelerates the update of the pheromone matrix by replacing sequential path updates with parallel matrix operations. Additionally, we introduce an Adaptive Independent Roulette (AdaIR) method to overcome the challenges of parallelizing ACO's selection mechanism on GPUs. Comprehensive experiments demonstrate the superior performance of TensorACO, which achieves up to 1921$\times$ speedup over standard ACO. Moreover, the AdaIR method further improves TensorACO's convergence speed by 80% and solution quality by 2%. Source code is available at https://github.com/EMI-Group/tensoraco.
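To make the two tensorization ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the classic Ant System rules (transition weights $\tau^{\alpha}\eta^{\beta}$ with $\eta = 1/d$, and a deposit of $Q/L$ per tour edge on a symmetric TSP), and it uses NumPy on the CPU as a stand-in for a GPU tensor library. The function names `transition_matrix`, `tour_lengths`, and `deposit_pheromone`, and the parameters `alpha`, `beta`, `rho`, and `q`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def transition_matrix(tau, dist, alpha=1.0, beta=2.0):
    """Precompute unnormalized transition weights tau^alpha * eta^beta.

    Assumption: eta = 1 / dist off the diagonal and 0 on the diagonal,
    as in the classic Ant System. Computed once per iteration so that
    ants only need to look up rows of this matrix.
    """
    n = dist.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    eta = np.zeros((n, n), dtype=float)
    eta[off_diag] = 1.0 / dist[off_diag]
    return tau ** alpha * eta ** beta


def tour_lengths(dist, tours):
    """Length of each closed tour; tours has shape (n_ants, n_cities)."""
    nxt = np.roll(tours, -1, axis=1)  # successor city, wrapping to the start
    return dist[tours, nxt].sum(axis=1)


def deposit_pheromone(tau, tours, lengths, rho=0.1, q=1.0):
    """Evaporate, then deposit on all edges of all tours in one scatter-add.

    The (src, dst) index arrays replace the sequential loop over ants
    and tour positions. np.add.at accumulates correctly when several
    ants share the same edge.
    """
    src = tours
    dst = np.roll(tours, -1, axis=1)
    amount = np.broadcast_to((q / lengths)[:, None], tours.shape)
    new_tau = (1.0 - rho) * tau
    np.add.at(new_tau, (src, dst), amount)
    np.add.at(new_tau, (dst, src), amount)  # symmetric TSP assumption
    return new_tau
```

On a GPU framework the same pattern would map onto that framework's scatter-add primitive. The point of the sketch is only that both the transition weights and the pheromone update can be written without a Python-level loop over ants.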
