Solving the QAP by Two-Stage Graph Pointer Networks and Reinforcement Learning
Satoko Iida, Ryota Yasudo
TL;DR
The paper addresses the NP-hard Quadratic Assignment Problem (QAP) by framing it as a neural combinatorial optimization task. It extends Graph Pointer Networks (GPN) to handle matrix-input TSP and introduces a two-stage GPN architecture for QAP that uses a Distance-Flow Product (DFP) representation and a block-wise solving strategy. Empirical results show that the approach yields semi-optimal solutions faster than traditional heuristics, though performance varies by instance type, especially in sparse or triangular cases. The work contributes a scalable, reinforcement-learning-based solver and releases code for broader use and benchmarking.
Abstract
The Quadratic Assignment Problem (QAP) is a practical combinatorial optimization problem that has been studied for several years. Since it is NP-hard, solving large QAP instances is challenging. Although heuristics can find semi-optimal solutions, their execution time increases significantly as the problem size grows. Recently, solving combinatorial optimization problems with deep learning has attracted attention as a faster alternative to heuristics. Even with deep learning, however, solving large QAP instances remains challenging. In this paper, we propose a deep reinforcement learning model called the two-stage graph pointer network (GPN) for solving QAP. The two-stage GPN builds on GPN, which was proposed for the Euclidean Traveling Salesman Problem (TSP). First, we extend GPN to general TSP, and then we add new algorithms to that model for solving QAP. Our experimental results show that the two-stage GPN provides semi-optimal solutions for benchmark problem instances from TSPlib and QAPLIB.
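To make the problem statement concrete, the following is a minimal sketch of the standard QAP objective, assuming the usual formulation in which facility i is assigned to location perm[i] and the cost is the sum over all pairs of flow times distance. The function names `qap_cost` and `brute_force_qap` and the 3x3 matrices are hypothetical and chosen only for illustration; this is not the two-stage GPN, and the exhaustive search is included only to show why exact solving becomes impractical as the number of permutations grows factorially.

```python
from itertools import permutations


def qap_cost(flow, dist, perm):
    """Return sum over i, j of flow[i][j] * dist[perm[i]][perm[j]].

    perm[i] is the location assigned to facility i (illustrative convention).
    """
    n = len(perm)
    return sum(
        flow[i][j] * dist[perm[i]][perm[j]]
        for i in range(n)
        for j in range(n)
    )


def brute_force_qap(flow, dist):
    """Enumerate all n! assignments; feasible only for very small n."""
    n = len(flow)
    return min(permutations(range(n)), key=lambda p: qap_cost(flow, dist, p))


if __name__ == "__main__":
    # Hypothetical toy instance: symmetric flow and distance matrices.
    flow = [[0, 5, 2], [5, 0, 3], [2, 3, 0]]
    dist = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
    best = brute_force_qap(flow, dist)
    print(best, qap_cost(flow, dist, best))
```

A learned solver such as the one proposed here would replace the exhaustive search with a policy that constructs the permutation step by step, using a cost of this form as the reinforcement learning signal.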
