CHARME: A chain-based reinforcement learning approach for the minor embedding problem
Hoang M. Ngo, Nguyen H K. Do, Minh N. Vu, Tre' R. Jeter, Tamer Kahveci, My T. Thai
TL;DR
This work tackles the NP-hard minor embedding problem, which is critical to quantum annealing performance. It introduces CHARME, a chain-based reinforcement learning framework that uses a GCN policy, a state-transition procedure guaranteeing feasible embeddings, and an Order Exploration strategy to learn effective embedding orders. Empirical results show CHARME achieves lower qubit usage than fast baselines like Minorminer and ATOM, while maintaining competitive runtimes and outperforming OCT in several settings, especially for sparse graphs. The approach demonstrates practical potential for scalable quantum optimization by improving embedding efficiency and training dynamics, with robust performance on both synthetic and real-world QUBO-derived graphs.
Abstract
Quantum annealing (QA) has great potential to solve combinatorial optimization problems efficiently. However, the effectiveness of QA algorithms depends heavily on how problem instances, represented as logical graphs, are embedded into the quantum processing unit (QPU), whose topology is a graph with limited connectivity. Finding such an embedding is known as the minor embedding problem. Because the minor embedding problem is NP-hard \cite{Goodrich2018}, existing methods suffer from scalability issues on larger problem sizes. In this paper, we propose CHARME, a novel approach that uses Reinforcement Learning (RL) techniques to address the minor embedding problem. CHARME includes three key components: a Graph Neural Network (GNN) architecture for policy modeling, a state transition algorithm that ensures solution validity, and an order exploration strategy for effective training. Through comprehensive experiments on synthetic and real-world instances, we demonstrate the efficiency of our proposed order exploration strategy as well as our proposed RL framework, CHARME. In particular, CHARME yields superior solutions in terms of qubit usage compared to fast embedding methods such as Minorminer and ATOM. Moreover, our method surpasses the OCT-based approach, known for its slower runtime but high-quality solutions, in several cases. In addition, our proposed exploration strategy makes training of the CHARME framework more efficient by providing better solutions than the greedy strategy.
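To make the problem statement concrete, the following minimal sketch checks whether a candidate embedding is valid and counts its qubit usage, the metric on which the methods above are compared. It is an illustration only and is not part of CHARME: the function names (`is_connected`, `is_valid_embedding`, `qubit_usage`), the dictionary-based graph representation, and the toy triangle-into-square instance are all assumptions introduced here. It assumes the standard definition of a minor embedding: each logical node maps to a non-empty, connected set of hardware qubits (a chain), chains are pairwise disjoint, and every logical edge is realized by at least one hardware edge between the two corresponding chains.

```python
from collections import deque


def is_connected(nodes, adj):
    """Return True if `nodes` is non-empty and induces a connected subgraph of `adj`."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in nodes and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == nodes


def is_valid_embedding(logical_edges, hardware_adj, chains):
    """Check the three minor-embedding conditions (hypothetical helper).

    logical_edges: iterable of (u, v) pairs of logical nodes
    hardware_adj:  dict mapping each qubit to its neighbouring qubits
    chains:        dict mapping each logical node to a collection of qubits
    """
    used = set()
    for chain in chains.values():
        chain = set(chain)
        # Condition 1: every chain is a non-empty connected set of qubits.
        if not is_connected(chain, hardware_adj):
            return False
        # Condition 2: chains do not share qubits.
        if used & chain:
            return False
        used |= chain
    for u, v in logical_edges:
        if u not in chains or v not in chains:
            return False
        chain_v = set(chains[v])
        # Condition 3: some hardware edge joins the chain of u to the chain of v.
        if not any(q in chain_v for p in chains[u] for q in hardware_adj.get(p, ())):
            return False
    return True


def qubit_usage(chains):
    """Total number of physical qubits used by the embedding."""
    return sum(len(set(chain)) for chain in chains.values())


# Toy instance (assumed for illustration): a logical triangle embedded
# into a 4-cycle "hardware" graph 0-1-2-3-0.
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

good = {"a": [0], "b": [1], "c": [2, 3]}  # node c needs a chain of two qubits
bad = {"a": [0], "b": [1], "c": [2]}      # edge (a, c) has no hardware edge

print(is_valid_embedding(triangle, square, good), qubit_usage(good))
print(is_valid_embedding(triangle, square, bad))
```

The triangle cannot be placed one-to-one on the square because qubits 0 and 2 are not adjacent, so one logical node must be stretched over a two-qubit chain. Longer chains consume more of the limited hardware, which is why lower qubit usage is treated as the measure of a better embedding.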
