Compiler for Distributed Quantum Computing: a Reinforcement Learning Approach
Panagiotis Promponas, Akrit Mudvari, Luca Della Chiesa, Paul Polakos, Louis Samuel, Leandros Tassiulas
TL;DR
The paper tackles the challenge of compiling quantum circuits for distributed quantum computing (DQC) by jointly optimizing entanglement distribution, remote operation scheduling, and local qubit routing to minimize expected execution time. It models optimal compilation as a Markov Decision Process (MDP) and proposes a constrained reinforcement learning (RL) framework, in which Double Deep Q-Networks (DDQNs) perform well at reducing circuit depth and increasing the likelihood of successful execution before decoherence. Empirical results in a two-QPU setup show that the approach learns efficient policies under stochastic entanglement generation and outperforms alternative RL methods. The work advances practical DQC by delivering an online, state-aware compiler that adapts to hardware constraints and network dynamics, with potential impact on scalable quantum workloads.
Abstract
The practical realization of quantum programs that require large-scale qubit systems is hindered by current technological limitations. Distributed Quantum Computing (DQC) presents a viable path to scalability by interconnecting multiple Quantum Processing Units (QPUs) through quantum links, facilitating the distributed execution of quantum circuits. In DQC, EPR pairs are generated and shared between distant QPUs, which enables quantum teleportation and the execution of circuits that span multiple QPUs. A primary obstacle in DQC is the efficient mapping and routing of logical qubits to physical qubits across different QPUs, necessitating sophisticated strategies to overcome hardware constraints and optimize communication. We introduce a novel compiler that, unlike existing approaches, prioritizes reducing the expected execution time by jointly managing the generation and routing of EPR pairs, scheduling remote operations, and injecting SWAP gates to facilitate the execution of local gates. We present a real-time, adaptive approach to compiler design, accounting for the stochastic nature of entanglement generation and the operational demands of quantum circuits. Our contributions are twofold: (i) we model the optimal compiler for DQC using a Markov Decision Process (MDP) formulation, establishing the existence of an optimal algorithm, and (ii) we introduce a constrained Reinforcement Learning (RL) method to approximate this optimal compiler, tailored to the complexities of DQC environments. Our simulations demonstrate that Double Deep Q-Networks (DDQNs) are effective in learning policies that minimize the depth of the compiled circuit, leading to a lower expected execution time and a higher likelihood of successful operation before qubits decohere.
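For readers unfamiliar with Double Deep Q-Networks, the following minimal sketch illustrates the standard Double DQN target computation: the online network selects the next action and the target network evaluates it. It is not the authors' implementation. The function name `ddqn_target`, the `valid_actions` mask (one assumed way to restrict the agent to feasible compiler actions), and the per-step reward of -1 used in the usage lines (an assumed way to penalize circuit depth) are all hypothetical choices made for illustration.

```python
from typing import Sequence


def ddqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    valid_actions: Sequence[int],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Return the Double DQN regression target for one transition.

    next_q_online / next_q_target hold the Q-values of the next state under
    the online and target networks, indexed by action. valid_actions is a
    hypothetical mask listing the actions allowed in the next state.
    """
    # Terminal transitions (or states with no allowed action) do not bootstrap.
    if done or not valid_actions:
        return reward
    # Action selection: the online network picks the best allowed action.
    best_action = max(valid_actions, key=lambda a: next_q_online[a])
    # Action evaluation: the target network scores the selected action.
    return reward + gamma * next_q_target[best_action]


# Usage with an assumed reward of -1 per time step (hypothetical values).
masked = ddqn_target(-1.0, [0.5, 2.0, 1.0], [0.3, 0.1, 0.7], [0, 2], gamma=0.9)
unmasked = ddqn_target(-1.0, [0.5, 2.0, 1.0], [0.3, 0.1, 0.7], [0, 1, 2], gamma=0.9)
terminal = ddqn_target(-1.0, [0.5, 2.0, 1.0], [0.3, 0.1, 0.7], [0, 1, 2], done=True)
```

Decoupling selection from evaluation is what distinguishes Double DQN from plain DQN, which takes the maximum of the target network's own values and tends to overestimate them. In the usage lines, masking action 1 changes which action is selected and therefore which target-network value enters the target.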
