Compiler for Distributed Quantum Computing: a Reinforcement Learning Approach

Panagiotis Promponas; Akrit Mudvari; Luca Della Chiesa; Paul Polakos; Louis Samuel; Leandros Tassiulas

TL;DR

The paper tackles the challenge of compiling quantum circuits for distributed quantum computing by jointly optimizing entanglement distribution, remote operation scheduling, and local qubit routing to minimize real execution time. It models the problem as an MDP for optimal compilation and proposes a constrained reinforcement learning framework, with DDQN showing strong performance in reducing circuit depth and increasing the likelihood of successful execution before decoherence. Empirical results in a two-QPU setup demonstrate the approach's ability to learn efficient policies under stochastic entanglement generation, outperforming alternative RL methods. The work advances practical DQC by delivering an online, state-aware compiler that adapts to hardware constraints and network dynamics, with potential impact on scalable quantum workloads.
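To give a rough sense of why stochastic entanglement generation matters for execution time, the sketch below assumes each time step produces an EPR pair independently with a fixed probability `p_success`. This Bernoulli model and the names `steps_until_epr` and `mean_wait` are illustrative assumptions, not details taken from the paper.

```python
import random


def steps_until_epr(p_success, rng):
    """Count time steps until one EPR pair is generated.

    Assumption: each step succeeds independently with probability p_success.
    """
    if not 0.0 < p_success <= 1.0:
        raise ValueError("p_success must lie in (0, 1]")
    steps = 1
    # rng.random() is in [0, 1), so a draw >= p_success counts as a failed attempt.
    while rng.random() >= p_success:
        steps += 1
    return steps


def mean_wait(p_success, trials, seed=0):
    """Average waiting time over several simulated generation attempts."""
    rng = random.Random(seed)  # seeded for reproducibility
    total = sum(steps_until_epr(p_success, rng) for _ in range(trials))
    return total / trials
```

Under this assumed model the waiting time is geometrically distributed with mean `1 / p_success`, so a remote gate that needs a fresh EPR pair can stall the circuit for a variable number of steps. A compiler that observes the current entanglement state can schedule local gates during that wait.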

Abstract

The practical realization of quantum programs that require large-scale qubit systems is hindered by current technological limitations. Distributed Quantum Computing (DQC) presents a viable path to scalability by interconnecting multiple Quantum Processing Units (QPUs) through quantum links, facilitating the distributed execution of quantum circuits. In DQC, EPR pairs are generated and shared between distant QPUs, which enables quantum teleportation and facilitates the seamless execution of circuits. A primary obstacle in DQC is the efficient mapping and routing of logical qubits to physical qubits across different QPUs, necessitating sophisticated strategies to overcome hardware constraints and optimize communication. We introduce a novel compiler that, unlike existing approaches, prioritizes reducing the expected execution time by jointly managing the generation and routing of EPR pairs, scheduling remote operations, and injecting SWAP gates to facilitate the execution of local gates. We present a real-time, adaptive approach to compiler design, accounting for the stochastic nature of entanglement generation and the operational demands of quantum circuits. Our contributions are twofold: (i) we model the optimal compiler for DQC using a Markov Decision Process (MDP) formulation, establishing the existence of an optimal algorithm, and (ii) we introduce a constrained Reinforcement Learning (RL) method to approximate this optimal compiler, tailored to the complexities of DQC environments. Our simulations demonstrate that Double Deep Q-Networks (DDQNs) are effective in learning policies that minimize the depth of the compiled circuit, leading to a lower expected execution time and a higher likelihood of successful operation before qubits decohere.
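The combination of Double DQN with a constrained action set can be sketched as follows. This is a minimal illustration of the standard Double DQN target with invalid actions masked out, not the authors' implementation; the function name `ddqn_target`, the argument names, and the use of a boolean mask to encode constraints are assumptions made for the example.

```python
def ddqn_target(reward, gamma, q_online_next, q_target_next, valid_mask, done):
    """Double DQN bootstrap target restricted to valid actions.

    q_online_next / q_target_next: Q-values of the next state from the online
    and target networks (plain lists here, one entry per action).
    valid_mask: True where the action is allowed in the next state
    (assumed stand-in for hardware constraints).
    """
    if done:
        return reward
    valid = [a for a, ok in enumerate(valid_mask) if ok]
    if not valid:
        # No admissible action: treat the next state as terminal.
        return reward
    # The online network selects the best valid action ...
    best = max(valid, key=lambda a: q_online_next[a])
    # ... and the target network evaluates it, which reduces overestimation.
    return reward + gamma * q_target_next[best]
```

For example, with `reward=-1.0`, `gamma=0.9`, online values `[1.0, 5.0, 3.0]`, target values `[2.0, 10.0, 4.0]`, and mask `[True, False, True]`, the masked action 1 is ignored, action 2 is selected, and the target is `-1.0 + 0.9 * 4.0`. A per-step reward of -1 is one hypothetical way to make the return track compiled circuit depth.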

Paper Structure (20 sections, 10 equations, 8 figures)

Introduction
Contributions
Preliminaries
Quantum Gates & Quantum Teleportation
Quantum Processing Units Architecture
Distributed Quantum Computing Architecture
Quantum compilers - Optimality Through an MDP
Initial Qubit Mapping and Qubit Routing
Optimal Compiler for a Single QPU: MDP Formulation
Optimal Compiler for DQC: MDP Formulation
Reinforcement Learning Implementation
Efficient formulation and constrained RL
Learning Agent and RL approaches
Discussion: Model Extensions
Numerical Results
...and 5 more sections

Figures (8)

Figure 1: A circuit comprising 7 qubits and exclusively CNOT operations as specified.
Figure 2: (a) Illustrates an EPR pair shared between two QPUs which can be used to teleport gates and qubits, (b) illustrates the state of the QPUs after a gate teleportation operation, while (c) shows the state of the QPUs after a qubit teleportation.
Figure 3: The coupling graph of the IBM Q Guadalupe quantum processor. This processor is of type Falcon r4P and can hold up to 16 qubits. We refer interested readers to [ibm2024quantum], where IBM provides a list of the quantum processors and their corresponding coupling graphs.
Figure 4: Illustration of one possible compilation of the circuit illustrated in Figure 1 for IBM Q Guadalupe (see Figure 3).
Figure 5: Overview of the learning Agent (DDRL-based example) in the constrained RL environment.
...and 3 more figures
