Optimized Compilation for Distributed Quantum Computing

Michele Bandini; Davide Ferrari; Stefano Carretta; Michele Amoretti

TL;DR

This work minimizes the consumption of EPR pairs by letting multiple non-local gates share a single TeleGate operation whenever the circuit structure allows it, and shows that the approach is beneficial even when a short EPR pair lifetime is assumed.

Abstract

In many practical applications, quantum algorithms require many qubits, significantly more than those available with current noisy intermediate-scale quantum processors. Distributed quantum computing (DQC) is considered a scalable approach to increasing the number of qubits available for computational tasks. In the DQC setting, a quantum compiler must find the best partitioning of the quantum algorithm and then schedule non-local operations so as to optimize the consumption of Einstein-Podolsky-Rosen (EPR) pairs. In this work, the focus is on minimizing the use of EPR pairs when the circuit structure allows multiple non-local gates to share a single TeleGate operation. This is achieved with a greedy algorithm that explores the circuit and groups together the gates that could share an EPR pair, changing the order of commuting gates when necessary. With this preliminary pass, the compiled circuits show reduced depth and EPR usage. Since the quality of each EPR pair deteriorates quickly, the number of non-local gates using the same EPR pair should also be bounded. This means that, depending on the features of the target quantum network, the user can achieve different levels of optimization. Here, it is shown that this approach brings benefits even when a short EPR pair lifetime is assumed.
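
The grouping idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a circuit made only of CZ gates (which all commute, so reordering is always allowed), treats the first qubit of each gate as the control, and uses hypothetical names (`group_nonlocal_gates`, `qpu_of`, `max_gates_per_epr`) chosen for illustration. Non-local gates with the same control qubit and the same remote QPU are greedily placed in the same group, and each group stands for one EPR pair; the bound models a limited EPR pair lifetime.

```python
def group_nonlocal_gates(gates, qpu_of, max_gates_per_epr):
    """Greedily group non-local CZ gates that could share one EPR pair.

    Assumptions (illustrative only, not taken from the paper):
    - gates is a list of (control, target) qubit pairs, all CZ gates,
      so every gate commutes with every other and reordering is free;
    - qpu_of maps each qubit to the index of the QPU hosting it;
    - at most max_gates_per_epr gates may share a single EPR pair.
    """
    open_groups = {}  # (control, remote QPU) -> group still accepting gates
    groups = []       # each group corresponds to one EPR pair
    for control, target in gates:
        if qpu_of[control] == qpu_of[target]:
            continue  # local gate: no EPR pair needed
        key = (control, qpu_of[target])
        group = open_groups.get(key)
        if group is None or len(group) >= max_gates_per_epr:
            # No compatible group, or the current one is full: open a new one.
            group = []
            groups.append(group)
            open_groups[key] = group
        group.append((control, target))
    return groups


# Hypothetical example: qubits 0 and 1 on QPU 0, qubits 2 and 3 on QPU 1.
example_qpu_of = {0: 0, 1: 0, 2: 1, 3: 1}
example_gates = [(0, 2), (0, 1), (0, 3), (1, 2), (0, 2)]

for bound in (1, 2, 10):
    n_pairs = len(group_nonlocal_gates(example_gates, example_qpu_of, bound))
    print(f"max gates per EPR pair = {bound}: {n_pairs} EPR pairs")
```

With a bound of 1 every non-local gate consumes its own EPR pair, which corresponds to no grouping at all; relaxing the bound lets gates with a common control qubit share a pair, so the count can only decrease or stay the same.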

Paper Structure (13 sections, 12 figures, 2 tables, 1 algorithm)

Introduction
DQC System Model
Assumptions
Implementation of Non-local Quantum Gates
Compilation Process
Qubit Assignment
Non-local Gate Grouping
Algorithm Breakdown
Optimizations when Entanglement Duration is Limited
Non-Local Gate Reordering
Non-local Gate Scheduling
Experimental Evaluation
Conclusions

Figures (12)

Figure 1: (a) Circuit representation of TeleData by means of the Teleport primitive. TeleData moves the quantum state of a data qubit $\ket{c}$ to one qubit of an EPR pair and then swaps it to a free data qubit. The state of the EPR pair and the original $\ket{c}$ are lost in the process. Multiple CZ gates acting on the teleported qubit can then be executed. (b) Circuit representation of TeleGate by means of the Cat-Ent and Cat-DisEnt primitives. After the Cat-Ent operation, the second qubit of the EPR pair participates in an entangled state with the control qubit. Multiple CZ gates with the same control qubit and different targets can be executed between Cat-Ent and Cat-DisEnt. It is worth noting that, between Cat-Ent and Cat-DisEnt, the control qubit is entangled with the EPR-pair qubit and cannot be targeted by other gates.
Figure 2: Flowchart describing the compilation process. The input includes an abstract description of a quantum circuit and a high-level description of the network configuration. The mandatory passes are qubit assignment, non-local gate scheduling, local mapping and local routing. Optional passes are non-local gate grouping and non-local gate reordering. The passes concerning non-local gates are outlined, as they are the novel contribution of this work.
Figure 3: Example of qubit assignment. (a) Input circuit. (b) Baseline qubit assignment as in Ferrari2023. (c) Qubit assignment with non-local gate grouping.
Figure 4: Commutation rules
Figure 5: Number of required EPR pairs with respect to the number of QPUs, for selected quantum circuits with very different features. In each plot, different compiler configurations are compared. For (a) and (b), the four considered DQC architectures include devices with 64, 32, 16 and 8 data qubits, respectively. For (c) and (d), instead, the four considered DQC architectures include devices with 256, 128, 64 and 32 qubits, respectively. Each device has 4 communication qubits.
...and 7 more figures
