Efficient decomposition of unitary matrices in quantum circuit compilers
A. M. Krol, A. Sarkar, I. Ashraf, Z. Al-Ars, K. Bertels
TL;DR
The paper addresses the problem of efficiently decomposing arbitrary unitary gates into a universal gate set for quantum circuit execution. It implements Quantum Shannon Decomposition (QSD) within the OpenQL framework and compares it against prior methods such as Qubiter: the generated circuits have half the number of CNOT gates and a third of the total circuit length, and the decomposition runs up to 10 times as fast. Key contributions include an OpenQL implementation of QSD, proposed optimizations that exploit matrix structure (multiplexers and unaffected qubits), and empirical comparisons of gate counts, execution time, and memory usage. The results are relevant for running arbitrary-unitary quantum algorithms on simulators and future hardware, including genome-analysis workflows such as QiBAM and QAM, and they prepare the ground for further optimizations aimed at near-term quantum devices.
Abstract
Unitary decomposition is a widely used method to map quantum algorithms to an arbitrary set of quantum gates. Efficient implementation of this decomposition allows larger unitary gates to be translated into elementary quantum operations, which is key to executing these algorithms on existing quantum computers. The decomposition can be used as an aggressive optimization method for the whole circuit, as well as to test part of an algorithm on a quantum accelerator. For selection and implementation of the decomposition algorithm, perfect qubits are assumed. We base our decomposition technique on Quantum Shannon Decomposition, which generates O((3/4)*4^n) controlled-not (CNOT) gates for an n-qubit input gate. The resulting circuits are up to 10 times shorter than those produced by other methods in the field. Compared to Qubiter, our implementation generates circuits with half the number of CNOT gates and a third of the total circuit length, and it is up to 10 times as fast. Further optimizations are proposed to take advantage of potential underlying structure in the input or intermediate matrices, as well as to minimize the execution time of the decomposition.
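To make the O((3/4)*4^n) figure concrete, the following Python sketch counts CNOT gates under the standard QSD recursion. It is an illustration, not the OpenQL implementation. It assumes that one QSD level splits an n-qubit unitary into four (n-1)-qubit unitaries and three multiplexed rotations of 2^(n-1) CNOTs each, and that the recursion stops at single-qubit gates, which need no CNOTs. The function names are hypothetical.

```python
def qsd_cnot_count(n: int) -> int:
    """CNOT count of plain recursive QSD for an n-qubit unitary.

    Assumption: the recursion runs all the way down to single-qubit
    gates and no structure-based optimizations are applied.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n == 1:
        return 0  # a single-qubit gate needs no CNOTs
    # Four smaller unitaries plus three multiplexed rotations,
    # each rotation costing 2^(n-1) CNOTs.
    return 4 * qsd_cnot_count(n - 1) + 3 * 2 ** (n - 1)


def qsd_cnot_closed_form(n: int) -> int:
    """Closed form of the recursion: (3/4)*4^n - (3/2)*2^n."""
    return 3 * 4 ** (n - 1) - 3 * 2 ** (n - 1)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, qsd_cnot_count(n), qsd_cnot_closed_form(n))
```

Under these assumptions the leading term of the closed form is (3/4)*4^n, which matches the bound quoted above. The lower-order term -(3/2)*2^n comes from the recursion bottoming out at single-qubit gates.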
