Transpiling quantum circuits by a transformers-based algorithm
Michele Banfi, Paolo Zentilini, Sebastiano Corli, Enrico Prati
TL;DR
This work treats quantum circuit transpilation as a seq2seq translation task by employing an encoder–decoder transformer operating on OpenQASM representations. A RegEx-driven tokenizer discretizes continuous rotation angles into symbolic tokens, enabling robust mapping between IBM and IonQ gate sets with cross-attention guiding output generation. The model attains fidelity exceeding $99.98\%$ for up to five-qubit circuits and demonstrates scaling considerations for continuous versus Solovay-Kitaev–decomposed gates, highlighting the need for larger context windows for long sequences. The approach offers a scalable, automated route for cross-platform quantum compilation, with implications for hardware-aware optimization on HPC infrastructures.
Abstract
Transformers have gained popularity in machine learning through their application to natural language processing. They process text efficiently, capture long-range dependencies in the data, and perform next-word prediction. Gate-based quantum computing, in turn, controls a register of qubits in the quantum hardware by applying a sequence of gates, a process that can be interpreted as a low-level textual programming language. We develop a transformer model capable of transpiling quantum circuits from the QASM standard to other gate sets natively suited to a specific target quantum hardware, in our case the gate set of IonQ's trapped-ion quantum computers. The feasibility of translation for circuits of up to five qubits is demonstrated, with a percentage of correctly transpiled target circuits equal to or greater than 99.98%. Regardless of the depth of the register and the number of gates applied, we prove that the complexity of the transformer model scales, in the worst case, polynomially with the depth of the register and the length of the circuit, allowing models with a higher number of parameters to be trained efficiently on HPC infrastructures.
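To make the "circuit as text" view concrete, the following is a minimal sketch of how a RegEx-driven tokenizer might turn OpenQASM 2.0 gate lines into symbolic tokens, with continuous rotation angles mapped to discrete bins. It is not the tokenizer used in this work: the number of bins (`N_BINS`), the token spellings (`<ANG_k>`, `q0`), the register name `q`, and the helper names `angle_token` and `tokenize_line` are all assumptions made for illustration, and only plain numeric angles are handled (expressions such as `pi/2` are not).

```python
import math
import re
from typing import List

N_BINS = 64  # assumed angular resolution; not taken from the paper

# Gate name, optional parenthesised numeric angle, then the qubit operands.
GATE_RE = re.compile(
    r"^\s*([a-z][a-z0-9]*)\s*"
    r"(?:\(\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s*\))?"
    r"\s+(.*?)\s*;\s*$"
)
QUBIT_RE = re.compile(r"q\[(\d+)\]")  # assumes a single register named q


def angle_token(theta: float, n_bins: int = N_BINS) -> str:
    """Map an angle in radians to the nearest of n_bins tokens over [0, 2*pi)."""
    wrapped = theta % (2 * math.pi)
    index = round(wrapped / (2 * math.pi) * n_bins) % n_bins
    return f"<ANG_{index}>"


def tokenize_line(line: str) -> List[str]:
    """Split one gate statement into gate, angle, and qubit tokens."""
    match = GATE_RE.match(line)
    if match is None:
        raise ValueError(f"unsupported line: {line!r}")
    gate, angle, operands = match.groups()
    tokens = [gate]
    if angle is not None:
        tokens.append(angle_token(float(angle)))
    tokens.extend(f"q{idx}" for idx in QUBIT_RE.findall(operands))
    return tokens
```

Under these assumptions, `tokenize_line("cx q[0],q[1];")` yields `["cx", "q0", "q1"]`, and a rotation such as `rz(1.5707963) q[2];` becomes a gate token, one angle-bin token, and a qubit token. A finer grid (larger `N_BINS`) trades a larger vocabulary for a smaller angle rounding error.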
