Unveiling Code Clones in Quantum Programming: An Empirical Study with Qiskit
Kenta Manoku, Jianjun Zhao
TL;DR
This study addresses the lack of empirical knowledge about code clones in quantum programming by analyzing Qiskit-based software. It extracts functions and classes via the AST and compares them with a SequenceMatcher-based similarity metric, defined as $\text{Similarity} = \frac{2 \times \text{Matching Characters}}{\text{Length of String 1} + \text{Length of String 2}}$, to detect Type-1 through Type-3 clones. The key finding is a notable density of Type-2 and Type-3 clones, with Type-1 clones also present, which points to maintenance challenges in quantum software. These results motivate the development of quantum-specific clone-detection and refactoring tools to improve the maintainability and scalability of quantum applications.
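The pipeline summarized above can be illustrated with a minimal sketch that uses only Python's standard library. It is not the authors' implementation: the helper names (`extract_units`, `similarity`, `find_clone_candidates`), the sample snippet, and the 0.8 threshold are assumptions made for illustration, and the sketch does not attempt to separate Type-1, Type-2, and Type-3 clones. It relies on the fact that `difflib.SequenceMatcher.ratio()` computes the same quantity as the similarity formula, 2 × matches / total length.

```python
import ast
from difflib import SequenceMatcher
from itertools import combinations


def extract_units(source: str) -> list[tuple[str, str]]:
    """Return (name, source text) for every function and class in `source`."""
    tree = ast.parse(source)
    units = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            units.append((node.name, ast.get_source_segment(source, node)))
    return units


def similarity(a: str, b: str) -> float:
    """2 * matching characters / (len(a) + len(b)), as computed by difflib."""
    return SequenceMatcher(None, a, b).ratio()


def find_clone_candidates(source: str, threshold: float = 0.8):
    """Compare all pairs of units; the threshold is an assumed, illustrative value."""
    candidates = []
    for (n1, s1), (n2, s2) in combinations(extract_units(source), 2):
        score = similarity(s1, s2)
        if score >= threshold:
            candidates.append((n1, n2, score))
    return candidates


# Hypothetical input: two near-identical circuit builders and one unrelated helper.
# The snippet is only parsed as text, so Qiskit does not need to be installed.
SAMPLE = '''
def bell_a(qc):
    qc.h(0)
    qc.cx(0, 1)
    return qc

def bell_b(circ):
    circ.h(0)
    circ.cx(0, 1)
    return circ

def unrelated(values):
    total = sum(v * v for v in values)
    return total / max(len(values), 1)
'''

if __name__ == "__main__":
    for name1, name2, score in find_clone_candidates(SAMPLE):
        print(f"{name1} ~ {name2}: {score:.2f}")
```

In this sketch, `bell_a` and `bell_b` differ only in identifier names, which is the kind of renaming associated with Type-2 clones, so they are reported as a candidate pair while `unrelated` is not. Comparing raw source text is sensitive to formatting and comments; a fuller detector would likely normalize the code before comparison.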
Abstract
Code clones, referring to identical or similar code fragments, have long posed challenges in classical programming, impacting software quality, maintainability, and scalability. However, their presence and characteristics in quantum programming remain unexplored. This paper presents an empirical study of code clones in quantum programs, specifically focusing on software developed using the Qiskit framework. We examine the existence, distribution, density, and size of code clones in quantum software, revealing a high density of Type-2 and Type-3 clones involving minor modifications. Our findings suggest that these clones are more frequent in quantum software, likely due to the complexity of quantum algorithms and their integration with classical logic. This highlights the need for advanced clone detection and refactoring tools specifically designed for the quantum domain to improve software maintainability and scalability.
