Compare Similarities Between DNA Sequences Using Permutation-Invariant Quantum Kernel
Chenyu Shi, Gabriele Leoni, Mauro Petrillo, Antonio Puertas Gallardo, Hao Wang
TL;DR
This work tackles DNA sequence similarity under the edit distance with move operations (EDM), a metric that is NP-complete to compute exactly, by proposing a permutation-invariant variational quantum kernel that uses SIC-POVM encoding to map the four nucleotides to mutually equidistant quantum states. The kernel enforces EDM-like symmetry via a permutation-invariant parameterized circuit and increases expressiveness with data re-uploading, enabling polynomial-time inference on short sequences. In simulations, the method achieves higher accuracy than classical deep kernel baselines while using orders of magnitude fewer trainable parameters, highlighting the potential of symmetry-informed quantum kernels for bioscience tasks such as AMR gene detection. Limitations include scalability to longer sequences on NISQ hardware and the need for more expressive architectures; future work could extend to direct permutation-invariant kernels and broader downstream applications.
Abstract
Computing the similarity between two DNA sequences is of vital importance in bioscience, yet it can be computationally expensive on classical hardware. For example, the edit distance with move operations (EDM), a DNA similarity measure of interest in biology, is proven to be NP-complete to compute exactly. Recently, applied quantum algorithms have been anticipated to offer potential advantages over classical approaches. In this paper, we propose a novel variational quantum kernel model that serves as a surrogate for estimating the EDM-defined similarity between DNA sequences. Since the EDM metric exhibits a pairwise permutation-insensitive property, we incorporate a permutation-invariant structure into the variational quantum kernel to approximate this symmetry. Furthermore, to encode the four nucleotide bases as quantum states, we introduce a theoretically motivated encoding scheme based on symmetric informationally complete positive operator-valued measure (SIC-POVM) states. This encoding ensures mutual equivalence among the bases: the four symbols are mapped to quantum states that are pairwise equidistant on the Bloch sphere. We show experimentally that, equipped with the permutation-invariant circuit design and the mutual-equivalence encoding, the proposed quantum kernel model achieves strong performance in approximating the similarity defined by EDM. Compared with classical kernel learning methods, our quantum approach achieves significantly higher accuracy while using substantially fewer trainable parameters.
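To make the equidistance property of the SIC-POVM encoding concrete, the following is a minimal NumPy sketch, not the authors' implementation. It places the four bases on the vertices of a regular tetrahedron inscribed in the Bloch sphere and checks that every pair of distinct states has the same overlap. The assignment of A, C, G, T to particular vertices, the orientation of the tetrahedron, and the names `TETRAHEDRON`, `bloch_to_state`, `encode_base` and `overlap` are illustrative assumptions.

```python
import itertools
import numpy as np

# Assumed assignment of bases to the vertices of a regular tetrahedron
# on the Bloch sphere (unit vectors with pairwise dot product -1/3).
TETRAHEDRON = {
    "A": np.array([0.0, 0.0, 1.0]),
    "C": np.array([2.0 * np.sqrt(2.0) / 3.0, 0.0, -1.0 / 3.0]),
    "G": np.array([-np.sqrt(2.0) / 3.0, np.sqrt(2.0 / 3.0), -1.0 / 3.0]),
    "T": np.array([-np.sqrt(2.0) / 3.0, -np.sqrt(2.0 / 3.0), -1.0 / 3.0]),
}


def bloch_to_state(vec):
    """Convert a unit Bloch vector (x, y, z) to a single-qubit state vector."""
    x, y, z = vec
    theta = np.arccos(np.clip(z, -1.0, 1.0))  # polar angle
    phi = np.arctan2(y, x)                    # azimuthal angle
    return np.array([np.cos(theta / 2.0),
                     np.exp(1j * phi) * np.sin(theta / 2.0)])


def encode_base(base):
    """Map one nucleotide symbol to its (assumed) SIC-POVM qubit state."""
    return bloch_to_state(TETRAHEDRON[base])


def overlap(base_a, base_b):
    """Squared overlap |<psi_a|psi_b>|^2 between two encoded bases."""
    # np.vdot conjugates its first argument, giving the inner product.
    return abs(np.vdot(encode_base(base_a), encode_base(base_b))) ** 2


if __name__ == "__main__":
    for a, b in itertools.combinations("ACGT", 2):
        print(a, b, round(overlap(a, b), 6))
```

For a qubit SIC-POVM the squared overlap between any two distinct states is 1/3, so no base is closer to one partner than to another. This is the sense in which the encoding treats the four symbols as mutually equivalent.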
