
MAFFT-inspired Quantum Shift-based Sequence Alignment and its Efficient Simulation on Decision Diagrams

Yusuke Kimura, Yutaka Takita

TL;DR

QShift-SA targets the screening steps that often dominate the runtime of classical MAFFT. In simulation benchmarks, a decision diagram (DD)-based quantum circuit simulator runs more than 1,000$\times$ faster than state-vector and MPS simulators and can handle larger circuits.

Abstract

Multiple sequence alignment (MSA) is a core operation for comparing genome sequences and is widely used in bioinformatics. MAFFT, a practical MSA tool, repeatedly shifts a pair of sequences and computes a distance. Because the number of sequence pairs grows quadratically with the number of sequences, this procedure can become a bottleneck. We propose Quantum Shift-based Sequence Alignment (QShift-SA), which implements this "shift-wise score computation" as a gate-based quantum circuit and searches over shift amounts and sequence pairs using Grover's algorithm. QShift-SA constructs an oracle circuit that computes the Hamming distance (the number of mismatches) between two sequences using data encoding, controlled shift, comparison, and addition. This oracle makes it possible to search for candidates with small distances. QShift-SA does not aim to replace the full MSA workflow; instead, it targets the screening steps that often dominate the runtime in classical MAFFT. We evaluate circuit resources (number of qubits, gate count, and depth) and benchmark simulation time across multiple quantum circuit simulators. We find that a decision diagram (DD)-based quantum circuit simulator runs more than 1,000$\times$ faster than state-vector and MPS simulators and can handle larger circuits.
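
As a classical point of reference for the quantity the oracle evaluates, the following minimal sketch computes the Hamming distance for every shift amount and every sequence pair by brute force. It is not the authors' implementation: the function names (`cyclic_shift`, `hamming`, `best_shift`, `screen_pairs`), the use of a cyclic left shift, equal-length sequences, and the threshold-based screening rule are all assumptions made for illustration.

```python
from itertools import combinations


def cyclic_shift(seq, k):
    """Cyclically shift seq to the left by k positions (direction is an assumption)."""
    k %= len(seq)
    return seq[k:] + seq[:k]


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def best_shift(a, b):
    """Return (distance, shift) with the smallest Hamming distance between a
    and b shifted by k; ties are broken by the smaller shift."""
    return min((hamming(a, cyclic_shift(b, k)), k) for k in range(len(b)))


def screen_pairs(seqs, threshold):
    """List (i, j, shift, distance) for all pairs whose best distance is at most
    threshold. The loop visits every pair, so the work grows quadratically
    with the number of sequences."""
    hits = []
    for i, j in combinations(range(len(seqs)), 2):
        distance, shift = best_shift(seqs[i], seqs[j])
        if distance <= threshold:
            hits.append((i, j, shift, distance))
    return hits


if __name__ == "__main__":
    example = ["ACGT", "GTAC", "AAAA", "CGTA"]
    for hit in screen_pairs(example, threshold=0):
        print(hit)
```

The nested search over pairs and shift amounts is the space that the abstract describes searching with Grover's algorithm; this sketch only enumerates it classically.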

Paper Structure

This paper contains 41 sections, 17 equations, 13 figures, 4 tables.

Figures (13)

  • Figure 1: An overview of shift-wise score computation in MAFFT. After numerically encoding two sequences, Fourier transform, element-wise multiplication in the frequency domain, and inverse transform together yield the score vector $C(k)$ for all shift amounts.
  • Figure 2: Conceptual diagram of Grover search (repeating the oracle and the diffusion operator)
  • Figure 3: Example of a state-vector representation using a decision diagram
  • Figure 4: Encoding circuit for four sequences
  • Figure 5: Conceptual diagram of a cyclic shift circuit controlled by the shift-amount register. The shift amount $k$ is represented in binary, and an arbitrary cyclic shift is synthesized by conditionally applying a SWAP network corresponding to each bit.
  • ...and 8 more figures
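
The binary decomposition described in the Figure 5 caption can be emulated classically. The sketch below is an illustration under stated assumptions, not the paper's circuit: each bit of the shift amount $k$ conditionally applies a fixed shift by a power of two, standing in for the controlled SWAP network associated with that bit. The function names and the left-shift direction are hypothetical.

```python
def shift_by_power_of_two(cells, bit_index):
    """Fixed cyclic left shift by 2**bit_index positions.
    In a circuit this fixed permutation would be realized by a SWAP network."""
    if not cells:
        return list(cells)
    step = (1 << bit_index) % len(cells)
    return cells[step:] + cells[:step]


def controlled_cyclic_shift(cells, k, num_bits):
    """Compose an arbitrary cyclic shift by k from per-bit conditional shifts.
    Bit b of k plays the role of the control for the shift by 2**b."""
    if not 0 <= k < (1 << num_bits):
        raise ValueError("k does not fit in the shift-amount register")
    out = list(cells)
    for b in range(num_bits):
        if (k >> b) & 1:  # control bit set: apply the shift for this bit
            out = shift_by_power_of_two(out, b)
    return out


if __name__ == "__main__":
    print("".join(controlled_cyclic_shift(list("ACGTAC"), 5, 3)))
```

Because cyclic shifts compose additively, applying the shifts for the set bits of $k$ in any order yields a total shift of $k$ positions, which is why one conditional stage per bit of the register suffices.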