Zero Knowledge Proof for Multiple Sequence Alignment
Worasait Suwannik
TL;DR
This work addresses how to keep a multiple sequence alignment confidential while still proving that the result is valid. It proposes a Circom-based verifier circuit and zkSNARKs (Groth16) to certify that the input sequences, the produced alignment, and the alignment score are mutually consistent without disclosing the alignment itself; the public inputs are limited to the sequences and the score. The main contributions are a structured circuit design (two consistency checks and a main component), an end-to-end prove-and-verify workflow, and an analysis of constraint growth that highlights scalability challenges. The approach enables privacy-preserving sharing and verification of MSA results, offering a pathway toward confidential collaboration in bioinformatics while illustrating current limitations in circuit size and performance.
Abstract
Multiple sequence alignment (MSA) is a fundamental algorithm in bioinformatics. When the alignment must be protected while other information, such as the input sequences and the alignment score, is revealed, a zero-knowledge proof can be used. In this paper, a validator checks the consistency between the input sequences and the alignment, and between the alignment and the alignment score. The validator is written in the Circom language and compiled into a circuit. Using a zero-knowledge proof system called zkSNARK, a cryptographic proof is generated for the circuit and its inputs. This proof demonstrates that all inputs are consistent without revealing the actual alignment.
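To make the two consistency checks concrete, the following is a minimal plain-Python sketch of what such a validator could verify. It is not the paper's Circom circuit and involves no cryptography. The function names, the gap symbol, and the sum-of-pairs scoring scheme with its match, mismatch, and gap values are all assumptions made for illustration; the paper's actual scoring function may differ.

```python
GAP = "-"  # assumed gap symbol


def alignment_matches_sequences(sequences, alignment):
    """Check 1: each aligned row, with gaps removed, equals its input sequence."""
    if len(sequences) != len(alignment):
        return False
    # All aligned rows must have the same length.
    if len({len(row) for row in alignment}) > 1:
        return False
    return all(
        row.replace(GAP, "") == seq for seq, row in zip(sequences, alignment)
    )


def sum_of_pairs_score(alignment, match=1, mismatch=-1, gap=-2):
    """Score an alignment column by column (assumed sum-of-pairs scheme)."""
    score = 0
    for column in zip(*alignment):
        for i in range(len(column)):
            for j in range(i + 1, len(column)):
                a, b = column[i], column[j]
                if a == GAP and b == GAP:
                    continue  # gap-gap pairs contribute nothing
                if a == GAP or b == GAP:
                    score += gap
                elif a == b:
                    score += match
                else:
                    score += mismatch
    return score


def validate(sequences, alignment, claimed_score):
    """Check 1 and Check 2: the alignment is consistent with the sequences
    and with the claimed score."""
    return (
        alignment_matches_sequences(sequences, alignment)
        and sum_of_pairs_score(alignment) == claimed_score
    )


if __name__ == "__main__":
    sequences = ["ACGT", "AGT", "ACT"]    # public
    alignment = ["ACGT", "A-GT", "AC-T"]  # private in the zero-knowledge setting
    print(validate(sequences, alignment, sum_of_pairs_score(alignment)))
```

In the zero-knowledge setting, the sequences and the claimed score would be public inputs and the alignment a private input, so a verifier learns only that some alignment passing both checks exists.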
