Zero Knowledge Proof for Multiple Sequence Alignment

Worasait Suwannik

TL;DR

This work addresses protecting the confidentiality of multiple sequence alignments while proving the validity of the result. It proposes a Circom-based verifier circuit and zkSNARKs (Groth16) to certify that the input sequences, the produced alignment, and the alignment score are self-consistent without disclosing the alignment itself, with the public inputs limited to the sequences and the score. The main contributions include a structured circuit design (two consistency checks, a main component) and an end-to-end prove-and-verify workflow, along with an analysis of constraint growth that highlights scalability challenges. The approach enables privacy-preserving sharing and verification of MSA results, offering a pathway toward confidential collaboration in bioinformatics while illustrating current limitations in circuit size and performance.

Abstract

Multiple sequence alignment (MSA) is a fundamental algorithm in bioinformatics. When the alignment must be kept confidential while other information, such as the input sequences and the alignment score, is revealed, a zero-knowledge proof can be used. In this paper, a validator checks the consistency between the input sequences and the alignment, and between the alignment and the alignment score. The validator is written in the Circom language and compiled into a circuit. Using a zero-knowledge proof system called zkSNARK, a cryptographic proof is generated for the circuit and its inputs. This proof demonstrates that all inputs are consistent without revealing the actual alignment.
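
The two consistency checks described in the abstract can be illustrated outside the circuit with a minimal Python sketch. This is not the paper's Circom validator. The gap symbol `-`, the function names, and the sum-of-pairs scoring with match/mismatch/gap values of 1/-1/-2 are all assumptions made for illustration, since the abstract does not specify the scoring scheme.

```python
from itertools import combinations

GAP = "-"  # assumed gap symbol; the abstract does not specify one


def alignment_matches_sequences(sequences, alignment):
    """Check 1: every aligned row, with gaps removed, equals its input
    sequence, and all aligned rows have the same length."""
    if len(sequences) != len(alignment):
        return False
    if len({len(row) for row in alignment}) > 1:
        return False
    return all(row.replace(GAP, "") == seq
               for seq, row in zip(sequences, alignment))


def sum_of_pairs_score(alignment, match=1, mismatch=-1, gap=-2):
    """Assumed scoring scheme (sum of pairs); the values are hypothetical."""
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == GAP and b == GAP:
                continue  # a gap paired with a gap contributes nothing
            if a == GAP or b == GAP:
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


def validate(sequences, alignment, claimed_score):
    """Check 1 and check 2 together: the alignment is consistent with the
    sequences, and the claimed score is consistent with the alignment."""
    return (alignment_matches_sequences(sequences, alignment)
            and sum_of_pairs_score(alignment) == claimed_score)


sequences = ["ACGT", "AGT", "ACT"]      # public input
alignment = ["ACGT", "A-GT", "AC-T"]    # private input in the paper's setting
print(validate(sequences, alignment, sum_of_pairs_score(alignment)))
```

In the zero-knowledge setting, the same logic would be expressed as circuit constraints, so that a verifier learns only that `validate` holds for the public sequences and score, not the alignment itself.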

Paper Structure (9 sections, 5 figures, 1 table)

This paper contains 9 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Input and output of multiple sequence alignment
  • Figure 2: Validator. Private input has gray background.
  • Figure 3: 2D array of components for checking the consistency of the alignment and the input sequence
  • Figure 4: Zero knowledge proof using Circom
  • Figure 5: Another use case of zero-knowledge proofs for MSA. Private inputs have gray background.