Privacy-Aware Single-Nucleotide Polymorphisms (SNPs) using Bilinear Group Accumulators in Batch Mode
William J Buchanan, Sam Grierson, Daniel Uribe
TL;DR
This paper addresses privacy concerns for genomic SNP data by proposing a privacy-aware scheme that hashes SNPs into a bilinear-group accumulator, which a trusted resolver can then search privately using witness-based membership proofs. It builds on dynamic bilinear-map accumulators with batching, secure under the q-SDH assumption, to support efficient additions, deletions, and batched updates, together with witness generation and verification. Empirical results on a BLS12-381-based setup show that batching dramatically improves throughput (e.g., 100k SNPs in ~0.87 s with batching) and that witness times remain low (≈0.86 ms to generate, ≈10.9 ms to verify). The framework offers a scalable privacy-preserving alternative to Bloom filters and homomorphic encryption for genomic data queries, with explicit support for verifiable removal of data from stores.
Abstract
Biometric data is often highly sensitive, and a leak of this data can lead to serious privacy breaches. Among the most sensitive data of this type is the DNA data of individuals. Disclosing such data without consent could breach data protection laws. There have also been several recent data breaches involving DNA information, including at 23andMe and Ancestry. It is thus fundamental that citizens have the right to know whether their DNA data is contained within a DNA database and to ask for it to be removed if they are concerned about its usage. This paper outlines a method of hashing the core information contained within these data stores, known as Single-Nucleotide Polymorphisms (SNPs), into a bilinear group accumulator in batch mode, which can then be searched by a trusted entity for matches. The times to create the witness proof and to verify it were measured at 0.86 ms and 10.90 ms, respectively.
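For readers unfamiliar with the construction, the operations named above (accumulation, batched addition, deletion, witness creation, and verification) can be sketched with the textbook bilinear-map accumulator. This is an illustrative sketch under stated assumptions, not necessarily the exact notation or variant used in the paper: the symbols $g_1$, $g_2$, $s$, $H$, $\mathrm{acc}$, and $w_x$ are introduced here for illustration, with $s$ the secret trapdoor held by the accumulator manager, $H$ a hash that maps an SNP record to a scalar, and $e$ a pairing such as the one on BLS12-381.

```latex
% Assumed setting: groups G_1, G_2, G_T of prime order r, generators g_1, g_2,
% pairing e : G_1 x G_2 -> G_T, secret trapdoor s, published value g_2^s.
\begin{align*}
  x_i &= H(\mathrm{SNP}_i) \in \mathbb{Z}_r
      && \text{hash each SNP to a scalar} \\
  \mathrm{acc}(X) &= g_1^{\prod_{x_i \in X} (x_i + s)}
      && \text{accumulate the set } X \\
  \mathrm{acc}' &= \mathrm{acc}^{\prod_{j=1}^{m} (x'_j + s)}
      && \text{batched addition of } x'_1, \dots, x'_m \\
  \mathrm{acc}' &= \mathrm{acc}^{(x + s)^{-1} \bmod r}
      && \text{deletion of } x \\
  w_x &= g_1^{\prod_{x_i \in X \setminus \{x\}} (x_i + s)}
      && \text{membership witness for } x \\
  e\!\left(w_x,\; g_2^{x} \cdot g_2^{s}\right) &= e\!\left(\mathrm{acc}(X),\; g_2\right)
      && \text{verification, using only public values}
\end{align*}
```

The verification equation holds because $w_x^{(x+s)} = \mathrm{acc}(X)$ whenever $x \in X$. In the batched addition, the $m$ factors are first multiplied together in $\mathbb{Z}_r$, so a batch of any size costs a single group exponentiation, which is consistent with the throughput gain reported for batch mode.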
