The Geometry of Codes for Random Access in DNA Storage
Anina Gruica, Maria Montanucci, Ferdinando Zullo
TL;DR
The paper addresses the Random Access Problem in DNA storage by introducing a geometric framework based on balanced quasi-arcs in projective planes, which gives explicit control over the random-access performance of codes. It develops a $k=3$ construction with provably lower random-access expectation than prior work and, using a key result from Gruica et al., proves that a rate-$1/2$ construction has random-access expectation asymptotically a constant factor below $k$, specifically $\lim_{k\to\infty} \mathbb{E}[\tau_F(\mathcal{G}_k)]/k \le 0.945599655$, below the conjectured threshold. Central to the results are closed-form formulas for the recovery parameters $\alpha_F$ and $\alpha_N$, the Gruica balance formula for $\mathbb{E}[\tau_P(\mathcal{G})]$, and a detailed asymptotic analysis of balanced quasi-arcs and their multiplicities. The work advances practical DNA storage by offering concrete geometric code constructions with provable improvements in random-access efficiency, and it provides a roadmap for extending these ideas to larger dimensions and multi-point retrievals.
Abstract
Effective and reliable data retrieval is critical for the feasibility of DNA storage, and improving random access efficiency plays a key role in its practicality and reliability. In this paper, we study the Random Access Problem, which asks for the expected number of samples needed to recover an information strand. Unlike previous work, we take a geometric approach to the problem, aiming to understand which geometric structures, namely balanced quasi-arcs, lead to codes that perform well in terms of reducing the random access expectation. As a consequence, two main results are obtained. The first is a construction for $k=3$ that outperforms previous constructions aiming to reduce the random access expectation. The second, exploiting a result from~\cite{gruica2024reducing}, is the proof of a conjecture from~\cite{bar2023cover} for rate $1/2$ codes in any dimension.
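To make the quantity studied in the Random Access Problem concrete, the following is a minimal Monte Carlo sketch, not the paper's method. It assumes a toy model that the abstract does not spell out: encoded strands are the columns of a generator matrix over $\mathbb{F}_2$ (stored as integer bitmasks), samples are drawn uniformly at random with replacement, and an information strand counts as recovered once its unit vector lies in the span of the columns drawn so far. All function names are hypothetical.

```python
import random


def add_to_basis(basis, vec):
    """Insert vec into a reduced F_2 basis keyed by leading-bit position."""
    v = vec
    while v:
        lead = v.bit_length() - 1
        if lead not in basis:
            basis[lead] = v
            return
        v ^= basis[lead]  # cancel the leading bit and keep reducing


def in_span(basis, target):
    """Return True if target lies in the F_2-span of the basis vectors."""
    v = target
    while v:
        lead = v.bit_length() - 1
        if lead not in basis:
            return False
        v ^= basis[lead]
    return True


def samples_until_recovery(columns, target, rng):
    """Draw columns uniformly with replacement until target is recoverable."""
    basis = {}
    draws = 0
    while True:
        draws += 1
        add_to_basis(basis, rng.choice(columns))
        if in_span(basis, target):
            return draws


def estimate_expectation(columns, target, trials=20000, seed=0):
    """Monte Carlo estimate of the expected number of samples (toy model)."""
    full = {}
    for c in columns:
        add_to_basis(full, c)
    if not in_span(full, target):
        raise ValueError("target is not recoverable from the given columns")
    rng = random.Random(seed)
    total = sum(samples_until_recovery(columns, target, rng) for _ in range(trials))
    return total / trials


# Assumed toy codes: an uncoded k=3 identity code and a k=2 code with one parity strand.
identity_k3 = [0b001, 0b010, 0b100]
parity_k2 = [0b01, 0b10, 0b11]

est_identity = estimate_expectation(identity_k3, 0b001)
est_parity = estimate_expectation(parity_k2, 0b01)
print(est_identity, est_parity)
```

In this toy model the uncoded identity code needs $k=3$ samples in expectation, since the waiting time for one specific strand out of three is geometric. For the $k=2$ parity example a direct calculation gives exactly $2$, so the estimates should land close to those values. Beating $k$, as the constructions in the paper do, requires a more careful choice of columns than these examples.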
