Robust Composite DNA Storage under Sampling Randomness, Substitution, and Insertion-Deletion Errors
Busra Tegin, Tolga M Duman
TL;DR
This work represents composite DNA letters, each defined by a quadruple of nucleotide probabilities, as points on the three-dimensional probability simplex, and models DNA storage as a multinomial channel with sampling randomness. The authors derive transition probabilities and log-likelihood ratios (LLRs) for the constellation points and protect the data with LDPC codes; they extend the framework to substitution and insertion-deletion (ID) errors via constellation update rules. Numerical results show that practical LDPC codes achieve near-zero block error rate (BLER) with moderate sampling and remain effective under realistic substitution and ID scenarios. The approach leverages the increased capacity of composite letters while maintaining compatibility with standard channel codes, offering a practical path toward high-density, reliable DNA data storage.
Abstract
DNA data storage offers a high-density, long-term alternative to traditional storage systems, addressing the exponential growth of digital data. Composite DNA extends this paradigm by leveraging mixtures of nucleotides to increase storage capacity beyond the four standard bases. In this work, we model composite DNA storage as a multinomial channel and draw an analogy to digital modulation by representing composite letters on the three-dimensional probability simplex. To mitigate errors caused by sampling randomness, we derive transition probabilities and log-likelihood ratios (LLRs) for each constellation point and employ practical channel codes for error correction. We then extend this framework to substitution and insertion-deletion (ID) channels, proposing constellation update rules that account for these additional impairments. Numerical results demonstrate that our approach achieves reliable performance with existing LDPC codes, whereas prior schemes designed for limited-magnitude probability errors degrade significantly under sampling randomness.
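As an illustration of the multinomial channel view, the following minimal Python sketch computes per-bit LLRs from the read counts of one composite letter. It is a toy under stated assumptions, not the authors' implementation: the four-point constellation, its two-bit labels, the equiprobable-letter prior, and all function names are hypothetical choices made for this example.

```python
import math
import random

# Hypothetical toy constellation: each point is a probability vector over
# (A, C, G, T). These values and the 2-bit labels are assumptions for
# illustration only; they are not taken from the paper.
CONSTELLATION = [
    (0.7, 0.1, 0.1, 0.1),
    (0.1, 0.7, 0.1, 0.1),
    (0.1, 0.1, 0.7, 0.1),
    (0.1, 0.1, 0.1, 0.7),
]
LABELS = [(0, 0), (0, 1), (1, 1), (1, 0)]


def sample_counts(probs, n_reads, rng=random):
    """Simulate sampling randomness: draw n_reads nucleotides from one letter."""
    draws = rng.choices(range(4), weights=probs, k=n_reads)
    return tuple(draws.count(base) for base in range(4))


def log_multinomial_pmf(counts, probs):
    """Log of the multinomial probability of observing counts given probs."""
    log_p = math.lgamma(sum(counts) + 1)
    for k, p in zip(counts, probs):
        log_p -= math.lgamma(k + 1)
        if k > 0:
            if p == 0.0:
                return float("-inf")  # impossible observation
            log_p += k * math.log(p)
    return log_p


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def bit_llrs(counts, constellation, labels):
    """Per-bit LLR log P(bit=0 | counts) / P(bit=1 | counts).

    Assumes all constellation points are equally likely a priori.
    """
    log_liks = [log_multinomial_pmf(counts, p) for p in constellation]
    llrs = []
    for b in range(len(labels[0])):
        l0 = logsumexp([ll for ll, lab in zip(log_liks, labels) if lab[b] == 0])
        l1 = logsumexp([ll for ll, lab in zip(log_liks, labels) if lab[b] == 1])
        llrs.append(l0 - l1)
    return llrs
```

For example, `bit_llrs(sample_counts(CONSTELLATION[0], 10), CONSTELLATION, LABELS)` yields two soft values that could be passed to a soft-decision LDPC decoder; a positive LLR favors bit 0 under the sign convention assumed here.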
