Error Exponents for DNA Storage Codes with a Variable Number of Reads

Yan Hao Ling, Nir Weinberger, Jonathan Scarlett

TL;DR

This work studies error exponents for DNA storage codes when the number of reads is variable and chosen adaptively by the decoder. Using a concatenated, index-based inner/outer coding framework, the authors prove an achievability result: the error probability satisfies $P_e \le p^{(\delta+o(1))M}$ with an average number of reads $\overline{N} \le (c+o(1))M$, where $c = \log \frac{1}{1-R_0-\delta}$, so that $-\log P_e$ grows as $\Theta(M\log(1/p))$. In certain parameter regimes, they also establish matching converse bounds under both strong and weak adversaries, showing that the same constant $c$ bounds the required average number of reads from below, i.e., $\overline{N} \ge (c-o(1))M$. The results indicate that a variable number of reads considerably reduces the error probability compared to a fixed number of reads, improving not only the constants in the error exponent but also the scaling laws. The work quantifies the benefit of adaptive sampling in DNA storage and provides rigorous achievability and converse guarantees under adversarial sequencing errors.

Abstract

In this paper, we study error exponents for a concatenated-coding-based class of DNA storage codes in which the number of reads performed can be variable. That is, the decoder can sequentially perform reads and choose whether to output the final decision or take more reads, and we are interested in minimizing the average number of reads performed rather than a fixed pre-specified value. We show that this flexibility leads to a considerable reduction in the error probability compared to a fixed number of reads, not only in terms of constants in the error exponent but also in the scaling laws. This is shown via an achievability result for a suitably designed protocol, and in certain parameter regimes we additionally establish a matching converse that holds for all protocols within a broader concatenated-coding-based class.

Paper Structure

This paper contains 21 sections, 11 theorems, 55 equations, 1 figure.

Key Result

Theorem 7

Let $c = \log \frac{1}{1-R_0-\delta}$ for any $\delta \in (0,1-R_0)$. Under Assumptions def:adversary1 and def:adversary2, there exists a protocol with error probability $P_e \le p^{(\delta + o(1)) M}$ and an average number of reads $\overline{N} \le (c + o(1))\cdot M$ when the message is chosen uni
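To get a feel for the trade-off in Theorem 7, the leading-order quantities can be evaluated numerically. The sketch below is illustrative only: the function name `leading_order_tradeoff` is hypothetical, the $o(1)$ terms are dropped, and the logarithm base for $c$ is taken to be natural by default, which is an assumption since the excerpt does not state the base.

```python
import math


def leading_order_tradeoff(R0, delta, M, p, log_base=math.e):
    """Evaluate the Theorem 7 quantities with all o(1) terms dropped.

    Returns (c, avg_reads_bound, log_error_bound), where
      c               = log(1 / (1 - R0 - delta))   (base `log_base`, an assumption)
      avg_reads_bound = c * M                       (leading-order bound on the average reads)
      log_error_bound = delta * M * ln(p)           (natural log of p^(delta * M))
    """
    if not 0.0 < delta < 1.0 - R0:
        raise ValueError("delta must lie in (0, 1 - R0)")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    c = math.log(1.0 / (1.0 - R0 - delta), log_base)
    avg_reads_bound = c * M
    # Work in the log domain: p^(delta * M) underflows quickly for large M.
    log_error_bound = delta * M * math.log(p)
    return c, avg_reads_bound, log_error_bound


# Illustrative parameter values (not taken from the paper).
c, reads, log_pe = leading_order_tradeoff(R0=0.5, delta=0.25, M=1000, p=0.01)
print(f"c = {c:.4f}, average reads <= {reads:.1f}, ln(P_e) <= {log_pe:.1f}")
```

Increasing `delta` makes the error bound decay faster but also raises `c`, which diverges as `delta` approaches `1 - R0`.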

Figures (1)

  • Figure 1: Achievable error exponents for various $R_0$ values (dashed), and a matching converse in the low-$\delta$ regime (solid).

Theorems & Definitions (24)

  • Definition 1
  • Definition 2
  • Definition 3
  • Theorem 7
  • Theorem 8
  • Theorem 9
  • Lemma 10
  • proof
  • Lemma 11
  • proof
  • ...and 14 more