The Capacity of the Weighted Read Channel

Omer Yerushalmi, Tuvi Etzion, Eitan Yaakobi

TL;DR

The paper analyzes the capacity of the $(\ell,\delta)$-read channel, motivated by nanopore sequencing, and provides an explicit formula in the regime $\delta < \ell < 2\delta$: $\mathsf{cap}(\ell,\delta) = \frac{1}{\delta} \log_2 \frac{\ell+1 + \sqrt{(\ell+1)^2 - 4(\ell-\delta)(\ell-\delta+1)}}{2}$. It also derives an upper bound for $\ell \ge 2\delta$ using a constrained automaton with $m$ states, giving $\mathsf{cap}(\ell,\delta) \le \frac{1}{\delta} \log_2 \frac{m-1 + \sqrt{(m-1)^2 + 4(m-1)}}{2}$ with $m = (\ell \bmod \delta)((-\ell) \bmod \delta) + \delta$. The work extends the one-dimensional model to two dimensions, introducing a $((\ell_1,\ell_2),(\delta_1,\delta_2))$-read matrix and showing that, via mappings $\mu$ and $\lambda$, the 2D capacity can be related to the 1D capacity and, in many regimes, reduces to 1D capacity expressions or to scaled binary capacities. These results provide exact formulas and practical bounds for read channels affected by inter-symbol interference (ISI) and inform code constructions for DNA storage systems employing nanopore-like reading mechanisms.
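
As a quick numerical check, the two closed-form expressions quoted above can be evaluated directly. The following is a minimal Python sketch. The function names `cap_exact`, `automaton_states`, and `cap_upper_bound` are illustrative and not taken from the paper, and the code only evaluates the formulas as stated in this summary without re-deriving them or verifying the regimes in which they hold.

```python
import math


def cap_exact(ell: int, delta: int) -> float:
    """Evaluate the closed-form capacity quoted for delta < ell < 2*delta."""
    if not (delta < ell < 2 * delta):
        raise ValueError("formula is stated only for delta < ell < 2*delta")
    a = ell + 1
    k = ell - delta
    disc = a * a - 4 * k * (k + 1)  # expression under the square root
    return math.log2((a + math.sqrt(disc)) / 2) / delta


def automaton_states(ell: int, delta: int) -> int:
    """m = (ell mod delta) * ((-ell) mod delta) + delta.

    Python's % returns a non-negative result for a positive modulus,
    which matches the usual mathematical meaning of 'mod' here.
    """
    return (ell % delta) * ((-ell) % delta) + delta


def cap_upper_bound(ell: int, delta: int) -> float:
    """Evaluate the quoted upper-bound expression for ell >= 2*delta."""
    if delta < 1 or ell < 2 * delta:
        raise ValueError("bound is stated only for ell >= 2*delta")
    t = automaton_states(ell, delta) - 1  # t = m - 1
    if t < 1:
        # Assumption of this sketch: reject m = 1, where the expression
        # would require log2(0).
        raise ValueError("expression is undefined for m = 1")
    return math.log2((t + math.sqrt(t * t + 4 * t)) / 2) / delta


if __name__ == "__main__":
    print(cap_exact(5, 3))
    print(automaton_states(8, 3), cap_upper_bound(8, 3))
```

For $\ell = 5$, $\delta = 3$ the exact expression simplifies to $\frac{1}{3}\log_2(3+\sqrt{3}) \approx 0.7475$. For $\ell = 8$, $\delta = 3$ the state count is $m = 2 \cdot 1 + 3 = 5$ and the bound evaluates to $\frac{1}{3}\log_2(2+2\sqrt{2}) \approx 0.7572$.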

Abstract

Nanopore sequencing is one of the primary sequencing methods gaining prominence in DNA storage, owing to various factors. In this work, we consider a simplified model of the sequencer, characterized as a channel. This channel takes a sequence and processes it using a sliding window of length $\ell$, shifting the window by $\delta$ characters each time. The output of this channel, which we refer to as the read vector, is a vector containing the sums of the entries in each of the windows. The capacity of the channel is defined as its maximal information rate. Previous works have already established capacity values for certain parameters $\ell$ and $\delta$. In this work, we show that when $\delta < \ell < 2\delta$, the capacity is given by $\frac{1}{\delta}\log_2 \frac{1}{2}\left(\ell+1+ \sqrt{(\ell+1)^2 - 4(\ell - \delta)(\ell-\delta+1)}\right)$. Additionally, we construct an upper bound when $2\delta < \ell$. Finally, we extend the model to the two-dimensional case and present several results on its capacity.
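
To make the channel model concrete, the sliding-window sum described above can be sketched in a few lines of Python. This is an illustration under a simplifying assumption: only windows that lie fully inside the sequence are used, starting at positions $0, \delta, 2\delta, \dots$ (0-indexed). The paper's formal definition may treat the sequence boundaries differently, and the name `read_vector` is hypothetical.

```python
from typing import List, Sequence


def read_vector(x: Sequence[int], ell: int, delta: int) -> List[int]:
    """Sum the entries of each length-ell window, shifting by delta each time.

    Assumption of this sketch: only windows fully contained in x are
    used; boundary handling in the paper's definition may differ.
    """
    if ell < 1 or delta < 1:
        raise ValueError("ell and delta must be positive integers")
    starts = range(0, len(x) - ell + 1, delta)
    return [sum(x[i:i + ell]) for i in starts]


if __name__ == "__main__":
    # Windows x[0:5] and x[3:8] overlap in ell - delta = 2 positions.
    print(read_vector([1, 1, 0, 1, 0, 0, 0, 1], ell=5, delta=3))  # [3, 2]
```

Because consecutive windows overlap in $\ell - \delta$ positions whenever $\delta < \ell$, distinct input sequences can produce the same read vector, which is why the information rate of the channel is a nontrivial quantity.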

Paper Structure (5 sections, 13 theorems, 76 equations, 6 figures, 1 table)

Key Result

Theorem 1

Let $\ell, \delta$ be integers. For every integer $q \geq 2$ it holds that,

Figures (6)

  • Figure 1: The graph $\mathcal{G}(\ell, \delta)$ for $\ell = 5$ and $\delta=3$.
  • Figure 2: The determinizing graph $\mathcal{H}(5, 3)$ of $\mathcal{G}(5,3)$ in Fig. 1.
  • Figure 3: The graph $\mathcal{H}^*(5, 3)$ of $\mathcal{H}(5,3)$ from Fig. 2.
  • Figure 4: Example for $\ell = 8$ and $\delta=3$. The second vector is obtained by applying $\phi_{\ell, \delta}^n$ to the first vector.
  • Figure 5: Example for the function $g$. The first vector represents the original vector, and the second vector is the result after applying the function $g$.
  • ...and 1 more figure

Theorems & Definitions (37)

  • Definition 1
  • Example 1
  • Definition 2
  • Theorem 1
  • Theorem 2: TR-CODE
  • Definition 3
  • Example 2
  • Definition 4
  • Definition 5
  • Example 3
  • ...and 27 more