Achievable Rates and Error Probability Bounds of Frequency-based Channels of Unlimited Input Resolution

Ran Tamir, Nir Weinberger

TL;DR

This work analyzes a molecular frequency-based channel with unlimited input resolution, where codewords are PMFs over an expanding set of object types and reading is performed by sampling with replacement. It develops two achievability bounds by adapting random coding and expurgation techniques to a non-memoryless, high-dimensional channel, yielding explicit exponents E_r(R,r_n) and E_ex(R,r_n) that describe error decay and achievable rates as the number of types n grows. The results show that positive rates are achievable in this unlimited-input regime and that expurgation improves performance at low rates, with comparisons to finite-input bounds illuminating fundamental differences between unlimited- and finite-resolution models. The analysis provides practical insight for high-density DNA storage and related molecular channels, and highlights open directions, such as removing technical Dirichlet restrictions and handling identification noise.

Abstract

We consider a molecular channel in which messages are encoded in the frequencies of objects in a pool, and whose output at reading time is a noisy version of the input frequencies, obtained by sampling with replacement from the pool. Motivated by recent DNA storage techniques, we focus on the regime in which the input resolution is unlimited. We propose two error probability bounds for this channel: the first is based on a random coding analysis of the error probability of the maximum likelihood decoder, and the second is derived by code expurgation techniques. We deduce an achievable bound on the capacity of this channel, and compare it both to the achievable bounds under limited input resolution and to a converse bound.
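The channel described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' construction: it assumes codewords are drawn uniformly from the probability simplex (a Dirichlet(1,...,1) draw), and the names `random_pmf`, `read_pool`, `ml_decode`, and the parameter `num_reads` are hypothetical choices made for illustration. The summary does not specify the random coding ensemble or how the number of reads relates to $r_n$, so both are left as free parameters here.

```python
import math
import random
from collections import Counter


def random_pmf(n, rng):
    """Draw a PMF over n object types (assumed ensemble: Dirichlet(1,...,1))."""
    g = [rng.gammavariate(1.0, 1.0) for _ in range(n)]
    total = sum(g)
    return [x / total for x in g]


def read_pool(pmf, num_reads, rng):
    """Sample with replacement from the pool and return per-type counts."""
    draws = rng.choices(range(len(pmf)), weights=pmf, k=num_reads)
    counts = Counter(draws)
    return [counts.get(i, 0) for i in range(len(pmf))]


def ml_decode(counts, codebook):
    """Return the index of the codeword with the largest multinomial log-likelihood."""
    def loglik(pmf):
        total = 0.0
        for c, p in zip(counts, pmf):
            if c > 0:
                # A type that was observed but has zero probability rules out the codeword.
                total += c * math.log(p) if p > 0 else -math.inf
        return total

    return max(range(len(codebook)), key=lambda m: loglik(codebook[m]))


if __name__ == "__main__":
    rng = random.Random(0)
    n, num_messages, num_reads = 8, 16, 400  # illustrative values only
    codebook = [random_pmf(n, rng) for _ in range(num_messages)]
    sent = 3
    observed = read_pool(codebook[sent], num_reads, rng)
    print("sent:", sent, "decoded:", ml_decode(observed, codebook))
```

The multinomial coefficient is omitted from the log-likelihood because it depends only on the observed counts and is therefore the same for every candidate codeword.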

Paper Structure

This paper contains 12 sections, 5 theorems, 82 equations, 2 figures.

Key Result

Theorem 1

Assume that $\{r_n\}_{n \geq 1}$ is a sequence that grows sub-exponentially fast. Then,

Figures (2)

  • Figure 1: The exponent functions $E_{\mathrm{r}}(R,r_n)$ and $E_{\mathrm{ex}}(R,r_n)$ as functions of $R$ for $r_n=400$.
  • Figure 2: A comparison between $R_{\mathrm{LB}}(r_n)$, $R_{\mathrm{LB}}^{\mathrm{FIR}}(g_n,r_n)$ for three $g_n$ values, and the converse expression $\frac{1}{2}\log(r_n)$.

Theorems & Definitions (5)

  • Theorem 1
  • Lemma 2
  • Theorem 3
  • Theorem 4
  • Proposition 5