
M-DAB: An Input-Distribution Optimization Algorithm for Composite DNA Storage by the Multinomial Channel

Adir Kobovich, Eitan Yaakobi, Nir Weinberger

TL;DR

This work models a DNA storage channel with composite inputs as a multinomial channel, and proposes an optimization algorithm, termed multidimensional dynamic assignment Blahut-Arimoto (M-DAB), for its capacity-achieving input distribution, for an arbitrary number of output reads.

Abstract

Recent experiments have shown that the capacity of DNA storage systems may be significantly increased by synthesizing composite DNA letters. In this work, we model a DNA storage channel with composite inputs as a multinomial channel, and propose an optimization algorithm for its capacity-achieving input distribution, for an arbitrary number of output reads. The algorithm is termed multidimensional dynamic assignment Blahut-Arimoto (M-DAB), and it generalizes the DAB algorithm, which was proposed by Wesel et al. for the binomial channel. We also empirically observe a scaling-law behavior of the capacity as a function of the support size of the capacity-achieving input distribution.
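To make the channel model concrete, the following is a minimal Python sketch, not the authors' M-DAB implementation. It assumes that a composite letter is a point $x$ in the simplex $\Delta_k$, that the output is the vector of counts of each base over $n$ independent reads (so $P_{Y|X}$ is a multinomial distribution), and that the input support is a fixed finite list of points. On that fixed support it runs the classic Blahut-Arimoto iteration; the dynamic placement of mass points that distinguishes DAB and M-DAB is deliberately left out. The function names `compositions`, `multinomial_pmf`, and `blahut_arimoto` are hypothetical.

```python
import math


def compositions(n, k):
    """All length-k tuples of non-negative integers summing to n (the read-count outputs)."""
    if k == 1:
        return [(n,)]
    return [(i,) + rest for i in range(n + 1) for rest in compositions(n - i, k - 1)]


def multinomial_pmf(counts, x):
    """P(Y = counts | X = x) for sum(counts) independent reads of composite letter x."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)
    prob = float(coef)
    for c, xi in zip(counts, x):
        prob *= xi ** c  # note: 0.0 ** 0 evaluates to 1.0 in Python
    return prob


def blahut_arimoto(support, n, iters=1000, tol=1e-9):
    """Classic Blahut-Arimoto on a FIXED support (assumption: no mass-point relocation).

    Returns (weights, rate), where rate (in bits) is the mutual information of the
    last evaluated weights, a lower bound on the capacity restricted to this support.
    """
    outputs = compositions(n, len(support[0]))
    W = [[multinomial_pmf(y, x) for y in outputs] for x in support]
    p = [1.0 / len(support)] * len(support)
    rate = 0.0
    for _ in range(iters):
        # Output distribution induced by the current input weights.
        q = [sum(pi * row[j] for pi, row in zip(p, W)) for j in range(len(outputs))]
        # D[i] = KL(P_{Y|X=x_i} || P_Y) in bits.
        D = [sum(w * math.log2(w / qj) for w, qj in zip(row, q) if w > 0) for row in W]
        rate = sum(pi * di for pi, di in zip(p, D))
        if max(D) - rate < tol:  # gap between upper and lower capacity bounds
            break
        z = [pi * 2.0 ** di for pi, di in zip(p, D)]
        total = sum(z)
        p = [v / total for v in z]
    return p, rate


# Usage: the four pure bases (simplex vertices, k = 4) with a single read (n = 1).
vertices = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0),
            (0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]
weights, rate_bits = blahut_arimoto(vertices, n=1)
print(weights, rate_bits)
```

With only the four pure bases and one read, the channel is noiseless and the sketch returns 2 bits, which is the base-4 ceiling. Adding interior points of the simplex to `support` and increasing `n` is where composite letters can start to pay off.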

Paper Structure (9 sections, 4 theorems, 7 equations, 3 figures, 1 table, 1 algorithm)


Key Result

Lemma 1

Consider a channel with an input $X$ taking values in $\Delta_k$ for some $k>1$, and a discrete finite output alphabet $Y$. Assume the transition probability distribution function $x\to P_{Y|X}(y|x)$ is continuous for each $y\in Y$. Then, there exists a capacity-achieving input distribution (CAID) supported on fewer than $|Y|$ points in …

Figures (3)

  • Figure 1: The first iteration of the M-DAB algorithm for $C_{n=7, k=3}$. The simplex is an equilateral triangle, and the ordered simplex is a right-angled triangle at the bottom left. There are 3 mass points in the ordered simplex (blue 'o'). The color bar represents the KL divergence value at each point in the simplex, and the maximizer is marked with a purple 'x'.
  • Figure 2: The capacity achieved for $C_{n,k=4}$. Our method, M-DAB, is shown in blue, and the method used in [choi2019high] is shown in purple. Shown in black are the upper bounds for any base-$4$ and base-$15$ encoding.
  • Figure 3: The capacity as a function of the number of mass points in the minimal-support-size CAID. Different dimensions are shown in different colors. A scaling law can be observed, in which the capacity behaves as a logarithm of the number of mass points with a factor of $\frac{3}{4}$.
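
The KL divergence color-mapped in Figure 1 can plausibly be read as follows; this is an interpretation based on the standard theory of channel capacity, not a formula quoted from the paper. For an input distribution $P_X$ on $\Delta_k$ with induced output distribution $P_Y(y)=\sum_{x} P_X(x)P_{Y|X}(y|x)$, define

$$D(x) \triangleq \sum_{y\in Y} P_{Y|X}(y|x)\,\log\frac{P_{Y|X}(y|x)}{P_Y(y)}.$$

The usual Karush-Kuhn-Tucker condition states that $P_X$ is capacity-achieving if and only if

$$D(x)\le C \ \text{ for all } x\in\Delta_k, \qquad D(x)=C \ \text{ for every mass point } x \text{ of } P_X,$$

where $C$ is the capacity. Under this reading, a maximizer of $D(x)$ that exceeds the current rate (the purple 'x' in Figure 1) marks a point where the condition is violated, which makes it a natural candidate for adding or moving a mass point in the next iteration.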

Theorems & Definitions (5)

  • Lemma 1
  • Corollary 1
  • Lemma 2
  • Definition 1
  • Lemma 3