On the Capacity of Insertion Channels for Small Insertion Probabilities
Busra Tegin, Tolga M Duman
TL;DR
This work analyzes the binary insertion channel in the regime of small insertion probability $\alpha$, deriving the capacity expansion $C(\alpha) = 1 + \alpha \log(\alpha) + G_1 \alpha + \mathcal{O}(\alpha^{3/2-\epsilon})$ with $G_1 \approx 0.4901$. The authors decompose the rate into entropy terms and compute the leading ones using a run-length framework, a modified insertion process, and a perturbed process to bound ambiguities. Achievability is established with i.i.d. Bernoulli$(1/2)$ inputs, while the converse uses stationary ergodic inputs and run-length truncation to show that the first two terms are tight. The result is a capacity approximation for small $\alpha$ that is accurate up to the $\mathcal{O}(\alpha^{3/2-\epsilon})$ remainder, with potential extensions to nonbinary alphabets and related synchronization-error channels relevant to DNA storage and data reconstruction.
Abstract
Channels with synchronization errors, such as deletion and insertion errors, are crucial in DNA storage, data reconstruction, and other applications. These errors introduce memory to the channel, complicating its capacity analysis. This paper analyzes binary insertion channels for small insertion probabilities, identifying dominant terms in the capacity expansion and establishing capacity in this regime. Using Bernoulli(1/2) inputs for achievability and a converse based on the use of stationary and ergodic processes, we demonstrate that capacity closely aligns with achievable rates using independent and identically distributed (i.i.d.) inputs, differing only in higher-order terms.
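To get a feel for the size of the leading terms, the expansion quoted in the TL;DR, $C(\alpha) \approx 1 + \alpha \log(\alpha) + G_1 \alpha$, can be evaluated numerically. The following is a minimal sketch, not the authors' code: it assumes the logarithm is base 2 (capacity in bits per channel use), which the summary does not state, and the function name `capacity_approx` is hypothetical. It drops the $\mathcal{O}(\alpha^{3/2-\epsilon})$ remainder, so the values are only meaningful for small $\alpha$.

```python
import math

# Coefficient of the linear term, as quoted in the TL;DR.
G1 = 0.4901


def capacity_approx(alpha: float, g1: float = G1) -> float:
    """Evaluate 1 + alpha*log2(alpha) + g1*alpha.

    ASSUMPTION: base-2 logarithm (bits per channel use); the summary
    does not state the base. The O(alpha^(3/2 - eps)) remainder is
    dropped, so this is only a small-alpha approximation.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if alpha == 0.0:
        # alpha*log2(alpha) tends to 0 as alpha -> 0, leaving capacity 1.
        return 1.0
    return 1.0 + alpha * math.log2(alpha) + g1 * alpha


if __name__ == "__main__":
    # Print the approximation for a few small insertion probabilities.
    for alpha in (1e-4, 1e-3, 1e-2):
        print(f"alpha = {alpha:g}: C ~ {capacity_approx(alpha):.6f}")
```

The $\alpha \log(\alpha)$ term is negative and dominates $G_1 \alpha$ for small $\alpha$, so the approximation falls below 1 and decreases as $\alpha$ grows within this regime.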
