Capacity Approximations for Insertion Channels with Small Insertion Probabilities
Busra Tegin, Tolga M Duman
TL;DR
The paper derives capacity approximations for binary insertion channels with small insertion probabilities, extending to insertions the small-probability capacity expansion previously established for the deletion channel. By decomposing the mutual information rate and exploiting run-lengths, it establishes the first two terms of the capacity expansion for two insertion models, showing that i.i.d. Bernoulli(1/2) inputs achieve these terms and providing matching converses. The main contributions are explicit formulas and constants for the simple insertion channel (G1 ≈ 0.4901) and the Gallager insertion channel (G2 ≈ −0.5865), along with rigorous upper-bound techniques based on perturbed insertion processes and run-length truncation. These results offer the first rigorous capacity approximations for insertion channels in the small-probability regime, with implications for DNA storage and synchronization-error analysis.
Abstract
Channels with synchronization errors, which exhibit deletion and insertion errors, find practical applications in DNA storage, data reconstruction, and various other domains. The presence of insertions and deletions gives the channel memory, complicating capacity analysis. For instance, although the independent and identically distributed (i.i.d.) deletion channel was formulated more than fifty years ago and was proven to be information stable, so that its Shannon capacity exists, calculation of the capacity has remained elusive. However, a relatively recent result establishes the capacity of the deletion channel in the asymptotic regime of small deletion probabilities by computing the dominant terms of the capacity expansion. This paper extends that result to binary insertion channels, determining the dominant terms of the channel capacity for small insertion probabilities and thereby establishing capacity in this asymptotic regime. Specifically, we consider two i.i.d. insertion channel models: the insertion channel with possible random bit insertions after every transmitted bit, and the Gallager insertion model, in which a bit is replaced by two random bits with a certain probability. To prove our results, we build on methods used for the deletion channel, employing Bernoulli(1/2) inputs for achievability and coupling this with a converse that uses stationary and ergodic processes as inputs. We show that the channel capacity differs from the rates achievable with i.i.d. inputs only in the higher-order terms. The results show, for instance, that the capacity of the random insertion channel is higher than that of the Gallager insertion channel, and they quantify the difference in the asymptotic regime.
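To make the two channel models concrete, the following Python sketch simulates them as described in the abstract. It is illustrative only and is not taken from the paper: the function names, the parameter name `alpha` for the insertion probability, and the reading of the first model as "at most one uniformly random bit inserted after each transmitted bit" are assumptions.

```python
import random


def bernoulli_half_input(n, rng):
    """Draw n i.i.d. Bernoulli(1/2) input bits."""
    return [rng.randint(0, 1) for _ in range(n)]


def simple_insertion_channel(bits, alpha, rng):
    """Assumed simple insertion model: after each transmitted bit,
    insert one uniformly random bit with probability alpha."""
    out = []
    for b in bits:
        out.append(b)
        if rng.random() < alpha:
            out.append(rng.randint(0, 1))
    return out


def gallager_insertion_channel(bits, alpha, rng):
    """Assumed Gallager insertion model: with probability alpha,
    replace the transmitted bit by two uniformly random bits;
    otherwise pass the bit through unchanged."""
    out = []
    for b in bits:
        if rng.random() < alpha:
            out.append(rng.randint(0, 1))
            out.append(rng.randint(0, 1))
        else:
            out.append(b)
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    x = bernoulli_half_input(20, rng)
    print("input   :", x)
    print("simple  :", simple_insertion_channel(x, 0.1, rng))
    print("gallager:", gallager_insertion_channel(x, 0.1, rng))
```

In both sketches each insertion event lengthens the output by one bit. The difference is that the simple model keeps the transmitted bit, while the Gallager model overwrites it with random bits. This is consistent with the abstract's statement that the random insertion channel has the higher capacity.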
