Entropic Bounds on the Average Length of Codes with a Space
Roberto Bruno, Ugo Vaccaro
TL;DR
The paper studies prefix-free codes in which a designated space symbol may appear only at the end of codewords. It gives a linear-time algorithm for constructing almost-optimal space-ending codes, whose average length is within one unit of the minimum, by exploiting a link to $k$-ary one-to-one codes. It also derives lower and upper bounds on the average length of optimal space-ending codes in terms of the source entropy $H_k(\mathbf{p})$ and the alphabet size $k$, including refinements that depend on the largest symbol probability $p_1$. The work clarifies the trade-offs introduced by the space constraint, shows that the constraint’s impact diminishes as $k$ grows, and leaves open the construction of truly optimal codes, possibly via dynamic programming.
Abstract
We consider the problem of constructing prefix-free codes in which a designated symbol, a space, can only appear at the end of codewords. We provide a linear-time algorithm to construct almost-optimal codes with this property, meaning that their average length differs from the minimum possible by at most one. We obtain our results by uncovering a relation between our class of codes and the class of one-to-one codes. Additionally, we derive upper and lower bounds on the average length of optimal prefix-free codes with a space in terms of the source entropy.
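The relation to one-to-one codes can be illustrated with a minimal sketch. It is not the authors' algorithm, only one natural construction under stated assumptions: symbols are sorted by decreasing probability, the $i$-th symbol receives the $i$-th $k$-ary string in length-then-lexicographic order (starting from the empty string, which is an assumption here), and a space is appended to every string. Distinct strings followed by a terminator that occurs nowhere else form a prefix-free code. The function names, the `_` stand-in for the space, and the restriction to $k \le 10$ are all hypothetical choices for this example, and the sort makes the sketch $O(n \log n)$ rather than linear.

```python
import math
from itertools import count, product


def kary_entropy(p, k):
    """Entropy of the distribution p, measured in base-k units."""
    return -sum(x * math.log(x, k) for x in p if x > 0)


def shortlex_strings(k):
    """Yield all strings over the digits 0..k-1 (k <= 10), shortest first.

    The empty string comes first; this is an assumption of the sketch.
    """
    for length in count(0):
        for digits in product(range(k), repeat=length):
            yield "".join(str(d) for d in digits)


def space_ending_code(p, k, space="_"):
    """Give shorter k-ary strings to likelier symbols, then append the space."""
    order = sorted(range(len(p)), key=lambda i: -p[i])
    code = [None] * len(p)
    for i, word in zip(order, shortlex_strings(k)):
        code[i] = word + space
    return code


def is_prefix_free(code):
    """True if no codeword is a prefix of a codeword at another index."""
    return not any(
        i != j and b.startswith(a)
        for i, a in enumerate(code)
        for j, b in enumerate(code)
    )


def average_length(p, code):
    """Expected codeword length, counting the trailing space as one symbol."""
    return sum(x * len(w) for x, w in zip(p, code))


# Hypothetical source with four symbols and a binary alphabet plus space.
p = [0.4, 0.3, 0.2, 0.1]
code = space_ending_code(p, k=2)
print(code, average_length(p, code), kary_entropy(p, 2))
```

For this example distribution the codewords are `_`, `0_`, `1_`, `00_`, with average length 1.7. Every codeword here ends with the space, whereas the codes studied in the paper only require that the space, if present, be the last symbol, so this construction need not be optimal.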
