Old and New Results on Alphabetic Codes

Roberto Bruno, Roberto De Prisco, Ugo Vaccaro

TL;DR

The paper surveys alphabetic codes and their close connection to comparison-based search. It covers the classical algorithms for constructing optimal codes together with their time complexities ($O(n^3)$ for Gilbert–Moore, $O(n^2)$ for Knuth, and $O(n\log n)$ for Hu–Tucker and Garsia–Wachs), as well as the existence conditions due to Yeung, Nakatsu, and Sheinwald. It then reviews variations and generalizations, including height limits, partial orders, AIFV variants, and $k$-ary trees, along with linear-time solutions for special cases and extensions to nonstandard objective functions. The survey also collects upper bounds on the average code length $E[C]$ of the form $H(P)$ plus correction terms, where $H(P)$ is the entropy of the source distribution, covering both early bounds and later refinements (e.g., $E[C]<H(P)+2$, $E[C]\le H(P)+1-p_1-p_n$ for dyadic distributions). Finally, it discusses applications in data compression, routing, and search, and identifies open problems and directions for future research.

Abstract

This comprehensive survey examines the field of alphabetic codes, tracing their development from the 1960s to the present day. We explore classical alphabetic codes and their variants, analyzing their properties and the underlying mathematical and algorithmic principles. The paper covers the fundamental relationship between alphabetic codes and comparison-based search procedures and their applications in data compression, routing, and testing. We review optimal alphabetic code construction algorithms, necessary and sufficient conditions for their existence, and upper bounds on the average code length of optimal alphabetic codes. The survey also discusses variations and generalizations of the classical problem of constructing minimum average length alphabetic codes. By elucidating both classical results and recent findings, this paper aims to serve as a valuable resource for researchers and students, concluding with promising future research directions in this still-active field.
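To make the central optimization problem concrete, the following is a minimal sketch of a cubic-time interval dynamic program that computes the minimum average length of an alphabetic code, in the spirit of the $O(n^3)$ approach attributed to Gilbert–Moore. It is not taken from the paper: the function names `optimal_alphabetic_cost` and `entropy` are hypothetical, and the recurrence is the standard one for ordered binary trees whose leaves are the symbols in their given order.

```python
import math
from itertools import accumulate


def optimal_alphabetic_cost(p):
    """Minimum of sum(p[i] * length[i]) over binary alphabetic codes.

    Assumption: p lists the symbol probabilities in the required
    (alphabetic) order. Runs in O(n^3) time and O(n^2) space.
    """
    n = len(p)
    if n == 0:
        return 0.0
    # prefix[k] holds p[0] + ... + p[k-1], so any range sum is O(1).
    prefix = [0.0] + list(accumulate(p))
    # cost[i][j] is the optimal cost of a subtree over symbols i..j.
    cost = [[0.0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            weight = prefix[j + 1] - prefix[i]
            # Try every split point; the order of the leaves is preserved.
            best = min(cost[i][k] + cost[k + 1][j] for k in range(i, j))
            # Placing a root above both halves adds one bit to each leaf.
            cost[i][j] = best + weight
    return cost[0][n - 1]


def entropy(p):
    """Shannon entropy H(P) in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)


if __name__ == "__main__":
    p = [0.1, 0.8, 0.1]
    print(optimal_alphabetic_cost(p), entropy(p))
```

For `p = [0.1, 0.8, 0.1]` the order constraint forces the heavy middle symbol to depth 2, so the optimal alphabetic cost is about 1.9 bits, while an unrestricted prefix code could reach 1.2 bits; the value still lies below $H(P)+2$, consistent with the bound quoted in the TL;DR.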

Paper Structure

This paper contains 21 sections, 12 theorems, 43 equations, 21 figures, 3 tables, 6 algorithms.

Key Result

theorem 1

Let $S=\{s_1, \ldots, s_m\}$ be a set of elements, ordered according to a given total order relation $\prec$, that is, for which it holds that $s_1\prec \dots \prec s_m$. Any algorithm $\cal A$ that successfully determines the value of an arbitrary unknown $x\in S$, by means of the execution of test

Figures (21)

  • Figure 1: Initial forest. Probabilities are multiplied by 100 to ease the drawing and the reading.
  • Figure 2: List of nodes after step 1 of the Hu-Tucker algorithm.
  • Figure 3: List of nodes after step 2 of the Hu-Tucker algorithm.
  • Figure 4: List of nodes after step 3 of the Hu-Tucker algorithm.
  • Figure 5: List of nodes after step 4 of the Hu-Tucker algorithm.
  • ...and 16 more figures

Theorems & Definitions (17)

  • definition 1
  • theorem 1: AWaigner
  • definition 2: yeung1991alphabetic
  • theorem 2: yeung1991alphabetic
  • definition 3
  • definition 4: nakatsu1991bounds
  • theorem 3: nakatsu1991bounds
  • definition 5: sheinwald1992binary
  • theorem 4: sheinwald1992binary
  • theorem 5: depriscopersiano
  • ...and 7 more