Worst-case optimal adaptive alphabetic prefix-free coding

Travis Gagie

TL;DR

This work addresses one-pass adaptive alphabetic prefix-free coding with a block-based, GM59-inspired scheme that preserves lexicographic order and is worst-case optimal in both time and compression for $\sigma \in o(n^{1/2}/\log n)$. The method builds a sequence of alphabetic codes over blocks of the input, each from a distribution that blends the observed symbol frequencies with a uniform baseline, and it supports constant-time encoding and decoding per character via precomputed lookup tables. The paper proves an overall bound of $nH + O(n)$ bits, refined to $n(H+2+o(1)) + O((\sigma \log \max(n,\sigma))^2)$ bits, with $O(n+\sigma \log n)$ total time. This comes to about $H + 2$ bits per character in the stated regime, and the per-character time remains constant for alphabets as large as $\sigma \in O(n/\log n)$. The results extend prior work on adaptive alphabetic coding by achieving worst-case optimality in both time and compression whenever $\sigma \in o(n^{1/2}/\log n)$, although the scheme is at present of theoretical rather than practical interest.
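To make the code-building step concrete, the following is a minimal Python sketch of one alphabetic code built from a blended distribution. It assumes that GM59 refers to Gilbert and Moore's 1959 construction, in which a symbol with probability $p$ receives the first $\lceil \log_2 (1/p) \rceil + 1$ bits of the midpoint of its cumulative-probability interval. The add-one blend with the uniform distribution, the function names `blended_distribution` and `gilbert_moore_code`, and the sample counts are illustrative assumptions; the paper's block schedule and lookup tables are not reproduced here.

```python
from fractions import Fraction


def blended_distribution(counts):
    """Blend observed frequencies with a uniform baseline.

    Add-one smoothing is an assumed choice, not the paper's exact weighting.
    """
    total = sum(counts)
    sigma = len(counts)
    return [Fraction(c + 1, total + sigma) for c in counts]


def gilbert_moore_code(probs):
    """Return alphabetic prefix-free codewords, one per symbol in alphabet order."""
    codewords = []
    cumulative = Fraction(0)
    for p in probs:
        # Smallest k with 2**-k <= p, i.e. k = ceil(log2(1/p)), computed exactly.
        k = 0
        while Fraction(1, 2 ** k) > p:
            k += 1
        length = k + 1
        midpoint = cumulative + p / 2
        # Take the first `length` bits of the binary expansion of the midpoint.
        value = int(midpoint * 2 ** length)
        codewords.append(format(value, f"0{length}b"))
        cumulative += p
    return codewords


counts = [5, 1, 1, 1]  # hypothetical counts for symbols a < b < c < d
codes = gilbert_moore_code(blended_distribution(counts))
print(codes)
```

For these counts the blended probabilities are $1/2, 1/6, 1/6, 1/6$ and the codewords are `01`, `1001`, `1100`, `1110`: no codeword is a prefix of another, and their lexicographic order matches the alphabet order. Because every symbol keeps nonzero probability under the blend, each codeword has finite length even for symbols not yet seen.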

Abstract

We give the first algorithm for adaptive alphabetic prefix-free coding that is worst-case optimal in terms of time and compression when $\sigma \in o \left( \frac{n^{1 / 2}}{\log n} \right)$, where $\sigma$ is the size of the alphabet and $n$ is the length of the input.

Paper Structure

This paper contains 4 sections, 1 theorem, 8 equations, 1 table.

Key Result

Theorem 1

Our algorithm for adaptive alphabetic prefix-free coding encodes $S [1..n]$ using $n (H + 2 + o (1)) + O \left( (\sigma \log \max (n, \sigma))^2 \right)$ bits, which is at most about $H + 2$ bits per character when $\sigma \in o \left( \frac{n^{1 / 2}}{\log n} \right)$, and takes constant time per character for encoding and decoding when $\sigma \in O (n / \log n)$.
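The step from the refined bound quoted in the TL;DR to "about $H + 2$ bits per character" can be checked directly. The following short derivation is a sketch that assumes $n$ is large enough that $\sigma \le n$, so $\max (n, \sigma) = n$:

```latex
\begin{align*}
\sigma \in o\!\left( \frac{n^{1/2}}{\log n} \right)
  &\;\Longrightarrow\; \sigma \log n \in o\!\left( n^{1/2} \right) \\
  &\;\Longrightarrow\; (\sigma \log n)^2 \in o(n), \\[4pt]
n \,(H + 2 + o(1)) + O\!\left( (\sigma \log \max(n, \sigma))^2 \right)
  &= n \,(H + 2 + o(1)) + o(n) \\
  &= n \,(H + 2 + o(1)).
\end{align*}
```

Dividing by $n$ gives $H + 2 + o (1)$ bits per character, so the additive term is absorbed exactly when the alphabet is small relative to $n^{1/2} / \log n$.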

Theorems & Definitions (1)

  • Theorem 1