Overlapped Arithmetic Codes

Yong Fang

TL;DR

Overlapped arithmetic codes extend classical arithmetic coding by mapping source symbols to partially overlapping sub-intervals, which partitions the source space into unequal cosets and supports distributed and joint coding. The paper develops a coset cardinality spectrum (CCS) framework to quantify how this partitioning affects encoding and decoding, rate loss, and decoding complexity, and relates it to an asymptotic spectrum $f(u)$ and to several practical decoding strategies. It introduces backward-recursive CCS computation, intrinsic and extrinsic path metrics for decoding, and a coexisting-interval analysis that yields error rates when the block is only partially known. It further connects the CCS to the Hamming distance spectrum (HDS), proposes soft, hard, and fast approximations of the HDS, and reports experiments that support the analysis, in particular for low-complexity decoding.

Abstract

Arithmetic codes are usually deemed the most important means of implementing lossless source coding. Their principle is to map every source symbol to a sub-interval of [0, 1) whose length is exactly equal to the symbol's probability. Under this symbol-interval mapping rule, the interval [0, 1) is fully covered, and there is neither an overlapped sub-interval (one that corresponds to more than one source symbol) nor a forbidden sub-interval (one that corresponds to no source symbol). It is well known that source coding and channel coding are dual, so a good source code may also serve as a good channel code, and vice versa. Inspired by this duality, arithmetic codes can be generalized to address many coding problems beyond source coding by redefining the symbol-interval mapping rule. If every source symbol is mapped to an enlarged sub-interval, the sub-intervals of different source symbols partially overlap and we obtain overlapped arithmetic codes, which can realize distributed source coding. Conversely, if every source symbol is mapped to a narrowed sub-interval, [0, 1) contains one or more forbidden sub-intervals that correspond to no source symbol and we obtain forbidden arithmetic codes, which can implement joint source-channel coding. Finally, by allowing overlapped and forbidden sub-intervals to coexist, we obtain hybrid arithmetic codes, which can cope with distributed joint source-channel coding.
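The three mapping rules described in the abstract can be illustrated for a binary source with a small sketch. It is not taken from the paper: it assumes that symbol 0 occupies the lower end of [0, 1) with length $(1-p)^r$ and symbol 1 the upper end with length $p^r$, a parameterization suggested by the quantity $\alpha = (1-p)^r + p^r$ in the caption of Figure 3. The function names `symbol_intervals` and `classify` are hypothetical, and hybrid codes are not covered.

```python
def symbol_intervals(p, r=1.0):
    """Sub-intervals of [0, 1) for symbols 0 and 1 of a binary source.

    Assumed layout (illustrative, not confirmed by the source text):
    symbol 0 -> [0, (1-p)^r), symbol 1 -> [1 - p^r, 1), with p = Pr(X = 1).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if r <= 0.0:
        raise ValueError("r must be positive")
    len0 = (1.0 - p) ** r   # length assigned to symbol 0
    len1 = p ** r           # length assigned to symbol 1
    return (0.0, len0), (1.0 - len1, 1.0)


def classify(p, r, tol=1e-12):
    """Label the mapping as 'standard', 'overlapped', or 'forbidden'."""
    (_, high0), (low1, _) = symbol_intervals(p, r)
    gap = low1 - high0      # > 0: uncovered gap, < 0: shared region
    if gap > tol:
        return "forbidden"
    if gap < -tol:
        return "overlapped"
    return "standard"


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):
        print(r, symbol_intervals(0.25, r), classify(0.25, r))
```

Under this assumed layout, $r = 1$ reproduces the standard rule (lengths equal to probabilities), $r < 1$ enlarges both sub-intervals so that they overlap, and $r > 1$ narrows them so that a forbidden gap appears between them.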

Paper Structure

This paper contains 63 sections, 78 theorems, 298 equations, 26 figures, 7 tables, 8 algorithms.

Key Result

Theorem 1.1

Let $X$ be a discrete random variable with finite alphabet ${\cal X}$. Let $x^n\triangleq(x_1,\dots,x_n)$ be $n$ independent realizations of $X$. To compress $x^n$ with zero loss, the achievable rate $R$ (bits/symbol) is lower bounded by $H(X)$ as block length $n\to\infty$.
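As a small numerical illustration of the bound in Theorem 1.1, the sketch below evaluates $H(X)$ in bits per symbol for a given probability mass function. The helper name `entropy_bits` is hypothetical and is not part of the paper.

```python
import math


def entropy_bits(pmf):
    """Shannon entropy H(X) in bits/symbol of a finite probability mass function."""
    if any(q < 0 for q in pmf) or not math.isclose(sum(pmf), 1.0):
        raise ValueError("pmf must be non-negative and sum to 1")
    # Terms with zero probability contribute nothing (0 * log 0 is taken as 0).
    return -sum(q * math.log2(q) for q in pmf if q > 0)


if __name__ == "__main__":
    # Binary source with Pr(X = 1) = 1/3, the value used in Figures 4 and 5.
    print(entropy_bits([2 / 3, 1 / 3]))
```

For the binary source with $p = 1/3$ this gives about 0.918 bits/symbol, so no lossless code can use fewer bits per symbol than that on average as $n\to\infty$.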

Figures (26)

  • Figure 1: Four typical frameworks of source coding. (a) Symmetric CSC. (b) Symmetric DSC. (c) Asymmetric CSC. (d) Asymmetric DSC.
  • Figure 2: Diagram of the DJSCC problem for two terminals.
  • Figure 3: Generalized arithmetic codes for binary sources, where $p\triangleq\Pr(X=1)$. (a) Standard arithmetic codes to handle source coding. (b) Overlapped arithmetic codes to handle DSC (cf. Figure 1(b) and Figure 1(d)). (c), (d), and (e) Three variants of forbidden arithmetic codes to handle channel coding ($p=1/2$) or joint source-channel coding ($p\neq 1/2$), where $\alpha = (1-p)^r+p^r<1$. (f) Hybrid arithmetic codes to handle the DJSCC problem (cf. Figure 2).
  • Figure 4: An example of the mapping from $x^n\in\mathbb{B}^n$ to $[l,h)\subset[0,1)$, where $n=3$ and $p=\Pr(X=1)=1/3$.
  • Figure 5: Evolution of the sliding window $[\lambda:\eta]$ for different source blocks, where $n=3$, $p=1/3$, and $w=8$. Initially, $[\lambda:\eta]=[0:255]$. Depending on $x_i$, $[\lambda:\eta]$ is shrunk (and then renormalized, if needed). The bitstring to the left of $[\lambda:\eta]$, if any, contains the bits output by the encoder so far, and the number to the right of $[\lambda:\eta]$, if any, is the current number of underflow bits.
  • ...and 21 more figures
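
The block-to-interval mapping captioned in Figure 4 ($n=3$, $p=1/3$) can be sketched for standard, non-overlapped arithmetic coding as follows. The sketch assumes that symbol 0 takes the lower part of the current interval and symbol 1 the upper part; the figure itself is not reproduced here, so its ordering may differ. The function name `block_interval` is hypothetical, and exact rational arithmetic is used only to keep the endpoints readable.

```python
from fractions import Fraction


def block_interval(bits, p):
    """Map a binary block x^n to its interval [l, h) in [0, 1).

    Assumed convention: at each step symbol 0 keeps the lower fraction
    (1 - p) of the current interval and symbol 1 keeps the upper fraction p.
    """
    low, width = Fraction(0), Fraction(1)
    for b in bits:
        if b == 0:
            width *= 1 - p              # keep the lower part
        elif b == 1:
            low += width * (1 - p)      # skip the part owned by symbol 0
            width *= p                  # keep the upper part
        else:
            raise ValueError("bits must contain only 0 and 1")
    return low, low + width


if __name__ == "__main__":
    p = Fraction(1, 3)
    for block in [(0, 0, 0), (0, 1, 0), (1, 1, 1)]:
        l, h = block_interval(block, p)
        print(block, f"[{l}, {h})")
```

With $p = 1/3$, the block $(0,1,0)$ maps to $[4/9, 16/27)$, whose length $4/27$ equals the block probability $(2/3)(1/3)(2/3)$. The eight intervals for $n=3$ tile $[0,1)$ without overlap or gap, which is the property that overlapped and forbidden codes deliberately relax.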

Theorems & Definitions (160)

  • Definition 1.1: Entropy, Conditional Entropy, Joint Entropy, and Mutual Information
  • Theorem 1.1: Shannon's First Theorem
  • Definition 1.2: Channel Capacity
  • Theorem 1.2: Shannon's Second Theorem
  • Theorem 1.3: Slepian-Wolf Theorem
  • Theorem 1.4: Noisy-Channel Slepian-Wolf Coding
  • Example 1.1
  • Definition 1.3: Normalized Interval
  • Lemma 1.5: Properties of Normalized Interval
  • Definition 1.4: Raw Arithmetic Bitstream
  • ...and 150 more