Low-Complexity Vector Source Coding for Discrete Long Sequences with Unknown Distributions

Leah Woldemariam, Hang Liu, Anna Scaglione

TL;DR

This paper exploits the structure that arises within the spatial representation, encoding run-lengths within this representation using Golomb coding. The resulting overall bit rate approaches the entropy without requiring a computationally complex encoding algorithm.

Abstract

In this paper, we propose a source coding scheme that represents data from unknown distributions through frequency and support information. Existing encoding schemes often compress data by sacrificing computational efficiency or by assuming the data follows a known distribution. We take advantage of the structure that arises within the spatial representation and use it to encode run-lengths within this representation using Golomb coding. Through theoretical analysis, we show that our scheme yields an overall bit rate that approaches the entropy without a computationally complex encoding algorithm, and we verify these results through numerical experiments.
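To make the run-length and Golomb coding step concrete, the following is a minimal sketch of textbook Golomb coding applied to the zero-runs of a binary support vector. It is not the authors' implementation: the function names `run_lengths` and `golomb_encode`, the bit-string output, and the choice to ignore trailing zeros are all assumptions made for illustration.

```python
def run_lengths(bits):
    """Return the length of the zero-run preceding each 1 in a binary sequence.

    Assumption: zeros after the last 1 are ignored here; a full codec would
    need to signal them (or the total length) separately.
    """
    runs, count = [], 0
    for b in bits:
        if b:
            runs.append(count)
            count = 0
        else:
            count += 1
    return runs


def golomb_encode(n, m):
    """Golomb codeword, as a string of '0'/'1', for an integer n >= 0 with parameter m >= 1."""
    if n < 0 or m < 1:
        raise ValueError("need n >= 0 and m >= 1")
    q, r = divmod(n, m)
    code = "1" * q + "0"          # quotient in unary, terminated by a 0
    if m == 1:
        return code               # no remainder bits are needed when m = 1
    b = (m - 1).bit_length()      # b = ceil(log2(m))
    cutoff = (1 << b) - m         # remainders below the cutoff use b - 1 bits
    if r < cutoff:
        code += format(r, "b").zfill(b - 1)
    else:
        code += format(r + cutoff, "b").zfill(b)
    return code


def encode_support(bits, m):
    """Concatenate the Golomb codewords of all zero-runs in `bits`."""
    return "".join(golomb_encode(n, m) for n in run_lengths(bits))
```

The remainder uses truncated binary coding, so the sketch also works when `m` is not a power of two. Both functions run in time linear in the input, which is in keeping with the low-complexity goal stated in the abstract.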

Paper Structure (6 sections, 1 theorem, 10 equations, 1 figure, 1 algorithm)

This paper contains 6 sections, 1 theorem, 10 equations, 1 figure, 1 algorithm.

Key Result

Theorem 1

For any $1\leq \ell\leq L-1$ and $N\to \infty$, encoding the run-lengths (RLs) of ${\bf s}_\ell[\mathcal{I}_\ell]$ using the proposed method yields a bit length no larger than the bound in Eq. (5). In particular, the optimal choice of the Golomb parameter $M_\ell$ that leads to this bound is the nearest integer to $(\ln 2) (1-\sum_{1\leq m\leq \ell} f_m({\bf x}))/f_\ell({\bf x})$.
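The parameter choice in Theorem 1 can be evaluated directly from a frequency vector. The sketch below assumes that $f_m({\bf x})$ denotes the empirical frequency of the $m$-th symbol in ${\bf x}$ and that the frequencies sum to one. The function name `golomb_parameter` and the clamp to a minimum of 1 are illustrative assumptions, not part of the theorem.

```python
import math


def golomb_parameter(freqs, ell):
    """Nearest integer to ln(2) * (1 - sum_{m <= ell} f_m) / f_ell.

    freqs: sequence of frequencies f_1, ..., f_L (assumed to sum to 1).
    ell:   1-based symbol index, restricted to 1 <= ell <= L - 1 as in Theorem 1.
    """
    L = len(freqs)
    if not 1 <= ell <= L - 1:
        raise ValueError("ell must satisfy 1 <= ell <= L - 1")
    f_ell = freqs[ell - 1]
    if f_ell <= 0:
        raise ValueError("f_ell must be positive")
    tail = 1.0 - sum(freqs[:ell])   # mass of the symbols with index above ell
    m = round(math.log(2) * tail / f_ell)
    return max(1, m)                # assumption: a Golomb parameter must be >= 1
```

For example, with frequencies (0.1, 0.2, 0.7) this gives $M_1 = 6$ and $M_2 = 2$. If $p$ is read as the frequency of symbol $\ell$ among the positions not occupied by symbols $1,\dots,\ell-1$, the expression equals $(\ln 2)(1-p)/p$, the familiar approximation for the Golomb parameter of geometrically distributed run-lengths. That reading is an interpretation, not a statement from the source.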

Figures (1)

  • Figure 1: (a) Bit length per symbol versus the data length $N$, with inputs drawn i.i.d. from the geometric distribution with parameter $0.33$ and with $L=50$. (b) Bit length per symbol versus the data length $N$ for the bimodal distribution. (c) Data compression performance for the bimodal distribution for varying $L$, with input length $N=10,000$.

Theorems & Definitions (2)

  • Theorem 1
  • proof