A Family of Low-Complexity Binary Codes with Constant Hamming Weights

Birenjith Sasidharan, Emanuele Viterbo, Son Hoang Dau

TL;DR

This work designs binary constant-weight codes with minimum distance $d=2$ and size $M=2^k$ that admit low-complexity encoding and decoding. It introduces anchor-decodable integer sequences to parameterize a family of codes, notably ${\cal C}[\ell]$ with blocklength $n=2^\ell$ and weight $w=\ell$, whose encoding is linear in $k$ and whose decoding is poly-logarithmic in $n$ (apart from parsing the input), with no binomial-coefficient computations. The paper shows that ${\cal C}[\ell]$ has the largest size among codes generated by anchor-decodable sequences, presents an alternate construction ${\cal \hat{C}}[\ell]$ that attains the same dimension $k_\ell$ for $3 \leq \ell \leq 7$, and derives further codes that cover a wider range of weights and blocklengths while keeping complexity low.

Abstract

In this paper, we focus on the design of binary constant weight codes that admit low-complexity encoding and decoding algorithms, and that have a size $M=2^k$. For every integer $\ell \geq 3$, we construct an $(n=2^\ell, M=2^{k_{\ell}}, d=2)$ constant weight code ${\cal C}[\ell]$ of weight $\ell$ by encoding information in the gaps between successive $1$'s. The code is associated with an integer sequence of length $\ell$ satisfying a constraint, termed anchor-decodability, that ensures low complexity for encoding and decoding. The complexity of encoding is linear in the input size $k$, and that of decoding is poly-logarithmic in the input size $n$, discounting the linear time spent on parsing the input. Neither algorithm requires the expensive computation of binomial coefficients, unlike many existing schemes. Among codes generated by all anchor-decodable sequences, we show that ${\cal C}[\ell]$ has the maximum size, with $k_{\ell} \geq \ell^2-\ell\log_2\ell + \log_2\ell - 0.279\ell - 0.721$. As $k$ is upper bounded by $\ell^2-\ell\log_2\ell +O(\ell)$ information-theoretically, the code ${\cal C}[\ell]$ is optimal in its size with respect to the two highest-order terms in $\ell$. In particular, $k_\ell$ meets the upper bound for $\ell=3$ and is one bit away from it for $\ell=4$. On the other hand, we show that ${\cal C}[\ell]$ is not unique in attaining $k_{\ell}$ by constructing an alternate code ${\cal \hat{C}}[\ell]$, again parameterized by an integer $\ell \geq 3$, with a different low-complexity decoder, yet having the same size $2^{k_{\ell}}$ when $3 \leq \ell \leq 7$. Finally, we also derive new codes by modifying ${\cal C}[\ell]$ that offer a wider range of blocklengths and weights while retaining low complexity for encoding and decoding. For certain selected values of the parameters, these modified codes also have an optimal $k$.

Paper Structure

This paper contains 22 sections, 13 theorems, 70 equations, 3 figures, 1 table, 8 algorithms.

Key Result

Proposition 2.1

Let $\ell \geq 3$ be an integer. Write $\ell = 2^a+b$, where $2^a$ is the largest power of $2$ not exceeding $\ell$ and $b\geq 0$. Then [...]. As a corollary, $k_{\ell} \ \geq \ \ell^2 - \ell \log_2\ell + \log_2 \ell - \ell(1-\tfrac{1}{2\ln 2}) - \tfrac{1}{2\ln 2}$ for every $\ell \geq 3$.
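
As a rough numerical check of the corollary, the sketch below evaluates its lower bound on $k_\ell$ next to the information-theoretic upper bound $\lfloor \log_2 \binom{2^\ell}{\ell} \rfloor$, which holds for any constant weight code of size $2^k$ with $n=2^\ell$ and $w=\ell$. The function names `lower_bound` and `upper_bound` are illustrative and not taken from the paper. The binomial coefficient is used here only to evaluate the bound; the paper's encoder and decoder avoid such computations.

```python
import math

def lower_bound(ell):
    """Corollary of Proposition 2.1: a real-valued lower bound on k_ell."""
    c = 1 / (2 * math.log(2))  # about 0.721
    return ell**2 - ell * math.log2(ell) + math.log2(ell) - ell * (1 - c) - c

def upper_bound(ell):
    """floor(log2(C(2**ell, ell))), computed exactly on integers."""
    return math.comb(2**ell, ell).bit_length() - 1

# k_ell is an integer, so the lower bound can be rounded up.
for ell in range(3, 9):
    print(ell, math.ceil(lower_bound(ell)), upper_bound(ell))
```

For $\ell=3$ the rounded lower bound and the upper bound are both 5, and for $\ell=4$ they are 9 and 10, in line with the abstract's statement that $k_\ell$ meets the upper bound for $\ell=3$ and is one bit away for $\ell=4$.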

Figures (3)

  • Figure 1: Illustration of the encoding process when $\ell=4$ and the message vector ${\bf{x}}=(1,0,1,0,1,1,1,0,0)$ is encoded into the codeword ${\bf{c}}$ of length $16=2^4$ (represented by the circle) with $c[1]=c[2]=c[10]=c[14]=1$. For decoding, one first determines the anchor (the underlined 1), which is the 1 that has the largest number of consecutive zeros on its left (cyclically), or equivalently, has the largest gap to the nearest 1 on its left. Once the anchor is found, each message block can be recovered by counting the number of 0's between the current 1 and the next.
  • Figure 2: Illustration of the principle of the decoding algorithm for $\ell=7, r=2$ when the codeword ${\bf c}$ has $1$'s at $c[j], j =10, 26, 32, 37, 64, 96, 127$ (marked with dots) and $0$'s everywhere else. There are three clockwise stretches of gaps, marked as $\textcircled{a}$, $\textcircled{b}$ and $\textcircled{c}$, that end in a candidate gap, i.e., a gap with value at least $16$. The stretch $\textcircled{c}$, given by $(5,4,26)$, is unique among these three because no gap in $\textcircled{c}$ other than the last one qualifies as a candidate. The bit $c[64]$ at the end of the stretch $\textcircled{c}$ is therefore picked as the anchor bit.
  • Figure 3: Comparison of $k_{\ell}$ and $\hat{k}_{\ell}$ against the upper bound as $\ell$ varies.
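
The two captions above can be replayed on their own examples with a minimal sketch. Several things here are assumptions read off the captions rather than stated on this page: positions are indexed from 0, the Figure 1 message splits into blocks of 4, 2, 2 and 1 bits (the anchor position followed by three gaps), and for Figure 2 a stretch consists of $r+1$ consecutive gaps with candidate threshold 16. All function names are hypothetical, and the decoder is a plain linear scan, not the paper's poly-logarithmic algorithm.

```python
def encode(bits, block_sizes, n):
    """First block: anchor position. Later blocks: zeros before the next 1."""
    values, start = [], 0
    for size in block_sizes:
        values.append(int("".join(map(str, bits[start:start + size])), 2))
        start += size
    codeword = [0] * n
    pos = values[0]
    codeword[pos] = 1
    for gap in values[1:]:
        pos = (pos + gap + 1) % n  # skip `gap` zeros clockwise, then place a 1
        codeword[pos] = 1
    return codeword

def left_gaps(codeword):
    """(position, number of zeros on its left, cyclically) for every 1."""
    n = len(codeword)
    ones = [i for i, c in enumerate(codeword) if c == 1]
    return [(p, (p - ones[i - 1] - 1) % n) for i, p in enumerate(ones)]

def decode(codeword, block_sizes):
    """Figure 1 rule: the anchor is the 1 with the largest gap on its left."""
    gaps = left_gaps(codeword)
    k = max(range(len(gaps)), key=lambda i: gaps[i][1])
    rotated = gaps[k:] + gaps[:k]
    values = [rotated[0][0]] + [g for _, g in rotated[1:]]
    bits = []
    for value, size in zip(values, block_sizes):
        bits.extend(int(b) for b in format(value, f"0{size}b"))
    return bits

def stretch_anchors(codeword, r, threshold):
    """Figure 2 rule (assumed form): a 1 whose left gap is a candidate
    while the r gaps before that one are not candidates."""
    gaps = left_gaps(codeword)
    m = len(gaps)
    return [pos for i, (pos, g) in enumerate(gaps)
            if g >= threshold
            and all(gaps[(i - j) % m][1] < threshold for j in range(1, r + 1))]

# Figure 1 example
x = [1, 0, 1, 0, 1, 1, 1, 0, 0]
c = encode(x, [4, 2, 2, 1], 16)
print([i for i, b in enumerate(c) if b])  # [1, 2, 10, 14]
print(decode(c, [4, 2, 2, 1]) == x)       # True

# Figure 2 example
c2 = [0] * 128
for p in (10, 26, 32, 37, 64, 96, 127):
    c2[p] = 1
print(stretch_anchors(c2, 2, 16))         # [64]
```

With the assumed block sizes, the largest total of the three encoded gaps is $3+3+1=7$, so the gap to the left of the anchor is at least $16-4-7=5$ and always exceeds every other gap, which is why the largest-gap rule identifies the anchor in the Figure 1 example.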

Theorems & Definitions (30)

  • Definition 1
  • Proposition 2.1
  • proof
  • Definition 2
  • Definition 3
  • Definition 4
  • Lemma 2.2
  • proof
  • Theorem 2.3
  • Definition 5
  • ...and 20 more