Learning Linear Block Error Correction Codes

Yoni Choukroun; Lior Wolf

TL;DR

This work tackles the challenge of jointly designing binary linear block codes and their neural decoders for short block lengths by proposing an end-to-end framework that optimizes both the code and a Transformer-based decoder in a differentiable fashion over $GF(2)$. It introduces a differentiable encoding approach with polarization-based binarization and a differentiable parity-check–driven masking for attention, enabling gradient flow through the entire encoding–decoding pipeline. Empirically, the learned codes outperform conventional codes and prior neural decoders, and also deliver improved performance when used with traditional decoders, suggesting a broadly useful approach to code design. The proposed method has practical significance for efficient ECC on edge devices and points toward new families of codes obtained via joint optimization with their decoders.

Abstract

Error correction codes are a crucial part of the physical communication layer, ensuring the reliable transfer of data over noisy channels. The design of optimal linear block codes capable of being efficiently decoded is of major concern, especially for short block lengths. While neural decoders have recently demonstrated their advantage over classical decoding techniques, the neural design of the codes remains a challenge. In this work, we propose for the first time a unified encoder-decoder training of binary linear block codes. To this end, we adapt the coding setting to support efficient and differentiable training of the code for end-to-end optimization over the order two Galois field. We also propose a novel Transformer model in which the self-attention masking is performed in a differentiable fashion for the efficient backpropagation of the code gradient. Our results show that (i) the proposed decoder outperforms existing neural decoding on conventional codes, (ii) the suggested framework generates codes that outperform the analogous conventional codes, and (iii) the codes we developed not only excel with our decoder but also show enhanced performance with traditional decoding techniques.
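For readers less familiar with the coding setting the abstract refers to, the following is a minimal sketch of a binary linear block code over the order two Galois field. It is not the paper's learned code or training procedure: it uses a fixed Hamming(7,4) code, and the parity block `P`, the helper names, and the convention $G = [I_k \mid P]$, $H = [P^T \mid I_{n-k}]$ are assumptions chosen for illustration. The sketch shows that codewords have a zero syndrome and that a single bit flip produces a nonzero one.

```python
import numpy as np

def standard_form(P):
    """Build systematic G = [I_k | P] and H = [P^T | I_(n-k)] (assumed convention)."""
    k, r = P.shape
    G = np.concatenate([np.eye(k, dtype=np.int64), P], axis=1)
    H = np.concatenate([P.T, np.eye(r, dtype=np.int64)], axis=1)
    return G, H

def encode(message, G):
    """Encode a length-k message; all arithmetic is modulo 2, i.e. over GF(2)."""
    return (message @ G) % 2

def syndrome(H, word):
    """Syndrome H w (mod 2); it is all-zero exactly when w is a codeword."""
    return (H @ word) % 2

# Parity block of one common systematic Hamming(7,4) code (illustrative choice).
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]], dtype=np.int64)
G, H = standard_form(P)

message = np.array([1, 0, 1, 1], dtype=np.int64)
codeword = encode(message, G)

# Flip one bit to imitate a single channel error.
corrupted = codeword.copy()
corrupted[2] ^= 1

print("codeword:", codeword)
print("syndrome of codeword:", syndrome(H, codeword))
print("syndrome after one bit flip:", syndrome(H, corrupted))
```

With this convention $G H^T = P + P = 0$ over GF(2), which is why every encoded message passes the parity checks. The syndrome of the corrupted word equals the column of `H` at the flipped position.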

Paper Structure (25 sections, 12 equations, 9 figures, 5 tables)

Introduction
Related Works
Background
Coding
Transformers for Error Correction Code
Method
End-to-End Optimization
Optimization over $GF(2)$
Differentiable Masking
Architecture
Training
Experiments
Ablation Study and Analysis
Performance with Belief Propagation
Parity-check Matrix Visualization
...and 10 more sections

Figures (9)

Figure 1: Illustration of the proposed end-to-end communication system. Our work focuses on the unified design and co-training of the code induced by $\Omega$ and of the parameterized decoder $f_{\theta}$.
Figure 2: For the Hamming(7,4) code: (a) the Tanner graph, (b) the proposed differentiable masking for the standardized version of the code.
Figure 3: Illustration of the proposed architecture. The main contributions are represented with dashed lines.
Figure 4: For an $N=2$ layer DC-ECCT (first and second rows) and a (31,16) code: (a) self-attention layer, (b) connectivity mapping $\psi_{\gamma}$, (c) the corresponding filtered mask $\psi_{\gamma}(g(H_{\Omega}))$, (d) the obtained soft masked self-attention. The self-attention maps have been averaged over the heads dimension.
Figure 5: The original parity-check matrix (PCM) of (a) BCH(31,16), (d) POLAR(32,11) and their standard forms in (b) and (e), respectively. The third column corresponds to the learned parity-check matrices of the corresponding code length and rate. The PCM sparsity is (a) $25\%$, (b) $30\%$, (c) $16\%$, (d) $31\%$, (e) $17\%$, and (f) $15\%$.
...and 4 more figures
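
The caption of Figure 2 mentions the Tanner graph of the Hamming(7,4) code. As a rough illustration of that object only, the sketch below lists the edges of the Tanner graph implied by a parity-check matrix: check node $i$ is connected to variable node $j$ whenever $H_{ij} = 1$. The matrix `H` is one common systematic Hamming(7,4) parity-check matrix and may differ from the one drawn in the figure; the helper names are hypothetical, and the code does not reproduce the paper's differentiable masking.

```python
# One common systematic Hamming(7,4) parity-check matrix (assumed for illustration).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def tanner_edges(H):
    """Return (check, variable) pairs, one per nonzero entry of H."""
    return [(i, j) for i, row in enumerate(H) for j, bit in enumerate(row) if bit]

def shares_check(H):
    """n x n table: True where two variable nodes take part in a common parity check."""
    n = len(H[0])
    return [[any(row[a] and row[b] for row in H) for b in range(n)] for a in range(n)]

edges = tanner_edges(H)
shared = shares_check(H)

# Degree of each variable node = number of checks it participates in.
variable_degree = [sum(1 for _, j in edges if j == v) for v in range(len(H[0]))]

print("edges:", edges)
print("variable degrees:", variable_degree)
print("bits 0 and 1 share a check:", shared[0][1])
print("bits 4 and 5 share a check:", shared[4][5])
```

The `shares_check` table is a hard, binary notion of which bits are related through the code; it is shown here only to make the graph structure concrete.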
