On the Optimality of Single-label and Multi-label Neural Network Decoders
Yunus Can Gültekin, Péter Scheepers, Yuncheng Yuan, Federico Corradi, Alex Alvarado
TL;DR
This work asks whether single-label NN (SLNN) and multi-label NN (MLNN) decoders can achieve maximum likelihood (ML) decoding with simple architectures. It gives constructive proofs (Theorems 1 and 2) of optimal, training-free NN designs whose binary weights are defined by the codebook, and validates them numerically on short codes: Hamming $(n,k)=(7,4)$, polar $(16,8)$, and BCH $(31,21)$, for which codeword-wise and bit-wise ML performance is demonstrated. The key contributions are explicit binary-weight architectures that realize ML decoding with lower complexity than the SLNN and MLNN architectures proposed in the literature, and a scalability discussion showing that the curse of dimensionality hinders extension to longer codes. The results imply that ML decoding with these architectures is achievable for short codes but not practical for medium and long codes.
Abstract
We investigate the design of two neural network (NN) architectures recently proposed as decoders for forward error correction: the so-called single-label NN (SLNN) and multi-label NN (MLNN) decoders. These decoders have been reported to achieve near-optimal codeword- and bit-wise performance, respectively. Results in the literature show near-optimality for a variety of short codes. In this paper, we analytically prove that certain SLNN and MLNN architectures can always realize optimal decoding, regardless of the code. These optimal architectures and their binary weights are shown to be defined by the codebook, i.e., no training or network optimization is required. Our proposed architectures are in fact not NNs, but a different way of implementing the maximum likelihood decoding rule. Optimal performance is numerically demonstrated for Hamming $(7,4)$, polar $(16,8)$, and BCH $(31,21)$ codes. The results show that our optimal architectures are less complex than the SLNN and MLNN architectures proposed in the literature, which achieve only near-optimal performance. Extension to longer codes is still hindered by the curse of dimensionality. Therefore, even though SLNN and MLNN can perform maximum likelihood decoding, such architectures cannot be used for medium and long codes.
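To make the idea of codebook-defined binary weights concrete, the following is a minimal sketch of exhaustive ML decoding for the Hamming $(7,4)$ code. It is not the authors' exact architecture. It assumes BPSK mapping ($0 \to +1$, $1 \to -1$), an AWGN channel with noise variance `sigma2`, equiprobable messages, and one common systematic generator matrix `G`. The function names and the matrix `W` are illustrative. The weight matrix has one row per codeword, so its size grows as $2^k$, which is the curse of dimensionality mentioned above.

```python
import itertools
import numpy as np

# One common systematic generator matrix for Hamming (7,4) (assumed choice).
G = np.array([
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
])

def build_codebook(G):
    """Enumerate all 2^k messages and their codewords (rows)."""
    k = G.shape[0]
    msgs = np.array(list(itertools.product([0, 1], repeat=k)))
    codebook = (msgs @ G) % 2
    return msgs, codebook

def ml_codeword_decode(y, msgs, codebook):
    """Codeword-wise ML: pick the codeword with the largest correlation."""
    W = 1 - 2 * codebook          # binary (+1/-1) weights fixed by the codebook
    scores = W @ y                # one score per codeword, no training involved
    return msgs[np.argmax(scores)]

def bitwise_ml_decode(y, msgs, codebook, sigma2):
    """Bit-wise ML: sum codeword likelihoods per message-bit value."""
    W = 1 - 2 * codebook
    logits = (W @ y) / sigma2     # log-likelihoods up to a common constant
    p = np.exp(logits - logits.max())   # shift for numerical stability
    p1 = msgs.T @ p               # mass of codewords whose message bit is 1
    p0 = (1 - msgs).T @ p         # mass of codewords whose message bit is 0
    return (p1 > p0).astype(int)
```

Given a received vector `y` of length 7, `ml_codeword_decode(y, *build_codebook(G))` returns the 4 message bits of the most likely codeword, while `bitwise_ml_decode` decides each message bit separately. The two rules can disagree at low signal-to-noise ratios, because bit-wise decisions need not form a valid message-codeword pair.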
