Generalization Bounds for Neural Belief Propagation Decoders
Sudarshan Adiga, Xin Xiao, Ravi Tandon, Bane Vasic, Tamal Bose
TL;DR
This work addresses the lack of theoretical generalization guarantees for Neural Belief Propagation (NBP) decoders used on LDPC-type codes. It develops a PAC-learning framework based on bit-wise Rademacher complexity and covering numbers to bound the generalization gap in bit-error-rate (BER) as an explicit function of the training set size $m$, the number of decoding iterations $T$, and the code parameters $(n,k,d_v,d_c)$. The analysis extends to irregular parity-check matrices and accounts for channel SNR through a bound on the input LLRs, with a separate treatment of unbounded LLRs. Experimental results on Tanner and QC-LDPC codes corroborate the theory: the generalization gap decreases with $m$ and increases with $T$ and blocklength, which provides practical guidance for dataset and code design in ML-based decoders.
Abstract
Machine-learning-based approaches are increasingly used to design decoders for next-generation communication systems. One widely used framework is neural belief propagation (NBP), which unfolds the belief propagation (BP) iterations into a deep neural network whose parameters are trained in a data-driven manner. NBP decoders have been shown to improve upon classical decoding algorithms. In this paper, we investigate the generalization capabilities of NBP decoders. Specifically, the generalization gap of a decoder is the difference between its empirical and expected bit-error-rates (BER). We present new theoretical results that bound this gap and show its dependence on the decoder complexity, in terms of code parameters (blocklength, message length, variable/check node degrees), the number of decoding iterations, and the training dataset size. Results are presented for both regular and irregular parity-check matrices. To the best of our knowledge, this is the first set of theoretical results on the generalization performance of neural-network-based decoders. We present experimental results that show the dependence of the generalization gap on the training dataset size and the number of decoding iterations for different codes.
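
To make the definition of the generalization gap concrete, the following is a minimal sketch of how the gap could be estimated numerically. It is not the decoder or the experimental setup of this paper: the hard-decision rule stands in for a trained NBP decoder, the all-zero codeword over a BPSK/AWGN channel is an assumed test setup, the expected BER is approximated by a large Monte Carlo sample, and the names `bit_error_rate`, `hard_decision_decoder`, and `simulate`, the noise level `sigma`, the absolute-value sign convention, and the sample sizes are all hypothetical choices made for illustration.

```python
import random


def bit_error_rate(decoded, transmitted):
    """Fraction of bit positions where decoded words differ from transmitted ones."""
    errors = 0
    total = 0
    for d_word, t_word in zip(decoded, transmitted):
        errors += sum(d != t for d, t in zip(d_word, t_word))
        total += len(t_word)
    return errors / total


def hard_decision_decoder(llrs):
    """Placeholder decoder (NOT NBP): decide bit 1 when the LLR is negative."""
    return [1 if llr < 0 else 0 for llr in llrs]


def simulate(num_words, n, sigma, rng):
    """Send all-zero words of length n over BPSK/AWGN and decode them.

    Returns (decoded, transmitted) as lists of bit lists.
    """
    transmitted = [[0] * n for _ in range(num_words)]
    decoded = []
    for _ in range(num_words):
        # BPSK maps bit 0 to +1; the channel LLR is 2 * y / sigma^2.
        llrs = [2.0 * (1.0 + rng.gauss(0.0, sigma)) / sigma**2 for _ in range(n)]
        decoded.append(hard_decision_decoder(llrs))
    return decoded, transmitted


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Empirical BER on a small "training" sample of m = 50 words (assumed size).
train_ber = bit_error_rate(*simulate(50, 16, 0.8, rng))

# Monte Carlo proxy for the expected BER, using a much larger sample.
expected_ber = bit_error_rate(*simulate(5000, 16, 0.8, rng))

# Generalization gap; taking the absolute value is an assumed convention.
gap = abs(train_ber - expected_ber)
print(f"train BER={train_ber:.4f}  expected BER~{expected_ber:.4f}  gap~{gap:.4f}")
```

Because the placeholder decoder has no trainable parameters, the gap here reflects only sampling variation; for a trained decoder, the small sample would be the training set itself, and rerunning with a larger sample in place of 50 gives a rough empirical view of how the gap varies with $m$.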
