CrossMPT: Cross-attention Message-Passing Transformer for Error Correcting Codes
Seong-Joon Park, Hee-Youl Kwak, Sang-Hyo Kim, Yongjune Kim, Jong-Seon No
TL;DR
CrossMPT tackles ECC decoding by separating magnitude reliabilities and syndrome information and updating them via two masked cross-attention blocks guided by the parity-check matrix. This cross-attention message-passing design emulates classic BP-style information exchange while leveraging transformer architectures. Empirical results show CrossMPT surpasses ECCT and BP-based neural decoders across BCH, polar, LDPC, and turbo codes, with substantial reductions in memory usage, FLOPs, and training/inference time, and strong performance on long codes. The work notably brings transformer-based ECC decoders closer to ML performance on short codes and extends their practical scalability to longer codes.
Abstract
Error correcting codes (ECCs) are indispensable for reliable transmission in communication systems. Recent advances in deep learning have catalyzed the exploration of ECC decoders based on neural networks. Among these, transformer-based neural decoders have achieved state-of-the-art decoding performance. In this paper, we propose a novel Cross-attention Message-Passing Transformer (CrossMPT), which shares key operational principles with conventional message-passing decoders. While conventional transformer-based decoders employ a self-attention mechanism without distinguishing between the types of input vectors (i.e., magnitude and syndrome vectors), CrossMPT updates the two types of input vectors separately and iteratively using two masked cross-attention blocks. The mask matrices are determined by the code's parity-check matrix, which explicitly identifies the unrelated pairs of entries between the two input vectors. Our experimental results show that CrossMPT significantly outperforms existing neural network-based decoders for various code classes. Notably, CrossMPT achieves this improvement in decoding performance while significantly reducing memory usage, complexity, inference time, and training time.
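The two-block update described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation. It assumes a (7,4) Hamming parity-check matrix as the example code, a BPSK mapping of bit 0 to +1 and bit 1 to -1, scalar features in place of learned embeddings, and attention without learned projections, multiple heads, normalization, or feed-forward layers. The function names (`split_inputs`, `masks_from_parity_check`, `masked_cross_attention`) are hypothetical and chosen for illustration. The order of the two updates is also an assumption.

```python
import math

# Parity-check matrix of the (7,4) Hamming code, used here only as a small example.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def split_inputs(y, H):
    """Split a received word into a magnitude vector and a syndrome vector."""
    magnitude = [abs(v) for v in y]
    hard_bits = [1 if v < 0 else 0 for v in y]  # assumed mapping: 0 -> +1, 1 -> -1
    syndrome = [sum(h * b for h, b in zip(row, hard_bits)) % 2 for row in H]
    return magnitude, syndrome

def masks_from_parity_check(H):
    """Build both masks; allowed[i][j] is True when query i may attend to key j."""
    m, n = len(H), len(H[0])
    # Magnitude queries attending to syndrome keys follow the pattern of H transposed.
    mag_to_syn = [[H[j][i] == 1 for j in range(m)] for i in range(n)]
    # Syndrome queries attending to magnitude keys follow the pattern of H.
    syn_to_mag = [[H[j][i] == 1 for i in range(n)] for j in range(m)]
    return mag_to_syn, syn_to_mag

def masked_cross_attention(queries, keys, values, allowed):
    """Toy single-head attention over scalar features, with no learned weights."""
    out = []
    for q, row in zip(queries, allowed):
        # Masked-out pairs get a score of -inf, so their softmax weight is zero.
        scores = [q * k if ok else -math.inf for k, ok in zip(keys, row)]
        top = max(scores)
        if top == -math.inf:  # a query with no allowed key receives a zero update
            out.append(0.0)
            continue
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, values)) / total)
    return out

y = [0.9, -0.2, 1.1, 0.8, -1.3, 0.7, 1.0]  # made-up noisy channel output
magnitude, syndrome = split_inputs(y, H)
mag_to_syn, syn_to_mag = masks_from_parity_check(H)

# One iteration: update the magnitudes from the syndromes, then the
# syndromes from the updated magnitudes.
syndrome_feat = [float(s) for s in syndrome]
new_magnitude = masked_cross_attention(magnitude, syndrome_feat, syndrome_feat, mag_to_syn)
new_syndrome = masked_cross_attention(syndrome_feat, new_magnitude, new_magnitude, syn_to_mag)
```

Each bit position attends only to the parity checks it takes part in, and each check attends only to the bits it covers. This is the same connectivity that a message-passing decoder uses on the code's Tanner graph.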
