CrossMPT: Cross-attention Message-Passing Transformer for Error Correcting Codes

Seong-Joon Park, Hee-Youl Kwak, Sang-Hyo Kim, Yongjune Kim, Jong-Seon No

TL;DR

CrossMPT tackles ECC decoding by separating magnitude reliabilities and syndrome information and updating them via two masked cross-attention blocks guided by the parity-check matrix. This cross-attention message-passing design emulates classic BP-style information exchange while leveraging transformer architectures. Empirical results show CrossMPT surpasses ECCT and BP-based neural decoders across BCH, polar, LDPC, and turbo codes, with substantial reductions in memory usage, FLOPs, and training/inference time, and strong performance on long codes. The work notably brings transformer-based ECC decoders closer to maximum-likelihood (ML) decoding performance on short codes and extends their practical scalability to longer codes.
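
As a rough illustration of the two input types mentioned above, the sketch below splits a received vector into a magnitude (reliability) vector and a binary syndrome. It is not the authors' code: the (7,4) Hamming parity-check matrix, the BPSK mapping (bit 0 to +1, bit 1 to -1), the sign-based hard decision, and the helper name `magnitude_and_syndrome` are all assumptions made for this example.

```python
import numpy as np

# Parity-check matrix of the (7,4) Hamming code, used only as a small example.
H = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
], dtype=np.int64)


def magnitude_and_syndrome(y, H):
    """Split a received real vector into reliabilities and a binary syndrome.

    Hypothetical helper: assumes BPSK with bit 0 -> +1 and bit 1 -> -1.
    """
    magnitude = np.abs(y)                     # per-bit reliability
    hard_bits = (y < 0).astype(np.int64)      # hard decision from the sign
    syndrome = (H @ hard_bits) % 2            # one entry per parity check
    return magnitude, syndrome


# All-zero codeword sent as +1s; the bit at index 2 arrives with the wrong sign.
y = np.array([0.9, 1.1, -0.2, 0.8, 1.3, 0.7, 1.0])
magnitude, syndrome = magnitude_and_syndrome(y, H)
print(magnitude)
print(syndrome)  # [1 1 0], i.e. column 2 of H
```

A nonzero syndrome signals that the hard decision violates at least one parity check, while the small magnitude at index 2 marks that position as the least reliable one.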

Abstract

Error correcting codes (ECCs) are indispensable for reliable transmission in communication systems. Recent advances in deep learning have catalyzed the exploration of ECC decoders based on neural networks. Among these, transformer-based neural decoders have achieved state-of-the-art decoding performance. In this paper, we propose a novel Cross-attention Message-Passing Transformer (CrossMPT), which shares key operational principles with conventional message-passing decoders. While conventional transformer-based decoders employ a self-attention mechanism without distinguishing between the types of input vectors (i.e., magnitude and syndrome vectors), CrossMPT updates the two types of input vectors separately and iteratively using two masked cross-attention blocks. The mask matrices are determined by the code's parity-check matrix, which explicitly indicates which elements of the two input vectors are unrelated to each other. Our experimental results show that CrossMPT significantly outperforms existing neural network-based decoders for various code classes. Notably, CrossMPT achieves this improvement in decoding performance while significantly reducing memory usage, complexity, inference time, and training time.
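
The alternating update described in the abstract can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the CrossMPT implementation: it uses a single attention head, omits the learned query/key/value projections, normalization, and feed-forward layers, and feeds random embeddings. The function name `masked_cross_attention`, the embedding size, and the choice of the transposed parity-check matrix as the mask for the magnitude update are assumptions made for illustration.

```python
import numpy as np

# (7,4) Hamming parity-check matrix, chosen only as a small example.
H = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
], dtype=np.int64)


def masked_cross_attention(queries, keys_values, allowed):
    """Single-head cross-attention without learned projections (assumption).

    allowed[i, j] == True lets query i attend to key j. Every row of
    `allowed` must contain at least one True, or the softmax is undefined.
    """
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    scores = np.where(allowed, scores, -np.inf)          # mask unrelated pairs
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ keys_values, weights


rng = np.random.default_rng(0)
n_checks, n_bits = H.shape
d_model = 8                                   # hypothetical embedding size
mag_emb = rng.normal(size=(n_bits, d_model))  # stand-in magnitude embeddings
syn_emb = rng.normal(size=(n_checks, d_model))  # stand-in syndrome embeddings

bits_to_checks = H.T.astype(bool)   # shape (n, n-k): bit i sees check j if H[j, i] == 1
checks_to_bits = H.astype(bool)     # shape (n-k, n)

for _ in range(2):  # two illustrative iterations
    # Magnitude embeddings query the syndrome embeddings.
    update, w_mag = masked_cross_attention(mag_emb, syn_emb, bits_to_checks)
    mag_emb = mag_emb + update       # residual update
    # Syndrome embeddings query the freshly updated magnitude embeddings.
    update, w_syn = masked_cross_attention(syn_emb, mag_emb, checks_to_bits)
    syn_emb = syn_emb + update

print(w_mag.shape, w_syn.shape)
```

Each attention matrix has only as many nonzero entries as the parity-check matrix has ones, which loosely mirrors how bit and check nodes exchange messages only along the edges of the code's Tanner graph.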

Paper Structure

This paper contains 26 sections, 8 equations, 13 figures, and 5 tables.

Figures (13)

  • Figure 1: The PCM and the mask matrices of ECCT and CrossMPT.
  • Figure 2: Architecture of CrossMPT.
  • Figure 3: Conceptual comparison of the sum-product message-passing algorithm and the proposed cross-attention (CA) message-passing algorithm.
  • Figure 4: The BER performance of various decoders (BP, Hyp BP, AR BP, ECCT) and CrossMPT.
  • Figure 5: The average attention scores of all $N=6$ layers for ECCT and CrossMPT.
  • ...and 8 more figures