Highly Efficient Parallel Row-Layered Min-Sum MDPC Decoder for McEliece Cryptosystem

Jiaxuan Cai, Xinmiao Zhang

TL;DR

This work tackles the memory and latency bottlenecks of Min-sum MDPC decoding for the McEliece post-quantum cryptosystem by combining row-layered scheduling, a scheme that mitigates finite-precision performance loss, and a dynamic L-parallel decoder architecture. It adds constraints to the construction of the MDPC H matrix to enable efficient L×L parallelism while preserving decoding performance and cryptographic security, and it uses a dynamic H-division scheme to achieve near-linear speedup with little additional memory. The proposed 2-parallel design requires about 26% less memory and about 70% less latency than the prior sliced message-passing (Sliced-MP) decoder, with minimal impact on the frame error rate (FER) for moderate L and strong resistance to reaction attacks. Overall, the paper demonstrates a practical path to highly parallel, memory-efficient MDPC decoders suitable for PQC standardization and real-world deployment.

Abstract

The medium-density parity-check (MDPC) code-based McEliece cryptosystem remains a finalist for the post-quantum cryptography standard. The Min-sum decoding algorithm achieves a better performance-complexity tradeoff than other algorithms for MDPC codes. However, the prior Min-sum MDPC decoder requires large memories, which dominate its overall complexity, and its achievable parallelism is limited. This paper makes four contributions: (1) for the first time, the row-layered scheduling scheme is exploited to substantially reduce the memory requirement of MDPC decoders; (2) a low-complexity scheme is developed to mitigate the performance loss caused by the finite-precision representation of the messages and the high column weights of MDPC codes in row-layered decoding; (3) constraints are added to the parity check matrix construction to enable effective parallel processing with negligible impact on decoder performance and resilience against attacks; (4) a novel parity check matrix division scheme for highly efficient parallel processing is proposed, and the corresponding parallel row-layered decoder architecture is designed. The proposed L-parallel decoder reduces the number of clock cycles per decoding iteration by a factor of L with very small memory overhead. For an example 2-parallel decoder, the proposed design requires 26% less memory and achieves a 70% latency reduction compared to the prior decoder.
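
To make the row-layered schedule mentioned in the abstract concrete, the following is a minimal sketch of textbook row-layered normalized Min-sum decoding, in which each row of H is processed in turn and the posterior values are updated immediately. It is not the paper's decoder: the function name `layered_min_sum`, the scaling factor `alpha = 0.75`, the floating-point messages, and the toy (7,4) Hamming matrix `H_ROWS` are all assumptions chosen for illustration, and the sketch omits the finite-precision mitigation, the H-matrix constraints, and the L-parallel processing that the paper proposes.

```python
# Toy parity check matrix (a (7,4) Hamming code, NOT an MDPC code),
# stored as the column indices of the nonzero entries in each row.
H_ROWS = [[0, 1, 3, 4], [0, 2, 3, 5], [1, 2, 3, 6]]


def layered_min_sum(h_rows, received, max_iters=30, alpha=0.75):
    """Row-layered normalized Min-sum decoding of a hard-decision word.

    Returns (hard_decisions, success_flag, iterations_used).
    Assumes every row has at least two nonzero entries.
    """
    # Posterior values: +1 for a received 0, -1 for a received 1.
    q = [1.0 - 2.0 * bit for bit in received]
    # Check-to-variable messages, one per nonzero entry of H.
    r = [[0.0] * len(cols) for cols in h_rows]
    hard = list(received)

    for iteration in range(1, max_iters + 1):
        for i, cols in enumerate(h_rows):  # one layer = one row of H
            # Variable-to-check messages: remove this row's old contribution.
            t = [q[j] - r[i][k] for k, j in enumerate(cols)]
            mags = [abs(v) for v in t]
            k_min = min(range(len(t)), key=mags.__getitem__)
            min1 = mags[k_min]
            min2 = min(m for k, m in enumerate(mags) if k != k_min)
            sign_prod = 1
            for v in t:
                if v < 0:
                    sign_prod = -sign_prod
            # New check-to-variable messages and immediate posterior update.
            for k, j in enumerate(cols):
                mag = min2 if k == k_min else min1
                sign = -sign_prod if t[k] < 0 else sign_prod
                r[i][k] = alpha * sign * mag
                q[j] = t[k] + r[i][k]
        hard = [1 if v < 0 else 0 for v in q]
        # Stop as soon as all parity checks are satisfied.
        if all(sum(hard[j] for j in cols) % 2 == 0 for cols in h_rows):
            return hard, True, iteration
    return hard, False, max_iters


# Usage: the all-zero codeword with bit 3 flipped.
decoded, ok, iters = layered_min_sum(H_ROWS, [0, 0, 0, 1, 0, 0, 0])
```

In this generic formulation only one posterior value per column is kept, and each row's messages are fully determined by `min1`, `min2`, the index of the minimum, and the signs. That property is commonly cited as the source of memory savings in layered decoders; whether the paper's architecture stores messages in exactly this form is not stated in the abstract.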

Paper Structure

This paper contains 17 sections, 2 equations, 13 figures, 6 tables, 1 algorithm.

Figures (13)

  • Figure 1: The $\mathbf H$ matrix and the corresponding Tanner graph of a toy MDPC code.
  • Figure 2: The encryption and decryption process of the McEliece cryptosystem based on QC-MDPC codes.
  • Figure 3: (a) Decoding FER and (b) average number of decoding iterations of sliced message-passing Min-sum, row-layered Min-sum, and REMP-2 BF decoding for an MDPC code with $(n_0,r,w)=(2, 4801,45)$ and $I_{\text{max}}=30$.
  • Figure 4: Example parity check matrices of (a) an MDPC code used in the McEliece cryptosystem; (b) an LDPC code used for error correction in digital communication and storage systems.
  • Figure 5: Top-level architecture of row-layered Min-sum MDPC decoder.
  • ...and 8 more figures