A Node-Based Polar List Decoder with Frame Interleaving and Ensemble Decoding Support
Yuqing Ren, Leyu Zhang, Ludovic Damien Blanc, Yifei Shen, Xinwei Li, Alexios Balatsoukas-Stimming, Chuan Zhang, Andreas Burg
TL;DR
This work tackles latency and efficiency bottlenecks in node-based SCL polar decoders by introducing a frame-interleaving architecture that time-shares the SCU and NPU to decode two frames concurrently. It augments the design with two dynamic stall-reduction strategies (S1 and S2) and an online instruction generator for SR/basic nodes, enabling rate-flexible operation without offline instruction storage. Graph-ensemble decoding over permuted factor graphs is integrated in Modes II/III to improve error-correcting performance at a modest throughput cost. The proposed 28 nm FD-SOI ASIC achieves a throughput of $3.34$ Gbps at $692$ MHz for UL-$(1024,512)$ with an area efficiency of $2.62$ Gbps/mm$^2$, improves on baseline decoders in both throughput and energy efficiency, and supports all 5G NR polar codes. Overall, the framework delivers a flexible, high-throughput, area- and energy-efficient solution for low-latency polar decoding in 5G.
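The benefit of time-sharing two units between two frames can be illustrated with a toy scheduling model. This is a minimal sketch and not the decoder's actual schedule: it assumes every decoding step occupies a first unit for one cycle and then a second unit for one cycle, that each unit serves one frame per cycle, and that no data-dependent stalls occur. The function name `schedule_length` and the unit labels `A` and `B` are hypothetical.

```python
def schedule_length(num_frames: int, num_steps: int) -> int:
    """Cycles needed to finish `num_frames` frames on two shared units.

    Toy assumption (not taken from the paper): each of the `num_steps`
    steps of a frame uses unit A for one cycle, then unit B for one cycle.
    """
    unit_free = {"A": 0, "B": 0}      # first cycle at which each unit is idle
    frame_ready = [0] * num_frames    # first cycle at which each frame can proceed
    for _ in range(num_steps):
        for unit in ("A", "B"):
            for f in range(num_frames):
                # A frame starts on a unit once both the unit and the frame are ready.
                start = max(unit_free[unit], frame_ready[f])
                unit_free[unit] = start + 1
                frame_ready[f] = start + 1
    return max(frame_ready)


steps = 100
single = schedule_length(1, steps)       # one frame alone; each unit idles half the time
interleaved = schedule_length(2, steps)  # two frames sharing both units
speedup = (2 * single) / interleaved     # versus decoding the two frames back to back
print(single, interleaved, round(speedup, 2))
```

In this idealized model, two interleaved frames finish in roughly the time one frame needs alone, so the gain approaches 2x as the number of steps grows. A real node-based decoder has unequal, data-dependent step durations, which is why residual stalls remain and the dynamic strategies S1 and S2 are needed.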
Abstract
Node-based successive cancellation list (SCL) decoding has received considerable attention in wireless communications for its significant reduction in decoding latency, particularly with 5G New Radio (NR) polar codes. However, existing node-based SCL decoders are constrained by sequential processing, which leads to complicated, data-dependent computational units that introduce unavoidable stalls and reduce hardware efficiency. In this paper, we present a frame-interleaving hardware architecture for a generalized node-based SCL decoder. By efficiently reusing otherwise idle computational units, two independent frames can be decoded simultaneously, resulting in a significant throughput gain. Based on this new architecture, we further exploit graph ensembles to diversify the decoding space, thus enhancing the error-correcting performance with a limited list size. Two dynamic strategies are proposed to eliminate the residual stalls in the decoding schedule, which results in nearly 2$\times$ the throughput of the state-of-the-art baseline node-based SCL decoder. To make the decoder rate-flexible, we develop a novel online instruction generator that identifies the generalized nodes and produces instructions on the fly. The corresponding 28 nm FD-SOI ASIC SCL decoder with a list size of 8 has a core area of 1.28 mm$^2$ and operates at 692 MHz. It is compatible with all 5G NR polar codes and achieves a throughput of 3.34 Gbps and an area efficiency of 2.62 Gbps/mm$^2$ for uplink (1024, 512) codes, which are 1.41$\times$ and 1.69$\times$ better, respectively, than those of state-of-the-art node-based SCL decoders.
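The reported figures can be cross-checked with simple arithmetic. The sketch below recomputes the area efficiency from the rounded throughput and core area, and derives the average number of clock cycles per decoded frame. The latter assumes that throughput is counted in coded bits (1024 per frame), which the abstract does not state explicitly; the function names are hypothetical.

```python
def area_efficiency(throughput_gbps: float, core_area_mm2: float) -> float:
    """Area efficiency in Gbps/mm^2."""
    return throughput_gbps / core_area_mm2


def cycles_per_frame(throughput_gbps: float, clock_mhz: float, bits_per_frame: int) -> float:
    """Average clock cycles per frame, given the bits counted per frame."""
    bits_per_cycle = (throughput_gbps * 1e3) / clock_mhz  # Gbps -> Mbps, divided by MHz
    return bits_per_frame / bits_per_cycle


# Values quoted in the abstract for the uplink (1024, 512) code.
eff = area_efficiency(3.34, 1.28)
# Assumption: throughput is measured in coded bits, i.e. 1024 bits per frame.
cycles = cycles_per_frame(3.34, 692.0, 1024)
print(f"{eff:.2f} Gbps/mm^2, {cycles:.1f} cycles per frame")
```

With the rounded inputs the area efficiency evaluates to about 2.61 Gbps/mm$^2$, slightly below the reported 2.62 Gbps/mm$^2$, which is presumably computed from unrounded values. Under the coded-bit assumption, the decoder completes one frame roughly every 212 cycles on average while two frames are in flight.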
