Unrolled and Pipelined Decoders based on Look-Up Tables for Polar Codes
Pascal Giard, Syed Aizaz Ali Shah, Alexios Balatsoukas-Stimming, Maximilian Stark, Gerhard Bauch
TL;DR
This work investigates LUT-based unrolled and pipelined decoders for polar codes as a means to reduce hardware area while preserving high throughput. The $f$ and $g$ operations are replaced with information-theoretically designed LUT mappings, and min-sum inspired realizations with input/output relabeling (MS-IB and re-MS-IB) are employed. The authors present three LUT-based variants and compare them to a fixed-point baseline on a short $(N=128, k=64)$ code. The best variant (re-MS-IB) achieves about $23\%$ area reduction and up to $3\%$ higher throughput with comparable error-correction performance, reaching approximately $9.68$ Gbps at $1.51$ GHz in 28 nm FD-SOI, which also improves area efficiency. These results indicate that LUT-based unrolled polar decoders can be competitive, provided that the LUT design and alphabet relabeling are carefully optimized, and they offer a path toward compact, high-throughput decoders for practical systems.
Abstract
Unrolling a decoding algorithm makes it possible to achieve extremely high throughput at the cost of increased area. Look-up tables (LUTs) can be used to replace functions otherwise implemented as circuits. In this work, we show the impact of replacing blocks of logic with carefully crafted LUTs in unrolled decoders for polar codes. We show that using LUTs to improve key performance metrics (e.g., area, throughput, latency) may turn out to be more challenging than expected. We present three variants of LUT-based decoders and describe their inner workings and circuits in detail. The LUT-based decoders are compared against a regular unrolled decoder that employs fixed-point number representations and has comparable error-correction performance. A short systematic polar code is used as an illustration. All resulting unrolled decoders are shown to be capable of an information throughput of a little under 10 Gbps in a 28 nm FD-SOI technology when clocked in the vicinity of 1.4 GHz to 1.5 GHz. The best variant of our LUT-based decoders is shown to reduce the area requirements by 23% compared to the regular unrolled decoder while retaining comparable error-correction performance.
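To make the idea of replacing a block of logic with a LUT concrete, the following is a minimal sketch of the standard min-sum $f$ and $g$ updates used in successive-cancellation decoding on fixed-point values, together with a table that enumerates $f$ over a small quantized alphabet. It is an illustration only: the 3-bit symmetric alphabet, the helper names, and the direct tabulation of min-sum are assumptions made here, and the table is not the information-theoretically designed mapping described in the paper.

```python
from itertools import product


def f_minsum(a: int, b: int) -> int:
    """Min-sum approximation of f: sign(a) * sign(b) * min(|a|, |b|)."""
    sign = -1 if (a < 0) != (b < 0) else 1
    return sign * min(abs(a), abs(b))


def g_update(a: int, b: int, u: int) -> int:
    """g operation given the partial-sum bit u: b + (1 - 2u) * a."""
    return b + (1 - 2 * u) * a


def saturate(x: int, q_bits: int) -> int:
    """Clip x to a symmetric q_bits-wide range (assumed quantization)."""
    hi = 2 ** (q_bits - 1) - 1
    return max(-hi, min(hi, x))


def build_f_lut(q_bits: int) -> dict:
    """Tabulate f over every pair of quantized inputs (hypothetical helper)."""
    hi = 2 ** (q_bits - 1) - 1
    levels = range(-hi, hi + 1)
    return {(a, b): f_minsum(a, b) for a, b in product(levels, repeat=2)}


# Assumed 3-bit alphabet: 7 levels per input, so 49 table entries.
f_lut = build_f_lut(3)
```

In this sketch a table lookup such as `f_lut[(3, -2)]` stands in for the comparator and sign logic of `f_minsum`. The table size grows with the square of the alphabet size, which is one reason the choice of alphabet and labeling matters for the resulting area.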
