Unrolled and Pipelined Decoders based on Look-Up Tables for Polar Codes

Pascal Giard; Syed Aizaz Ali Shah; Alexios Balatsoukas-Stimming; Maximilian Stark; Gerhard Bauch

TL;DR

This work investigates LUT-based unrolled and pipelined decoders for polar codes as a means to reduce hardware area while preserving high throughput. By replacing the $f$ and $g$ operations with information-theoretically designed LUT mappings and employing min-sum-inspired realizations with input/output relabeling (MS-IB and re-MS-IB), the authors demonstrate three LUT-based variants and compare them to a fixed-point baseline on a short $(N=128, k=64)$ code. The best variant (re-MS-IB) achieves about $23\%$ area reduction and up to $3\%$ higher throughput with comparable error-correction performance, reaching approximately $9.68$ Gbps at $1.51$ GHz in 28 nm FD-SOI, and delivering a substantial improvement in area efficiency. These results indicate that LUT-based unrolled polar decoders can be competitive, provided that LUT design and alphabet relabeling are carefully optimized, offering a path toward ultra-high-throughput, compact decoders for practical systems.

Abstract

Unrolling a decoding algorithm makes it possible to achieve extremely high throughput at the cost of increased area. Look-up tables (LUTs) can be used to replace functions otherwise implemented as circuits. In this work, we show the impact of replacing blocks of logic with carefully crafted LUTs in unrolled decoders for polar codes. We show that using LUTs to improve key performance metrics (e.g., area, throughput, latency) may turn out to be more challenging than expected. We present three variants of LUT-based decoders and describe their inner workings as well as their circuits in detail. The LUT-based decoders are compared against a regular unrolled decoder that employs fixed-point representations for numbers and has comparable error-correction performance. A short systematic polar code is used as an illustration. All resulting unrolled decoders are shown to be capable of an information throughput of a little under 10 Gbps in a 28 nm FD-SOI technology when clocked at roughly 1.4 GHz to 1.5 GHz. The best variant of our LUT-based decoders is shown to reduce the area requirements by 23% compared to the regular unrolled decoder while retaining comparable error-correction performance.
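To make the idea of replacing a function with a LUT concrete, the following is a minimal sketch, not the authors' design. It uses the standard min-sum forms of the successive-cancellation $f$ and $g$ updates on integer (fixed-point) values and then tabulates $f$ exhaustively over a small symmetric alphabet. The function names, the 4-bit width, and the symmetric range are assumptions chosen for illustration; the LUTs in this work are designed information-theoretically rather than by direct tabulation of min-sum.

```python
from itertools import product


def f_min_sum(a: int, b: int) -> int:
    """Min-sum f update: sign(a) * sign(b) * min(|a|, |b|)."""
    sign = -1 if (a < 0) != (b < 0) else 1
    return sign * min(abs(a), abs(b))


def g_update(a: int, b: int, u: int) -> int:
    """g update given the partial-sum bit u: b + (1 - 2u) * a."""
    return b + (1 - 2 * u) * a


def build_f_lut(q_bits: int) -> dict:
    """Tabulate f over every input pair of a symmetric q_bits-wide alphabet.

    The symmetric range [-(2^(q-1) - 1), 2^(q-1) - 1] is an assumption
    made for this illustration.
    """
    hi = 2 ** (q_bits - 1) - 1
    levels = range(-hi, hi + 1)
    return {(a, b): f_min_sum(a, b) for a, b in product(levels, repeat=2)}


# Hypothetical 4-bit table: one entry per input pair.
F_LUT = build_f_lut(4)
```

A lookup such as `F_LUT[(3, -5)]` then stands in for evaluating `f_min_sum(3, -5)`. The table grows quadratically with the alphabet size, which is one reason a LUT does not automatically reduce area compared with a small arithmetic circuit.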

Paper Structure (15 sections, 7 equations, 6 figures, 2 tables)

Introduction
Background
Encoding of Polar Codes
Successive-Cancellation Decoding and Simplified Successive-Cancellation Decoding
Unrolled and Pipelined Hardware Architectures
Functions as Look-up Tables
Unrolled and Pipelined LUT-based Simplified Successive-Cancellation Decoding
Look-up Table Generation
Hardware-Efficient Look-up Tables
Unrolled-Architecture Generation
Implementation and Results
Functional Blocks in LUT-based Decoders
Error-correction Performance and Impact of Quantization
Comparison of the Unrolled Decoders
Conclusion

Figures (6)

Figure 1: Graph and decoder-tree representations of an $(8,\,5)$ polar code.
Figure 2: Fully-unrolled deeply-pipelined decoder for a systematic (8, 5) polar code. Clock signals omitted for clarity. CC stands for cc.
Figure 3: Setup for generating decoding on a single building block
Figure 4: min-sum for $\mathcal{T}$ in the MS-IB decoder.
Figure 5: min-sum for $\mathcal{T}_{re}$ in the re-MS-IB decoder.
...and 1 more figure
