Unrolled and Pipelined Decoders based on Look-Up Tables for Polar Codes

Pascal Giard, Syed Aizaz Ali Shah, Alexios Balatsoukas-Stimming, Maximilian Stark, Gerhard Bauch

TL;DR

This work investigates LUT-based unrolled and pipelined decoders for polar codes as a means to reduce hardware area while preserving high throughput. By replacing the $f$ and $g$ operations with information-theoretically designed LUT mappings and employing min-sum-inspired realizations with input/output relabeling (MS-IB and re-MS-IB), the authors demonstrate three LUT-based variants and compare them to a fixed-point baseline on a short $(N=128, k=64)$ code. The best variant (re-MS-IB) achieves about $23\%$ area reduction and up to $3\%$ higher throughput with comparable error-correction performance, reaching approximately $9.68$ Gbps at $1.51$ GHz in 28 nm FD-SOI, which also improves area efficiency. These results indicate that LUT-based unrolled polar decoders can be competitive, provided that LUT design and alphabet relabeling are carefully optimized, offering a path toward ultra-high-throughput, compact decoders for practical systems.

Abstract

Unrolling a decoding algorithm makes it possible to achieve extremely high throughput at the cost of increased area. Look-up tables (LUTs) can be used to replace functions otherwise implemented as circuits. In this work, we show the impact of replacing blocks of logic by carefully crafted LUTs in unrolled decoders for polar codes. We show that using LUTs to improve key performance metrics (e.g., area, throughput, latency) may turn out to be more challenging than expected. We present three variants of LUT-based decoders and describe their inner workings as well as their circuits in detail. The LUT-based decoders are compared against a regular unrolled decoder that employs fixed-point number representations and has comparable error-correction performance. A short systematic polar code is used as an illustration. All resulting unrolled decoders are shown to be capable of an information throughput of a little under 10 Gbps in a 28 nm FD-SOI technology when clocked in the vicinity of 1.4 GHz to 1.5 GHz. The best variant of our LUT-based decoders is shown to reduce the area requirements by 23% compared to the regular unrolled decoder while retaining comparable error-correction performance.
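
To make the idea of replacing logic with a table concrete, the following minimal Python sketch tabulates the standard min-sum $f$ update of successive-cancellation decoding over a small quantized alphabet, next to the standard $g$ update. It is an illustration only, not the decoders described above: the LUTs in this work are designed information-theoretically, whereas this sketch simply enumerates a known function. The names `f_min_sum`, `g`, `build_lut`, and the 7-level alphabet are assumptions chosen for the example.

```python
from itertools import product


def f_min_sum(a: int, b: int) -> int:
    """Min-sum approximation of the f update on two LLR-like inputs."""
    sign = -1 if (a < 0) != (b < 0) else 1
    return sign * min(abs(a), abs(b))


def g(a: int, b: int, u: int) -> int:
    """g update given the partial-sum bit u in {0, 1} (no saturation applied)."""
    return b + (1 - 2 * u) * a


def build_lut(func, alphabet):
    """Tabulate a two-input function over every pair of quantized inputs."""
    return {(a, b): func(a, b) for a, b in product(alphabet, repeat=2)}


# Hypothetical symmetric alphabet {-3, ..., 3}; chosen for illustration only.
ALPHABET = range(-3, 4)
F_LUT = build_lut(f_min_sum, ALPHABET)

# A table lookup now stands in for the sign/compare/select logic.
print(F_LUT[(2, -3)])
```

Tabulating $f$ keeps the output inside the input alphabet, which is why it maps naturally onto a fixed-size table; $g$ can grow beyond the alphabet, so a fixed-point circuit would typically saturate its result.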

Paper Structure (15 sections, 7 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Graph and decoder-tree representations of an $(8,\,5)$ polar code.
  • Figure 2: Fully-unrolled deeply-pipelined decoder for a systematic $(8,\,5)$ polar code. Clock signals omitted for clarity. CC stands for cc.
  • Figure 3: Setup for generating decoding on a single building block
  • Figure 4: min-sum for $\mathcal{T}$ in the MS-IB decoder.
  • Figure 5: min-sum for $\mathcal{T}_{re}$ in the re-MS-IB decoder.
  • ...and 1 more figure