PolarZero: A Reinforcement Learning Approach for Low-Complexity Polarization Kernel Design

Yi-Ting Hong, Stefano Rini, Luca Barletta

TL;DR

This work addresses the design of large polarization kernels with low decoding complexity by framing kernel search as a reinforcement learning problem. It introduces PolarZero, which uses a Gumbel AlphaZero framework to construct kernel matrices that satisfy a target partial distance profile while minimizing recursive maximum likelihood decoding (RMLD) complexity, aided by randomized initialization and multi-size training. Empirically, PolarZero discovers a 16×16 kernel with about 17% lower decoding complexity than handcrafted designs at an error exponent of approximately $0.5183$, compared with $E=0.5$ for Arıkan’s kernel. The results demonstrate a practical, learning-based pathway for tailoring polar codes to hardware-friendly decoding, latency, and memory constraints, with broader implications for automated code design under implementation constraints.

Abstract

Polar codes with large kernels can achieve improved error exponents but are challenging to design with low decoding complexity. This work investigates kernel construction under recursive maximum likelihood decoding (RMLD) using a reinforcement learning framework based on the Gumbel AlphaZero algorithm. The proposed method efficiently explores the design space and identifies large-size kernels that satisfy a given error exponent while minimizing decoding complexity. For a size-16 kernel, it achieves 17% lower decoding complexity than handcrafted designs while reaching an error exponent of 0.5183 compared to 0.5 for Arıkan’s kernel, demonstrating the effectiveness of the learning-based approach for practical polar code construction.
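To make the quoted exponent of 0.5 for Arıkan’s kernel concrete, the following is a minimal sketch of how an error exponent can be computed from a kernel's partial distance profile. It assumes the standard definition from the polar coding literature, $E = \frac{1}{\ell}\sum_{i=0}^{\ell-1}\log_\ell D_i$, where $D_i$ is the minimum Hamming distance between row $i$ and the span of the rows below it. The function names `partial_distances` and `error_exponent` are hypothetical and not taken from the paper, and the brute-force search is only practical for small kernels; it is not the PolarZero method.

```python
import math
from itertools import product

def partial_distances(kernel):
    """Brute-force partial distance profile of a binary kernel (list of rows).

    D_i is the minimum Hamming distance between row i and any codeword
    in the span of rows i+1, ..., l-1 (assumed standard definition).
    """
    size = len(kernel)
    profile = []
    for i in range(size):
        later_rows = kernel[i + 1:]
        best = size
        # Enumerate every binary combination of the later rows.
        for coeffs in product((0, 1), repeat=len(later_rows)):
            word = list(kernel[i])
            for c, row in zip(coeffs, later_rows):
                if c:
                    word = [a ^ b for a, b in zip(word, row)]
            best = min(best, sum(word))
        profile.append(best)
    return profile

def error_exponent(kernel):
    """E = (1/l) * sum_i log_l(D_i); raises ValueError if some D_i is 0."""
    size = len(kernel)
    return sum(math.log(d, size) for d in partial_distances(kernel)) / size

# Arikan's 2x2 kernel F_2.
F2 = [[1, 0],
      [1, 1]]
print(partial_distances(F2))  # [1, 2]
print(error_exponent(F2))     # 0.5
```

Under this definition, the Kronecker square of $F_2$ has profile $(1, 2, 2, 4)$ and the same exponent of 0.5, so an exponent above 0.5 at size 16 requires a kernel that is not simply a Kronecker power of $F_2$.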

Paper Structure

This paper contains 50 sections, 84 equations, 13 figures, 13 tables, 1 algorithm.

Figures (13)

  • Figure 1: A graphical representation of the proposed reinforcement learning algorithm for the design of polarization kernels. In the self-play phase, the agent interacts with the environment and generates self-play data consisting of a series of states, actions, and rewards. In the training phase, the agent is trained on the data obtained from the self-play phase to improve its policy.
  • Figure 2: SC decoding of Arıkan’s kernel $F_2$: LLR transformation and hard decisions.
  • Figure 3: Overview of the generalized SC decoding for a size-$\ell$ kernel $G_\ell$ using the RMLD algorithm. At phase $i$, an extended kernel $G_\ell^{(i)}$ is constructed to form an RMLD tree, denoted by $\text{Tree}_i$, following Steps 1-3 in Sec. \ref{Sec:RMLD_steps_}. During decoding, given the received LLRs $L^{\ell-1}_0$ and the prior hard decisions $\hat{u}_0^{i-1}$ at phase $i$, a soft LLR output $\hat{L}_i$ is computed using $\text{Tree}_i$ following Step 4. The corresponding hard decision $\hat{u}_i$ is then obtained from $\hat{L}_i$ according to \ref{eq:hard_decision}. The decoding process proceeds sequentially from phase $0$ to $\ell-1$, ultimately producing the decoded sequence $\hat{u}_0^{\ell-1}$.
  • Figure 4: RMLD max tree examples.
  • Figure 5: Decoding process of the RMLD tree for the size-$4$ kernel $G_4$: $T_{xy}$ is obtained by combining $T_{xz}$ and $T_{zy}$.
  • ...and 8 more figures
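
As a rough illustration of the phase-by-phase decoding summarized in the captions of Figures 2 and 3, the sketch below runs SC decoding for Arıkan’s kernel $F_2$ only. It is written under several assumptions that are not confirmed by this page: the encoding convention $x = uF_2$, so that $x_0 = u_0 \oplus u_1$ and $x_1 = u_1$; LLRs defined as $\ln\frac{P(0)}{P(1)}$; no frozen bits; and the common hard-decision rule that maps a non-negative LLR to 0. The function names are hypothetical, and this is the textbook $F_2$ update, not the RMLD procedure for general kernels.

```python
import math

def f_llr(l0, l1):
    """Phase 0: LLR of u0 from the two channel LLRs (check-node update)."""
    p = math.tanh(l0 / 2.0) * math.tanh(l1 / 2.0)
    # Clamp so atanh stays finite when both inputs are very reliable.
    limit = 1.0 - 1e-12
    p = max(-limit, min(limit, p))
    return 2.0 * math.atanh(p)

def g_llr(l0, l1, u0_hat):
    """Phase 1: LLR of u1 given the earlier hard decision u0_hat."""
    return l1 + (1 - 2 * u0_hat) * l0

def hard_decision(llr):
    """Assumed rule: decide 0 when the LLR is non-negative, else 1."""
    return 0 if llr >= 0 else 1

def sc_decode_f2(l0, l1):
    """Decode (u0, u1) sequentially, phase 0 first and then phase 1."""
    u0_hat = hard_decision(f_llr(l0, l1))
    u1_hat = hard_decision(g_llr(l0, l1, u0_hat))
    return u0_hat, u1_hat

# Channel LLRs suggesting x0 = 1 and x1 = 0, that is, u = (1, 0).
print(sc_decode_f2(-2.0, 3.0))  # (1, 0)
```

For a size-$\ell$ kernel the same loop structure applies, with the two closed-form updates replaced by whatever per-phase LLR computation the decoder uses.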

Theorems & Definitions (7)

  • Example 1
  • Example 2
  • Example 3
  • Example 4
  • Example 5
  • Example 6
  • Example 7