How Many Matrices Should I Prepare To Polarize Channels Optimally Fast?

Hsin-Po Wang, Venkatesan Guruswami

TL;DR

This work analyzes how many kernels are needed to polarize channels optimally fast using dynamic $\ell\times\ell$ kernels, quantifying a trade-off between kernel count and the scaling exponent $\mu$. By introducing bundles of similar BMS channels and a hitting-set framework, it proves that $m = O(\ell^{3/\mu-1})$ kernels suffice to achieve a target $\mu$, with $m \approx O(\sqrt{\ell})$ when $\mu \approx 2$ and $m \approx 1$ when $\mu \approx 3$. The analysis leverages degradation/upgradation concepts, probabilistic bounds on random kernels, and a geometric view of the BMS channel space to connect kernel design, channel similarity, and polarization performance. These results clarify how to balance hardware complexity and polarization speed, recovering known single-kernel advantages in special cases (e.g., BEC) while revealing fundamental limits for general BMS channels.

Abstract

Polar codes that approach capacity at a near-optimal speed, namely with scaling exponents close to $2$, have been shown possible for $q$-ary erasure channels (Pfister and Urbanke), the BEC (Fazeli, Hassani, Mondelli, and Vardy), all BMS channels (Guruswami, Riazanov, and Ye), and all DMCs (Wang and Duursma). There is, nevertheless, a subtlety separating the last two papers from the first two, namely the usage of multiple dynamic kernels in the polarization process, which leads to increased complexity and fewer opportunities to hardware-accelerate. This paper clarifies this subtlety, providing a trade-off between the number of kernels in the construction and the scaling exponent. We show that the number of kernels can be bounded by $O(\ell^{3/\mu-1})$ where $\mu$ is the targeted scaling exponent and $\ell$ is the kernel size. In particular, if one settles for scaling exponent approaching $3$, a single kernel suffices, and to approach the optimal scaling exponent of $2$, about $O(\sqrt{\ell})$ kernels suffice.
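
The two endpoints quoted in the abstract follow directly from the exponent $3/\mu - 1$. The sketch below tabulates that exponent for a few target values of $\mu$. The function names are hypothetical, the restriction to $2 \le \mu \le 3$ is an assumption taken from the range the abstract discusses, and the constant hidden by the $O(\cdot)$ is ignored, so the numbers indicate growth rates rather than actual kernel counts.

```python
def kernel_count_exponent(mu: float) -> float:
    """Exponent e in m = O(ell**e), with e = 3/mu - 1 as stated in the abstract."""
    # Assumption: only the range between the optimal exponent 2 and the
    # single-kernel regime 3 is considered here.
    if not 2.0 <= mu <= 3.0:
        raise ValueError("expected a scaling exponent between 2 and 3")
    return 3.0 / mu - 1.0


def kernel_count_growth(ell: int, mu: float) -> float:
    """Return ell**(3/mu - 1); a growth rate only, the O(.) constant is dropped."""
    return ell ** kernel_count_exponent(mu)


if __name__ == "__main__":
    for mu in (2.0, 2.5, 3.0):
        exponent = kernel_count_exponent(mu)
        growth = kernel_count_growth(64, mu)
        print(f"mu = {mu}: exponent = {exponent:.3f}, ell = 64 gives {growth:.2f}")
```

At $\mu = 2$ the exponent is $1/2$, which matches the $O(\sqrt{\ell})$ figure, and at $\mu = 3$ it is $0$, which matches the single-kernel case.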

Paper Structure

This paper contains 8 sections, 9 theorems, 27 equations, 2 figures.

Key Result

Theorem 1

At kernel size $\ell$ and for a targeted scaling exponent $\mu + O(\alpha)$, it suffices to prepare $O(\ell^{3/\mu-1})$ binary $\ell\times\ell$ matrices to polarize BMS channels.

Figures (2)

  • Figure 1: Left: one-kernel-for-all. Select a fixed kernel $G$ and use it to polarize all channels. Right: dynamic kernels. A "customized" kernel $G(W)$ is selected for each channel $W$. Note that, as the code length increases, more channels are generated and naturally one would guess that more $G$'s are needed. This work focuses on how to reduce the number of $G$'s needed.
  • Figure 2: The quantification of a BMS channel is to "select all tiles that intersect the red, thick path." For any BMS channel $W$, let $\omega$ be the measure on $[0, 1/2]$ as in the BMS decomposition of $W$. The red path is the cdf of $\omega$, re-parametrized by $x \coloneqq h_2(p) \coloneqq - p\log_2(p) - (1-p)\log_2(1-p)$. The upgradation $U(W)$, depicted in teal, corresponds to the path along the upper-left boundaries of the gray tiles. The degradation $D(W)$, depicted in brown, corresponds to the path along the lower-right boundaries of the gray tiles. The tile size is set to be $\ell^{-1/\mu}$. Note that $H(W)$ is the area above and to the left of the red path. Hence $H(D(W)) - H(U(W)) = 2\ell^{-1/\mu} - \ell^{-2/\mu}$.
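
The area identity at the end of the Figure 2 caption can be checked with a short calculation. This is a sketch under stated assumptions, not the paper's proof. The symbols $\varepsilon$, $n$, and $F$ are introduced here for illustration only: $\varepsilon = \ell^{-1/\mu}$ is the tile side, $n = 1/\varepsilon$ is assumed to be an integer so that the unit square splits into an $n \times n$ grid, and $F$ is the cdf of $x = h_2(p)$ under $\omega$. It is also assumed that the selected tiles form a single monotone staircase, so the region between the $U(W)$ and $D(W)$ paths is exactly their union, and that the path meets the largest possible number of tiles, $2n - 1$.

```latex
% Entropy as the area above the cdf path in the unit square.
\begin{align*}
H(W) &= \int_0^{1/2} h_2(p)\,\mathrm{d}\omega(p)
      = \int_0^1 \bigl(1 - F(x)\bigr)\,\mathrm{d}x, \\
% The gap between degradation and upgradation is the total area of the selected tiles.
H(D(W)) - H(U(W)) &= (\text{number of selected tiles}) \cdot \varepsilon^2 \\
  &= (2n - 1)\,\varepsilon^2
   = \Bigl(\frac{2}{\varepsilon} - 1\Bigr)\varepsilon^2 \\
  &= 2\varepsilon - \varepsilon^2
   = 2\ell^{-1/\mu} - \ell^{-2/\mu}.
\end{align*}
```

The count $2n - 1$ comes from the fact that a monotone path starts in one tile and crosses each of the $n - 1$ interior vertical and $n - 1$ interior horizontal grid lines at most once, entering one new tile per crossing. A path that meets fewer tiles would give a smaller gap under the same assumptions.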

Theorems & Definitions (13)

  • Theorem 1: main
  • Theorem 2
  • Definition 3: degradation
  • Lemma 4: RiU08
  • Definition 5: BSC decomposition
  • Lemma 6
  • Lemma 7
  • Definition 8: pavement
  • Definition 9: bundle
  • Lemma 10
  • ...and 3 more