How Many Matrices Should I Prepare To Polarize Channels Optimally Fast?

Hsin-Po Wang; Venkatesan Guruswami

TL;DR

This work analyzes how many kernels are needed to polarize channels optimally fast using dynamic $\ell\times\ell$ kernels, quantifying a trade-off between kernel count and the scaling exponent $\mu$. By introducing bundles of similar BMS channels and a hitting-set framework, it proves that $m = O(\ell^{3/\mu-1})$ kernels suffice to achieve a target $\mu$, with $m \approx O(\sqrt{\ell})$ when $\mu \approx 2$ and $m \approx 1$ when $\mu \approx 3$. The analysis leverages degradation/upgradation concepts, probabilistic bounds on random kernels, and a geometric view of the BMS channel space to connect kernel design, channel similarity, and polarization performance. These results clarify how to balance hardware complexity and polarization speed, recovering known single-kernel advantages in special cases (e.g., BEC) while revealing fundamental limits for general BMS channels.

Abstract

Polar codes that approach capacity at a near-optimal speed, namely with scaling exponents close to $2$, have been shown possible for $q$-ary erasure channels (Pfister and Urbanke), the BEC (Fazeli, Hassani, Mondelli, and Vardy), all BMS channels (Guruswami, Riazanov, and Ye), and all DMCs (Wang and Duursma). There is, nevertheless, a subtlety separating the last two papers from the first two, namely the usage of multiple dynamic kernels in the polarization process, which leads to increased complexity and fewer opportunities to hardware-accelerate. This paper clarifies this subtlety, providing a trade-off between the number of kernels in the construction and the scaling exponent. We show that the number of kernels can be bounded by $O(\ell^{3/\mu-1})$ where $\mu$ is the targeted scaling exponent and $\ell$ is the kernel size. In particular, if one settles for scaling exponent approaching $3$, a single kernel suffices, and to approach the optimal scaling exponent of $2$, about $O(\sqrt{\ell})$ kernels suffice.
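To make the trade-off in the abstract concrete, the following minimal sketch evaluates the growth term $\ell^{3/\mu-1}$ of the stated bound for a few target exponents. The function name `kernel_count_growth` is hypothetical, the constant hidden in the $O(\cdot)$ is ignored, and clamping the result to at least one kernel is an assumption made here for readability, not something the abstract states.

```python
import math


def kernel_count_growth(ell: int, mu: float) -> float:
    """Growth term ell**(3/mu - 1) of the kernel-count bound.

    Assumptions of this sketch: the O(.) constant is dropped, and the
    value is clamped to >= 1 because at least one kernel is always needed.
    """
    if ell < 2:
        raise ValueError("kernel size ell should be at least 2")
    if mu < 2.0:
        raise ValueError("scaling exponents below 2 are not achievable")
    return max(1.0, ell ** (3.0 / mu - 1.0))


if __name__ == "__main__":
    # mu = 2 gives sqrt(ell); mu = 3 gives a single kernel.
    for mu in (2.0, 2.5, 3.0):
        for ell in (16, 64, 256):
            print(f"mu={mu:.1f}  ell={ell:4d}  growth={kernel_count_growth(ell, mu):.2f}")
```

At $\mu = 2$ the exponent $3/\mu - 1$ equals $1/2$, which is the $\sqrt{\ell}$ regime, and at $\mu = 3$ it equals $0$, which is the single-kernel regime.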

Paper Structure (8 sections, 9 theorems, 27 equations, 2 figures)


Introduction
How Scaling Exponent 2 Was Achieved
Negotiating with the Error Exponent
Bundling Similar Channels
Hitting-Set Problem
Discussion: BMS Channel Geometry
Conclusions
Acknowledgment

Key Result

Theorem 1

At kernel size $\ell$ and for a targeted scaling exponent $\mu + O(\alpha)$, it suffices to prepare about $\ell^{3/\mu-1}$ binary matrices to polarize BMS channels.

Figures (2)

Figure 1: Left: one-kernel-for-all. Select a fixed kernel $G$ and use it to polarize all channels. Right: dynamic kernels. A "customized" kernel $G(W)$ is selected for each channel $W$. Note that, as the code length increases, more channels are generated and naturally one would guess that more $G$'s are needed. This work focuses on how to reduce the number of $G$'s needed.
Figure 2: The quantification of a BMS channel is to "select all tiles that intersect the red, thick path." For any BMS channel $W$, let $\omega$ be the measure on $[0, 1/2]$ as in the BMS decomposition of $W$. The red path is the cdf of $\omega$, re-parametrized by $x \coloneqq h_2(p) \coloneqq - p\log_2(p) - (1-p)\log_2(1-p)$. The upgradation $U(W)$, depicted in teal, corresponds to the path along the upper-left boundaries of the gray tiles. The degradation $D(W)$, depicted in brown, corresponds to the path along the lower-right boundaries of the gray tiles. The tile size is set to be $\ell^{-1/\mu}$. Note that $H(W)$ is the area above and to the left of the red path. Hence $H(D(W)) - H(U(W)) = 2\ell^{-1/\mu} - \ell^{-2/\mu}$.
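The arithmetic behind the Figure 2 caption can be checked with a short sketch. It represents a BMS channel as a finite mixture of BSCs, computes $H(W) = \sum_i w_i\, h_2(p_i)$, and evaluates the quantity $2\ell^{-1/\mu} - \ell^{-2/\mu}$ from the caption. All function names are hypothetical. The tile-count reading is an assumption of this sketch: with $n = \ell^{1/\mu}$ tiles per side, a monotone path enters the interior of at most $2n - 1$ tiles, each of area $\ell^{-2/\mu}$, which gives the same expression.

```python
import math


def h2(p: float) -> float:
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bms_entropy(atoms: list[tuple[float, float]]) -> float:
    """H(W) for W given as a mixture of BSC(p) with weights w.

    `atoms` is a list of (p, w) pairs with p in [0, 1/2] and weights summing to 1.
    """
    return sum(w * h2(p) for p, w in atoms)


def tile_size(ell: int, mu: float) -> float:
    """Tile side length ell**(-1/mu), as set in the Figure 2 caption."""
    return ell ** (-1.0 / mu)


def quantization_gap(ell: int, mu: float) -> float:
    """The caption's quantity 2*eps - eps**2 with eps = ell**(-1/mu)."""
    eps = tile_size(ell, mu)
    return 2.0 * eps - eps ** 2


if __name__ == "__main__":
    # A BEC with erasure probability 0.3 is BSC(0) w.p. 0.7 and BSC(1/2) w.p. 0.3.
    bec = [(0.0, 0.7), (0.5, 0.3)]
    print("H(BEC(0.3)) =", bms_entropy(bec))
    print("gap at ell=16, mu=2:", quantization_gap(16, 2.0))
```

For $\ell = 16$ and $\mu = 2$ the tile size is $1/4$, so $n = 4$ and $(2n - 1)\cdot(1/4)^2 = 7/16$, matching `quantization_gap(16, 2.0)`.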

Theorems & Definitions (13)

Theorem 1: main
Theorem 2
Definition 3: degradation
Lemma 4: RiU08
Definition 5: BSC decomposition
Lemma 6
Lemma 7
Definition 8: pavement
Definition 9: bundle
Lemma 10
...and 3 more
