Policy-Guided MCTS for near Maximum-Likelihood Decoding of Short Codes
Y. Tian, C. Yue, P. Cheng, G. Pang, B. Vucetic, Y. Li
TL;DR
This work tackles the challenge of achieving near maximum-likelihood decoding (MLD) performance for short block codes without the Gaussian-elimination overhead of traditional ordered statistics decoding (OSD). It introduces a policy-guided Monte Carlo Tree Search (MCTS) that builds and traverses a deterministic tree of test error patterns (TEPs), with a neural policy trained via MCTS to steer the search toward the correct TEP. Training uses self-generated supervision from near-MLD outcomes, enabling the policy to generalize across noise realizations. Experiments on the $(32,16)$ eBCH and $(48,24)$ QR codes show substantial reductions in the number of TEPs searched and in decoding latency at high SNRs, with near-MLD block error rate (BLER) achieved at practical orders. The framework also generalizes to other OSD variants and stopping rules.
Abstract
In this paper, we propose a policy-guided Monte Carlo Tree Search (MCTS) decoder that achieves near maximum-likelihood decoding (MLD) performance for short block codes. The MCTS decoder searches for test error patterns (TEPs) in the received information bits and obtains codeword candidates through re-encoding. The TEP search is executed on a tree structure, guided by a neural network policy trained via MCTS-based learning. The trained policy guides the decoder to find the correct TEP in a minimal number of steps from the root node (the all-zero TEP). The decoder outputs the most likely codeword candidate once the early stopping criterion is satisfied. Unlike ordered statistics decoding (OSD), the proposed method requires no Gaussian elimination (GE), and it can reduce search complexity by 95% compared to non-GE OSD. It achieves lower decoding latency than both OSD and non-GE OSD at high SNRs.
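The TEP search and re-encoding step described above can be illustrated with a minimal sketch. It is not the authors' decoder. It assumes a toy systematic $(7,4)$ Hamming code in place of the eBCH and QR codes, BPSK mapping (bit 0 to +1, bit 1 to -1), and a plain breadth-first traversal in place of the learned policy, with no early stopping. The tree rule used here (each child flips one additional information bit at a higher index than its parent's last flip) is an assumed construction, and the names `reencode`, `correlation`, `children`, and `tree_search_decode` are hypothetical.

```python
import numpy as np
from collections import deque

# Systematic generator matrix G = [I | P] of the (7,4) Hamming code (toy stand-in).
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]], dtype=np.uint8)
G = np.hstack([np.eye(4, dtype=np.uint8), P])

def reencode(info_bits):
    """Map k information bits to an n-bit codeword (no Gaussian elimination)."""
    return (info_bits @ G) % 2

def correlation(y, codeword):
    """Likelihood metric for BPSK over AWGN: larger means more likely."""
    return float(np.sum(y * (1.0 - 2.0 * codeword)))

def children(tep):
    """Assumed tree rule: flip one more bit to the right of the last flipped bit."""
    flipped = np.flatnonzero(tep)
    start = int(flipped[-1]) + 1 if flipped.size else 0
    for i in range(start, tep.size):
        child = tep.copy()
        child[i] = 1
        yield child

def tree_search_decode(y, max_weight=2):
    """Breadth-first TEP search from the all-zero root (uninformed placeholder policy)."""
    k = G.shape[0]
    hard_info = (y[:k] < 0).astype(np.uint8)   # hard decisions on information bits
    best_cw, best_metric, visited = None, -np.inf, 0
    queue = deque([np.zeros(k, dtype=np.uint8)])
    while queue:
        tep = queue.popleft()
        visited += 1
        candidate = reencode(hard_info ^ tep)  # apply TEP, then re-encode
        metric = correlation(y, candidate)
        if metric > best_metric:
            best_cw, best_metric = candidate, metric
        if int(tep.sum()) < max_weight:        # expand only up to the weight limit
            queue.extend(children(tep))
    return best_cw, visited
```

In this sketch every TEP up to weight `max_weight` is visited, so the count of visited nodes grows combinatorially with the weight limit. A trained policy would instead choose which child to expand first, so that the correct TEP is reached after few re-encodings.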
