Soft Graph Transformer for MIMO Detection

Jiadong Hong, Lei Liu, Xinyu Bian, Wenjie Wang, Zhaoyang Zhang

TL;DR

This work addresses the challenge of efficient MIMO symbol detection, where Maximum Likelihood (ML) detection is intractable and conventional detectors based on approximate message passing (AMP) can falter in finite dimensions. The authors introduce the Soft Graph Transformer (SGT), which uses self-attention to model contextual dependencies within the symbol and constraint subgraphs and graph-aware cross-attention to pass messages between these two isomorphic subgraphs, all within an AMP-inspired iterative framework. A soft-input-soft-output interface lets the detector integrate priors from other receiver blocks and produce refined soft outputs suitable for iterative decoding while preserving computational efficiency. Empirical results show near-ML performance in small MIMO systems and clear advantages over existing Transformer-based and deep-unfolded detectors, with favorable scalability to larger systems for practical receiver design.

Abstract

We propose the Soft Graph Transformer (SGT), a soft-input-soft-output neural architecture designed for MIMO detection. While Maximum Likelihood (ML) detection achieves optimal accuracy, its exponential complexity makes it infeasible in large systems, and conventional message-passing algorithms rely on asymptotic assumptions that often fail in finite dimensions. Recent Transformer-based detectors show strong performance but typically overlook the MIMO factor graph structure and cannot exploit prior soft information. SGT addresses these limitations by combining self-attention, which encodes contextual dependencies within symbol and constraint subgraphs, with graph-aware cross-attention, which performs structured message passing across subgraphs. Its soft-input interface allows the integration of auxiliary priors, producing effective soft outputs while maintaining computational efficiency. Experiments demonstrate that SGT achieves near-ML performance and offers a flexible and interpretable framework for receiver systems that leverage soft priors.
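
To make the complexity argument concrete, the following is a minimal sketch of exhaustive ML detection for the standard linear model $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$ with Gaussian noise, where ML reduces to minimizing $\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2$ over all candidate symbol vectors. It is an illustration only and not code from the paper: the QPSK constellation, the function names `ml_detect` and `num_candidates`, and the noiseless toy channel are all assumptions made for the example.

```python
import itertools
import random

# QPSK with unit average energy. An illustrative choice, not taken from the paper.
QPSK = [complex(a, b) / 2 ** 0.5 for a in (-1, 1) for b in (-1, 1)]


def ml_detect(H, y, constellation=QPSK):
    """Exhaustive search for argmin_x ||y - Hx||^2 over all constellation vectors.

    H is a list of rows (one row h_j per received signal y_j); y is a list of
    complex received samples. Hypothetical helper written for this sketch.
    """
    n_tx = len(H[0])
    best_x, best_cost = None, float("inf")
    # The loop visits len(constellation) ** n_tx candidates: exponential in n_tx.
    for x in itertools.product(constellation, repeat=n_tx):
        cost = 0.0
        for row, y_j in zip(H, y):
            residual = y_j - sum(h * s for h, s in zip(row, x))
            cost += abs(residual) ** 2
        if cost < best_cost:
            best_x, best_cost = x, cost
    return list(best_x), best_cost


def num_candidates(n_tx, order=len(QPSK)):
    """Number of symbol vectors an exhaustive ML search has to score."""
    return order ** n_tx


if __name__ == "__main__":
    random.seed(0)
    n_rx, n_tx = 4, 3
    # Toy Rayleigh-like channel with i.i.d. complex Gaussian entries, no noise.
    H = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n_tx)]
         for _ in range(n_rx)]
    x_true = [random.choice(QPSK) for _ in range(n_tx)]
    y = [sum(h * s for h, s in zip(row, x_true)) for row in H]
    x_hat, cost = ml_detect(H, y)
    print("recovered transmitted vector:", x_hat == x_true)
    print("candidates for 4 and 16 QPSK streams:",
          num_candidates(4), num_candidates(16))
```

With QPSK, the candidate count grows as $4^{N_t}$ in the number of transmit streams $N_t$, which is why exhaustive search is only practical as a reference in small systems.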

Paper Structure

This paper contains 11 sections, 4 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: (a) Factor graph representation of the MIMO system, where $\mathbf{h}_j$ is the $j$-th row of the channel matrix $\mathbf{H}$, describing the connection between transmit symbols $\mathbf{x}$ and received signals $y_j$. (b) Unfolded SGT-based message passing (MP) detector with $L$ layers, where each SGT layer corresponds to one MP iteration. (c) Internal structure of one SGT layer, consisting of self-attention, cross-attention, and FFN modules for embedding, positional encoding, and LLR output. $N_\text{bits}$ in (c) refers to the number of bits per constellation symbol.
  • Figure 2: Comparison between CrossMPT and SGT message passing architectures. For decoding, constraints are defined between heterogeneous nodes (variable and check nodes) via the parity-check matrix $\mathbf{H}$, making cross-attention a natural choice as in CrossMPT. For MIMO, constraints are distributed across two isomorphic subgraphs: the linear constraint subgraph $\mathcal{T}_{\text{lin}}$ and the symbolic subgraph of symbol estimates $\mathcal{T}_{\text{sym}}$. Accordingly, SGT integrates self-attention to encode contextual consistency within each subgraph and cross-attention to exchange messages between them, thereby unifying contextual encoding with constraint-driven message passing.
  • Figure 3: (a) Training loss for 8$\times$8 MIMO. (b) BER performance for 8$\times$8, 8$\times$16, and 16$\times$16 Rayleigh fading MIMO channel settings. SGT consistently outperforms the deep-unfolded method OAMPNet2 [he2020oampnet2] and the Transformer-based MIMO detector [ahmed2025transformer].
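
The pattern described in the Figure 2 caption, self-attention within each subgraph followed by cross-attention between the two, can be sketched roughly as below. This is a simplified illustration under stated assumptions rather than the authors' layer: it uses single-head scaled dot-product attention with no learned projections, layer normalization, FFN, positional encoding, or graph-aware masking, and the names `sgt_like_layer`, `sym_tokens`, and `lin_tokens`, as well as the token counts and embedding size, are hypothetical.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention without learned projections."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out


def add(a, b):
    """Element-wise sum of two token lists (residual connection)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sgt_like_layer(sym_tokens, lin_tokens):
    """One illustrative layer: self-attention per subgraph, then cross-attention.

    sym_tokens stands in for embeddings of the symbolic subgraph and lin_tokens
    for embeddings of the linear constraint subgraph. How the paper builds these
    embeddings and orders the sub-blocks is not specified here; this is assumed.
    """
    # Self-attention: contextual consistency within each subgraph.
    sym = add(sym_tokens, attention(sym_tokens, sym_tokens, sym_tokens))
    lin = add(lin_tokens, attention(lin_tokens, lin_tokens, lin_tokens))
    # Cross-attention: each subgraph queries the other to exchange messages.
    sym_out = add(sym, attention(sym, lin, lin))
    lin_out = add(lin, attention(lin, sym, sym))
    return sym_out, lin_out


if __name__ == "__main__":
    random.seed(1)
    dim = 4  # hypothetical embedding size
    sym = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(8)]
    lin = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(8)]
    # Stacking the layer mimics unfolding several message-passing iterations.
    for _ in range(3):
        sym, lin = sgt_like_layer(sym, lin)
    print(len(sym), len(lin), len(sym[0]))
```

Stacking several such layers loosely corresponds to the unfolded detector in Figure 1(b), where each layer plays the role of one message-passing iteration. A trained model would additionally map the final symbol-side tokens to $N_\text{bits}$ LLRs per symbol through an output FFN, which this sketch omits.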