MMGaP: Multi-User MIMO Detection and Precoding using GPU-assisted Physics-inspired Computation
Abhishek Kumar Singh, Kyle Jamieson
TL;DR
MMGaP closes the gap between theory and practice for physics-inspired MIMO processing in 5G by implementing, on GPUs, an uplink MU-MIMO detector and a downlink Vector Perturbation precoder based on coherent Ising machine (CIM) dynamics. It maps MIMO detection and precoding to Ising optimization problems and runs multiple anneals in parallel on bare-metal CUDA kernels, which are packaged as TensorFlow ops and integrated with NVIDIA Aerial to reach line-rate performance. For an 8×8 system at 100 MHz, MMGaP improves throughput over traditional linear baselines by roughly 50 Mbps per UE on the uplink and roughly 100 Mbps per UE on the downlink, and it scales to larger MIMO sizes such as 16×16. Detailed microbenchmarks on A100, H100, and L4 GPUs indicate that physics-inspired MIMO processing on commodity GPUs is feasible for real-world 5G deployments and can be integrated into existing GPU-accelerated stacks.
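To make the mapping from MIMO detection to Ising optimization concrete, the following is a minimal sketch under simplifying assumptions, not the MMGaP implementation. It assumes a real-valued channel and BPSK symbols (s in {-1, +1}), and it swaps in plain simulated annealing on the CPU for the CIM dynamics and CUDA kernels. The function names (`ising_from_mimo`, `energy`, `anneal`, `detect`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def ising_from_mimo(H, y):
    """Return (J, h, const) with ||y - H s||^2 == energy(J, h, s) + const
    for every spin vector s in {-1, +1}^n (real-valued BPSK assumption)."""
    m, n = len(H), len(H[0])
    # Gram matrix G = H^T H and matched-filter output h = H^T y.
    G = [[sum(H[k][i] * H[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    h = [sum(H[k][i] * y[k] for k in range(m)) for i in range(n)]
    # Couplings J = -G with a zero diagonal: s_i^2 = 1 only adds a constant.
    J = [[0.0 if i == j else -G[i][j] for j in range(n)] for i in range(n)]
    const = sum(v * v for v in y) + sum(G[i][i] for i in range(n))
    return J, h, const


def energy(J, h, s):
    """Ising energy E(s) = -sum_{i != j} J_ij s_i s_j - 2 sum_i h_i s_i."""
    n = len(s)
    pair = sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(n))
    return -pair - 2.0 * sum(h[i] * s[i] for i in range(n))


def anneal(J, h, rng, sweeps=200, t_start=2.0, t_end=0.05):
    """One simulated-annealing run (a stand-in for CIM dynamics)."""
    n = len(h)
    s = [rng.choice((-1, 1)) for _ in range(n)]
    for k in range(sweeps):
        # Geometric cooling schedule from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / max(sweeps - 1, 1))
        for i in range(n):
            # Energy change if spin i is flipped (J is symmetric, J_ii = 0).
            field = sum(J[i][j] * s[j] for j in range(n)) + h[i]
            delta = 4.0 * s[i] * field
            if delta <= 0.0 or rng.random() < math.exp(-delta / t):
                s[i] = -s[i]
    return s


def detect(H, y, num_anneals=8, seed=0):
    """Run several independent anneals and keep the lowest-energy candidate."""
    J, h, _ = ising_from_mimo(H, y)
    rng = random.Random(seed)
    candidates = [anneal(J, h, rng) for _ in range(num_anneals)]
    return min(candidates, key=lambda s: energy(J, h, s))
```

Calling `detect(H, y)` with `H` as a list of rows and `y` as the received vector returns a candidate spin vector. Because the Ising energy differs from the squared residual only by a constant, the lowest-energy candidate is also the candidate closest to the received signal. Running several anneals and keeping the best one mirrors the multi-anneal idea described above, though here the runs are sequential rather than parallel.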
Abstract
Physics-inspired and quantum-computing-based methods for physical-layer processing in next-generation cellular radio access networks have demonstrated theoretical advances in spectral efficiency in recent years, but have stopped short of practical realization on commodity processors. This leaves a gap between the throughput that practical systems achieve and the throughput that state-of-the-art methods are projected to achieve. To fill this gap, this paper proposes MMGaP, an uplink multi-user MIMO detector and downlink Vector Perturbation precoder for next-generation cellular networks. MMGaP realizes these large MIMO processing algorithms for the first time on bare-metal CUDA kernels that scale to run on large GPU processing platforms and can be packaged as TensorFlow modules, allowing easy integration with a variety of systems. We integrate MMGaP with NVIDIA's software-defined, GPU-accelerated 5G platform and evaluate its performance against the state of the art. In a 5G cellular network using 100 MHz of radio bandwidth, eight antennas at the base station, and eight concurrent users, we show that MMGaP improves uplink throughput by approximately 50 Mbps per user and downlink throughput by 100 Mbps per user over a wide range of SNR. We further show that MMGaP supports larger MIMO sizes: for 16 antennas at the base station and 16 concurrent users, MMGaP provides more than 50 Mbps higher uplink throughput per user. We measure the execution time of MMGaP on different NVIDIA GPUs and show that it can operate at line rate and meet the timing requirements of state-of-the-art 5G systems.
