Accelerator-assisted Floating-point ASIP for Communication and Positioning in Massive MIMO Systems
Mohammad Attari, Ove Edfors, Liang Liu
TL;DR
This work tackles the challenge of real-time joint communication and positioning in massive MIMO networks by proposing an accelerator-assisted ASIP. The architecture combines a scalar RISC-V core, a 16-lane vector core, a $16\times 16$ systolic array, and a CNN accelerator, supported by a dual vector-memory system to sustain parallel workloads. It employs bfloat16 for vector paths and complex matrix operations, a dedicated matrix layout to optimize data movement, and compiler intrinsics to drive the systolic array. Evaluations on a $22$ nm FD-SOI process at $800$ MHz show a peak detection throughput of $2.1$ Gb/s in a $128\times 16$ system at $50$ km/h and about $390$ positionings/s, with an area around $2.5$ mm$^2$ and power near $900$ mW, demonstrating competitive area efficiency and practical viability for integrated baseband processing.
Abstract
This paper presents an implementation of a floating-point-capable application-specific instruction set processor (ASIP) for both communication and positioning tasks in massive multiple-input multiple-output (MIMO) systems. The ASIP is equipped with vector processing capabilities in the form of single instruction, multiple data (SIMD) execution. Two accelerators assist the processor with the heavier mathematical workloads. A standalone systolic array accelerator handles matrix multiplications, and a parallel vector memory subsystem serves both the processor and the systolic array. Additionally, a convolutional neural network (CNN) accelerator, paired with its own separate vector memory, works closely with the processor on the positioning task. The processor is synthesized in 22 nm fully depleted silicon-on-insulator (FD-SOI) technology and runs at a clock frequency of 800 MHz. The system achieves a maximum detection throughput of 2.1 Gb/s in a 128×16 massive MIMO system at a user equipment (UE) speed of 50 km/h. The localization throughput is around 390 positionings/s.
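To put the headline figures in per-cycle terms, the short sketch below derives two quantities from the numbers quoted above: detected bits per clock cycle and the clock-cycle budget per positioning. The function and constant names are illustrative and do not come from the paper. The cycle budget is only an upper bound implied by the reported rate, assuming one positioning at a time; it is not a measured latency, and it says nothing about how the CNN accelerator overlaps with other work.

```python
# Back-of-the-envelope arithmetic on the reported figures (names are hypothetical).
CLOCK_HZ = 800e6             # reported clock frequency: 800 MHz
DETECTION_BPS = 2.1e9        # reported peak detection throughput: 2.1 Gb/s
POSITIONINGS_PER_S = 390.0   # reported localization throughput (approximate)


def bits_per_cycle(throughput_bps: float, clock_hz: float) -> float:
    """Average number of detected bits delivered per clock cycle."""
    return throughput_bps / clock_hz


def cycles_per_event(events_per_s: float, clock_hz: float) -> float:
    """Clock cycles available per event at the given event rate."""
    return clock_hz / events_per_s


if __name__ == "__main__":
    print(f"bits per cycle: {bits_per_cycle(DETECTION_BPS, CLOCK_HZ):.3f}")
    print(f"cycles per positioning: {cycles_per_event(POSITIONINGS_PER_S, CLOCK_HZ):,.0f}")
```

Under these assumptions the detection path sustains roughly 2.6 bits per cycle at peak, and each positioning has a budget of about two million cycles.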
