Accelerator-assisted Floating-point ASIP for Communication and Positioning in Massive MIMO Systems

Mohammad Attari, Ove Edfors, Liang Liu

TL;DR

This work tackles the challenge of real-time joint communication and positioning in massive MIMO networks by proposing an accelerator-assisted ASIP. The architecture combines a scalar RISC-V core, a 16-lane vector core, a 16×16 systolic array, and a CNN accelerator, all backed by a dual-vector-memory system to sustain parallel workloads. It employs bfloat16 for vector paths and complex matrix operations, a dedicated matrix layout to optimize data movement, and compiler intrinsics to drive the systolic array. Evaluations on a 22 nm FD-SOI process at 800 MHz show a peak detection throughput of 2.1 Gb/s in a $128\times 16$ system at a user speed of 50 km/h and about 390 positionings/s, with an area around 2.5 mm$^2$ and power near 900 mW, demonstrating competitive area efficiency and practical viability for integrated baseband processing.
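For readers unfamiliar with bfloat16, the following minimal Python sketch shows what the format does to a value: it keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an IEEE 754 single-precision number. The function names are hypothetical, and round-to-nearest-even is an assumption made here for illustration; the summary does not state which rounding mode the ASIP uses.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (assumed round-to-nearest-even)."""
    # Reinterpret x as its 32-bit IEEE 754 single-precision bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # NaN: keep it a NaN instead of letting rounding turn it into infinity.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return (bits >> 16) | 0x0040
    # Add half of the discarded range; the extra LSB term breaks ties to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float(b: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def to_bfloat16(x: float) -> float:
    """Quantize x to the nearest bfloat16-representable value."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(x))


if __name__ == "__main__":
    print(hex(float32_to_bfloat16_bits(1.0)))
    print(to_bfloat16(3.14159))
```

Because bfloat16 shares the 8-bit exponent of single precision, it covers the same dynamic range while keeping only about two to three significant decimal digits.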

Abstract

This paper presents an implementation of a floating-point-capable application-specific instruction set processor (ASIP) for both communication and positioning tasks using massive multiple-input multiple-output (MIMO) technology. The ASIP is equipped with vector processing capabilities in the form of single instruction multiple data (SIMD). A dual-pronged accelerator composition assists the processor with the heavier mathematical workloads. A standalone systolic array accelerator accompanies the processor to aid with matrix multiplications. A parallel vector memory subsystem serves both the processor and the systolic array. Additionally, a convolutional neural network (CNN) accelerator module, which is paired with its own separate vector memory, works hand in glove with the processor to take on the positioning task. The processor is synthesized in 22 nm fully depleted silicon-on-insulator (FD-SOI) technology running at a clock frequency of 800 MHz. The system achieves a maximum detection throughput of 2.1 Gb/s in a 128×16 massive MIMO system at a user equipment (UE) speed of 50 km/h. The localization throughput settles at around 390 positionings/s.

Paper Structure

This paper contains 20 sections, 6 equations, 11 figures, 9 tables.

Figures (11)

  • Figure 1: Task mapping.
  • Figure 2: Resource grid.
  • Figure 3: Bird's-eye view of the innards of the processor, illustrating the stylized structure of the RISC-V processor and the vector core surrounded with the different memories and the two accelerators.
  • Figure 4: Memory controller.
  • Figure 5: Matrix layout in the parallel vector memory for a 16 $\times$ 16 matrix followed by a 32 $\times$ 32 matrix.
  • ...and 6 more figures