Investigations of multi-socket high core count RISC-V for HPC workloads
Nick Brown, Christopher Day
TL;DR
The paper evaluates the viability of a dual-socket, high-core-count RISC-V HPC node based on the Sophon SG2042. It runs NASA's NAS Parallel Benchmarks (NPB) suite on the system to isolate core and memory subsystem effects and to compare against x86 and ARM-based HPC CPUs. The findings confirm persistent memory-bandwidth and latency limitations on the SG2042, but show that multi-socket configurations improve scaling and overall throughput, particularly for compute-bound kernels, while exposing inter-socket NUMA overheads. The work demonstrates that, with careful system design, SG2042-based multi-socket nodes can offer favorable performance-per-dollar for HPC workloads, motivating further tuning and ecosystem development.
Abstract
Whilst RISC-V has become popular in fields such as embedded computing, it has yet to find mainstream success in High Performance Computing (HPC). However, the 64-core RISC-V Sophon SG2042 is a potential game changer, as it provides a commodity-available CPU with a much higher core count than existing technologies. In this work we benchmark the SG2042 CPU hosted in an experimental dual-socket system to explore the performance properties of the CPU when running a common HPC benchmark suite across sockets. Earlier benchmarks found that, on the Milk-V Pioneer workstation, the SG2042 performs well for compute-bound codes but struggles when pressure is placed on the memory subsystem. The performance results reported here confirm that, even on a different system, these memory performance limitations are still present and hence inherent in the CPU. However, a multi-socket configuration does enable the CPU to scale to a larger number of threads which, in the main, delivers an improvement in performance, and so this is a realistic system configuration for the HPC community.
