NPB-Rust: NAS Parallel Benchmarks in Rust
Eduardo M. Martins, Leonardo G. Faé, Renato B. Hoffmann, Lucas S. Bianchessi, Dalvan Griebler
TL;DR
This work introduces NPB-Rust, a Rust port of the NAS Parallel Benchmarks, and analyzes Rust's ability to express and execute arithmetic-intensive, CFD-inspired workloads. It develops both a sequential Rust port and a Rayon-based data-parallel version, and benchmarks them against Fortran and C++ (OpenMP) implementations to assess performance, scalability, and programmability. The study finds that Rust's sequential performance is close to Fortran's and better than C++'s, while Rayon generally lags behind OpenMP in parallel performance, though it offers memory-safety benefits and different scheduling characteristics. These results establish Rust as a viable host for HPC benchmarks, while highlighting areas where alternative parallelism strategies could yield further gains in complex kernels. The NPB-Rust suite provides a foundation for future Rust-based HPC evaluations across libraries, architectures, and parallel paradigms.
Abstract
Parallel programming often requires developers to handle complex computational tasks, which makes the development cycle error-prone. Rust is a performant low-level language whose compiler promises memory-safety guarantees, making it an attractive option for HPC application developers. We identified that the Rust ecosystem could benefit from more comprehensive scientific benchmark suites for standardizing comparisons and research. The NAS Parallel Benchmarks (NPB) is a standardized suite for evaluating various hardware aspects and is often used to compare different frameworks for parallelism. Therefore, our contributions are a Rust version of NPB and an analysis of the expressiveness and performance of Rust's language features and parallelization strategies. We compare our implementation with established sequential and parallel versions of NPB. Experimental results show that Rust's sequential version is 1.23% slower than Fortran and 5.59% faster than C++, while Rust with Rayon is slower than both Fortran and C++ with OpenMP.
