PSCToolkit: solving sparse linear systems with a large number of GPUs
Pasqua D'Ambra, Fabio Durastante, Salvatore Filippone
TL;DR
PSCToolkit is a GPU-accelerated toolkit for solving large sparse linear systems on HPC platforms, targeting symmetric positive-definite problems and scalability to thousands of GPUs. It combines three components (PSBLAS, AMG4PSBLAS, and the PSBLAS extensions) with GPU-aware memory management via mold variables and the Hacked ELLPACK storage format, so that Krylov solvers preconditioned by AMG run efficiently on the GPUs. In extensive experiments on EuroHPC Leonardo, the authors demonstrate weak and strong scaling, report iteration counts and solve times competitive with AMGX, and emphasize that low operator complexity is needed to sustain performance as GPU counts grow. Planned enhancements include OpenMP/OpenACC integration, polynomial smoothers, and broader hardware support, with the aim of improving portability and maintainability while preserving high performance.
Abstract
In this chapter, we describe the Parallel Sparse Computation Toolkit (PSCToolkit), a suite of libraries for solving large-scale linear algebra problems in an HPC environment. In particular, we focus on the tools provided for the solution of symmetric positive-definite linear systems using up to 8192 GPUs on the EuroHPC-JU Leonardo supercomputer. PSCToolkit is an ongoing mathematical software project aimed at exploiting the computational speed of current supercomputers for relevant problems in Computational and Data Science. The toolkit is designed for node-level efficiency, flexibility and usability, and supports integration with both Fortran and C/C++, so that researchers and developers from diverse computational backgrounds can use it.
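To make the setting concrete, the following is a minimal sketch of a preconditioned conjugate gradient (PCG) iteration, the standard Krylov method for symmetric positive-definite systems. It is not PSCToolkit code and does not use its API: the function names (`pcg`, `matvec_laplace_1d`, `jacobi`), the 1D Poisson test matrix, and the stopping tolerance are all assumptions chosen for illustration, and a simple Jacobi (diagonal) preconditioner stands in for the AMG preconditioners discussed in the chapter.

```python
# Illustrative sketch only: plain-Python PCG on a small SPD model problem.
# All names here are hypothetical and unrelated to the PSCToolkit API.

def matvec_laplace_1d(x):
    """Apply the SPD matrix tridiag(-1, 2, -1) to x without storing it."""
    n = len(x)
    y = [2.0 * xi for xi in x]
    for i in range(n):
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y


def dot(u, v):
    """Euclidean inner product of two equally sized lists."""
    return sum(a * b for a, b in zip(u, v))


def jacobi(r):
    """Diagonal preconditioner: the matrix diagonal is 2 everywhere.
    Assumption: this is a stand-in for an AMG preconditioner."""
    return [ri / 2.0 for ri in r]


def pcg(matvec, b, apply_prec, tol=1e-10, max_iter=1000):
    """Solve A x = b for SPD A, starting from x = 0.
    Stops when ||r|| <= tol * ||b||; returns (x, iterations used)."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for x = 0
    z = apply_prec(r)                # preconditioned residual
    p = list(z)                      # first search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for k in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        z = apply_prec(r)
        rz_new = dot(r, z)
        beta = rz_new / rz           # keeps directions A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


n = 50
b = [1.0] * n
x, iters = pcg(matvec_laplace_1d, b, jacobi)
residual = [bi - yi for bi, yi in zip(b, matvec_laplace_1d(x))]
```

In this sketch the only operations that touch the matrix are the matrix-vector product and the preconditioner application, so those two kernels, together with the vector updates and inner products, are what a GPU-enabled library has to accelerate and distribute.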
