SpComm3D: A Framework for Enabling Sparse Communication in 3D Sparse Kernels
Nabil Abubaker, Torsten Hoefler
TL;DR
SpComm3D tackles the scalability bottleneck of distributed-memory sparse kernels such as SDDMM and SpMM by enabling sparsity-aware communication in a 3D data-parallel setting. It introduces a general framework with three phases (PreComm, Compute, and PostComm) and three practical zero-copy communication schemes (SpC-BB, SpC-SB/RB, SpC-NB), with data distributed via Dist2D/Dist3D. The paper provides a lambda-based analysis of sparsity-aware communication, builds 3D SDDMM and SpMM algorithms with SpComm3D, and demonstrates up to 20x improvement in communication, memory, and runtime on up to 1800 processors across real-world matrices. The work targets the scalability of large-scale sparse ML and scientific computing workloads, and the code is publicly available.
Abstract
Existing 3D algorithms for distributed-memory sparse kernels suffer from limited scalability due to their reliance on bulk sparsity-agnostic communication. While easier to use, sparsity-agnostic communication leads to unnecessary bandwidth and memory consumption. We present SpComm3D, a framework for enabling sparsity-aware communication with a minimal memory footprint, so that no unnecessary data is communicated or stored in memory. SpComm3D performs sparse communication efficiently with minimal or no communication buffers to further reduce memory consumption. SpComm3D detaches the local computation at each processor from the communication, allowing flexibility in choosing the best accelerated version for computation. We build 3D algorithms with SpComm3D for two important sparse ML kernels: Sampled Dense-Dense Matrix Multiplication (SDDMM) and Sparse Matrix-Matrix Multiplication (SpMM). Experimental evaluations on up to 1800 processors demonstrate that SpComm3D has superior scalability and outperforms state-of-the-art sparsity-agnostic methods with up to 20x improvement in terms of communication, memory, and runtime of SDDMM and SpMM. The code is available at: https://github.com/nfabubaker/SpComm3D
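For readers unfamiliar with the two kernels, the following single-process sketch illustrates what SDDMM and SpMM compute and why sparsity-aware communication can move less data. It is not taken from the SpComm3D code base: the function names (`sddmm`, `spmm`, `needed_rows`) and the coordinate-list layout are assumptions made for illustration, and SpMM is taken here in its common sparse-times-dense form.

```python
# Illustrative sketch only; all names are hypothetical and not part of SpComm3D.
# The sparse matrix S is a list of (row, col, value) nonzeros; A and B are
# dense matrices stored as lists of rows.

def sddmm(S, A, B):
    """SDDMM: for each nonzero S[i][j], scale it by dot(A[i], B[j])."""
    return [(i, j, v * sum(a * b for a, b in zip(A[i], B[j])))
            for i, j, v in S]

def spmm(S, B, n_rows):
    """SpMM: dense result C = S @ B, accumulated nonzero by nonzero."""
    k = len(B[0])
    C = [[0.0] * k for _ in range(n_rows)]
    for i, j, v in S:
        for c in range(k):
            C[i][c] += v * B[j][c]
    return C

def needed_rows(S_local):
    """Rows of B that the local nonzeros actually touch.
    A sparsity-aware scheme would need only these rows, whereas a
    sparsity-agnostic scheme would receive every row of B."""
    return sorted({j for _, j, _ in S_local})

S = [(0, 1, 2.0), (1, 3, 1.0)]             # two nonzeros of a 2x4 matrix
A = [[1.0, 2.0], [3.0, 4.0]]               # 2x2 dense
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]  # 4x2 dense

sddmm_out = sddmm(S, A, B)
spmm_out = spmm(S, B, n_rows=2)
rows = needed_rows(S)
```

In this toy input the local nonzeros touch only 2 of the 4 rows of `B`, so a sparsity-aware exchange would need half the rows that a bulk exchange would deliver. Real savings depend on the sparsity pattern and the data distribution.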
