A Systematic Literature Survey of Sparse Matrix-Vector Multiplication
Jianhua Gao, Bingjie Liu, Weixing Ji, Hua Huang
TL;DR
Sparse matrix-vector multiplication (SpMV, $\mathbf{y} = \mathbf{A} \mathbf{x}$) is a foundational kernel in scientific computing and graph analytics. The paper surveys compression formats, classical and learning-based optimization, mixed-precision strategies, and architecture-aware implementations across CPU, GPU, FPGA, processing-in-memory (PIM), and distributed systems, supported by a broad performance evaluation. Key contributions include a taxonomy of storage formats (CSR, HYB, CSR5, etc.), a review of auto-tuning and ML-based format and parameter selection, and insights into preprocessing and communication overheads, with guidance on future research directions. The work offers a practical roadmap for selecting and designing SpMV solutions tailored to matrix structure and target hardware, and it highlights persistent challenges in irregular sparsity and cross-device scalability.
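To make the kernel concrete, the following is a minimal sketch of SpMV over the CSR format in plain Python. It is not taken from the survey: the function name `csr_spmv` and the array names `row_ptr`, `col_idx`, and `values` are assumptions chosen for illustration, and the loop is an unoptimized baseline rather than any of the tuned implementations the paper evaluates.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A x for a matrix A stored in CSR form (illustrative sketch).

    row_ptr[i] .. row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Walk only the stored nonzeros of row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect read of x through the column index.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 example matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)
```

The read `x[col_idx[k]]` depends on the sparsity pattern, so its memory access is irregular, and rows with different nonzero counts do different amounts of work. Both effects relate to the irregular-sparsity challenge mentioned above.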
Abstract
Sparse matrix-vector multiplication (SpMV) is a crucial computing kernel with widespread applications in iterative algorithms. Over the past decades, research on SpMV optimization has made remarkable strides, giving rise to various optimization contributions. However, a comprehensive and systematic literature survey that introduces, analyzes, discusses, and summarizes the advancements of SpMV in recent years is currently lacking. Aiming to fill this gap, this paper compares existing techniques and analyzes their strengths and weaknesses. We begin by highlighting two representative applications of SpMV, then conduct an in-depth overview of the important techniques that optimize SpMV on modern architectures, which we classify as classic, auto-tuning, machine learning, and mixed-precision-based optimization. We also elaborate on the hardware architectures, including CPU, GPU, FPGA, processing-in-memory (PIM), heterogeneous, and distributed platforms. We present a comprehensive experimental evaluation that compares the performance of state-of-the-art SpMV implementations. Based on our findings, we identify several challenges and point out future research directions. This survey is intended to give researchers a comprehensive understanding of SpMV optimization on modern architectures and to provide guidance for future work.
