Fast Entropy Decoding for Sparse MVM on GPUs

Emil Schätzle; Tommaso Pegolotti; Markus Püschel

TL;DR

dtANS is a new lossless compression method that adapts the entropy coding technique of asymmetric numeral systems (ANS) for fast parallel decoding on GPUs in tandem with SpMVM. It is applied to the widely used CSR format.

Abstract

We present a novel, practical approach to speed up sparse matrix-vector multiplication (SpMVM) on GPUs. The key idea is to apply lossless entropy coding to further compress a sparse matrix that is stored in one of the commonly supported formats. Our method is based on dtANS, our new lossless compression method that improves the entropy coding technique of asymmetric numeral systems (ANS) specifically for fast parallel GPU decoding when used in tandem with SpMVM. We apply dtANS to the widely used CSR format and present extensive benchmarks on the SuiteSparse collection of matrices against the state-of-the-art cuSPARSE library. On matrices with at least 2^15 entries and at least 10 entries per row on average, our compression reduces the matrix size relative to the smallest of the cuSPARSE formats (CSR, COO, and SELL) in almost all cases, by up to a factor of 11.77. Further, we achieve an SpMVM speedup for the majority of matrices with at least 2^25 nonzero entries; the best speedup is 3.48x. We also show that we can improve over the AI-based multi-format AlphaSparse, in an experiment that is limited in scope because of AlphaSparse's extreme computation overhead. We provide our code as an open-source C++/CUDA header library, which includes both compression and multiplication kernels.
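
To make the compression idea concrete, here is a minimal CPU-side sketch in C++. It is not the authors' dtANS implementation and contains no ANS coding or GPU kernel. It only illustrates two ingredients named in the paper's outline: delta-encoding, and entropy as a measure of compressibility. The `Csr` struct and the functions `delta_encode_columns`, `entropy_bits`, and `spmv_delta` are hypothetical names chosen for this example. Applying the deltas to column indices within each row is an assumption about how delta-encoding could be used on CSR, not a detail taken from the paper.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical CSR container (illustrative only, not the authors' data layout).
struct Csr {
    std::vector<int> row_ptr;    // size = rows + 1, offsets into col_idx / val
    std::vector<int> col_idx;    // column index of each nonzero, sorted within a row
    std::vector<double> val;     // value of each nonzero
};

// Assumed delta scheme: within each row, store the difference to the previous
// column index; the first index of a row is stored relative to 0.
std::vector<int> delta_encode_columns(const Csr& a) {
    assert(!a.row_ptr.empty());
    std::vector<int> d(a.col_idx.size());
    for (std::size_t r = 0; r + 1 < a.row_ptr.size(); ++r) {
        int prev = 0;
        for (int k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
            d[k] = a.col_idx[k] - prev;
            prev = a.col_idx[k];
        }
    }
    return d;
}

// Empirical Shannon entropy in bits per symbol: a lower bound on the average
// code length an entropy coder can reach for this symbol distribution.
double entropy_bits(const std::vector<int>& s) {
    std::map<int, std::size_t> count;
    for (int x : s) ++count[x];
    double h = 0.0;
    for (const auto& kv : count) {
        const double p = static_cast<double>(kv.second) / s.size();
        h -= p * std::log2(p);
    }
    return h;
}

// y = A * x, reconstructing column indices from the deltas on the fly,
// so that decoding is fused with the multiplication.
std::vector<double> spmv_delta(const Csr& a, const std::vector<int>& d,
                               const std::vector<double>& x) {
    assert(!a.row_ptr.empty());
    assert(d.size() == a.col_idx.size());
    std::vector<double> y(a.row_ptr.size() - 1, 0.0);
    for (std::size_t r = 0; r < y.size(); ++r) {
        int col = 0;
        for (int k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
            col += d[k];                 // undo the delta
            y[r] += a.val[k] * x[col];
        }
    }
    return y;
}
```

For a banded matrix the deltas repeat far more often than the raw column indices, so `entropy_bits` of the delta stream is typically lower than that of `col_idx`. That gap is what an entropy coder can then exploit.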

Paper Structure (25 sections, 12 equations, 9 figures, 3 tables, 3 algorithms)

Introduction
Overview
Encoding
Decoding and SpMVM
Background
Sparse Matrix-Vector Multiplication
Entropy Coding
Mixed radix numeral systems
Encoding with tANS
Decoding with tANS
SpMVM with dtANS
Delta-encoding
CSR-dtANS
Table construction in dtANS
Decoding in dtANS
...and 10 more sections

Figures (9)

Figure 1: Encoding into CSR-dtANS (left) and SpMVM on CSR-dtANS (right)
Figure 2: Example for the CSR format with 6 nonzeros
Figure 3: Example of entropy encoding with tANS
Figure 4: Entropy reduction via delta-encoding for three random graph models with increasing number of nodes. Model parameters are chosen to keep the average degree at 5, 10, and 20.
Figure 5: Mapping of data to threads and dtANS representation
...and 4 more figures
