Evaluation of POSIT Arithmetic with Accelerators
Naohito Nakasato, Yuki Murakami, Fumiya Kono, Maho Nakata
TL;DR
This work evaluates 32-bit POSIT arithmetic in the Posit($32$, $2$) format, implemented as accelerators on FPGAs and GPUs for dense linear algebra. By extending MPLAPACK to Posit($32$, $2$) and implementing FLO-Posit-based FPGA cores alongside ported GPU kernels, the study shows that POSIT yields modest accuracy gains in a favorable input regime and that the accelerators substantially speed up GEMM and matrix decompositions. Posit($32$, $2$) provides about $0.5$–$1.0$ more digits of accuracy than binary32 in the golden zone, where the input matrix elements are close to $1$ in magnitude. LU and Cholesky decompositions both benefit from acceleration, though performance and power characteristics vary across platforms. The work clarifies the platform-dependent trade-offs between FPGAs and GPUs for POSIT-based linear algebra and informs future directions for dedicated POSIT hardware and broader arithmetic formats.
Abstract
We present an evaluation of 32-bit POSIT arithmetic through its implementation as accelerators on FPGAs and GPUs. POSIT is a floating-point number format that adaptively changes the size of its fractional part. We developed hardware designs for FPGAs and software for GPUs to accelerate linear algebra operations using Posit(32,2) arithmetic. Our FPGA- and GPU-based accelerators significantly sped up the Cholesky and LU decomposition algorithms for dense matrices in Posit(32,2) arithmetic. In terms of numerical accuracy, Posit(32,2) arithmetic is approximately 0.5–1.0 digits more accurate than the standard 32-bit floating-point format, especially when the norm of the elements of the input matrix is close to 1. For power consumption, we observed that the power efficiency of the accelerators ranged from 0.043 to 0.076 Gflops/W for the LU decomposition in Posit(32,2) arithmetic. As accelerators for Posit(32,2) arithmetic, the latest GPUs are more power-efficient than the evaluated FPGA chip.
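To illustrate how the fractional part of a posit changes size, the following minimal Python sketch decodes a Posit(32,2) bit pattern and reports how many fraction bits it carries. It follows the standard posit layout (sign, variable-length regime, up to two exponent bits, then fraction). It is not taken from the authors' FPGA or GPU implementations, and the function name `decode_posit` is hypothetical.

```python
def decode_posit(bits: int, n: int = 32, es: int = 2):
    """Decode an n-bit posit bit pattern; return (value, fraction_bits).

    Illustrative sketch only (hypothetical helper, not the paper's code).
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0, 0
    if bits == 1 << (n - 1):
        return float("nan"), 0  # NaR (not a real)

    negative = (bits >> (n - 1)) == 1
    if negative:
        bits = (-bits) & mask  # negative posits are stored in two's complement

    body = bits & ((1 << (n - 1)) - 1)  # the n-1 bits after the sign bit

    # Regime: a run of identical bits, ended by the opposite bit (or the end).
    first = (body >> (n - 2)) & 1
    run, pos = 0, n - 2
    while pos >= 0 and ((body >> pos) & 1) == first:
        run += 1
        pos -= 1
    k = run - 1 if first == 1 else -run

    # Bits left after the regime and its terminating bit.
    remaining = max(pos, 0)
    tail = body & ((1 << remaining) - 1)

    # Up to `es` exponent bits; missing low exponent bits count as zero.
    exp_bits = min(es, remaining)
    exponent = (tail >> (remaining - exp_bits)) << (es - exp_bits)

    # Whatever is left is the fraction, so its width depends on the regime.
    frac_bits = remaining - exp_bits
    fraction = tail & ((1 << frac_bits) - 1)

    scale = k * (1 << es) + exponent
    value = (1 + fraction / (1 << frac_bits)) * 2.0 ** scale
    return (-value if negative else value), frac_bits


if __name__ == "__main__":
    for pattern in (0x40000000, 0x7C000000, 0x7FFFFFFF):
        value, frac_bits = decode_posit(pattern)
        print(f"0x{pattern:08X}: value={value!r}, fraction bits={frac_bits}")
```

In this sketch, the pattern for 1.0 carries 27 fraction bits, while the pattern for 2^16 carries only 23 and the largest posit carries none. The standard 32-bit floating-point format stores 23 fraction bits for every normal number, which is consistent with the accuracy advantage appearing for inputs near 1.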
