On Advanced Monte Carlo Methods for Linear Algebra on Advanced Accelerator Architectures

Anton Lebedev; Vassil Alexandrov

Abstract

In this paper we present computational experiments with the Markov Chain Monte Carlo Matrix Inversion ($(\text{MC})^2\text{MI}$) method on several accelerator architectures and investigate their impact on the performance and scalability of the method. The method is used as a preconditioner, and the corresponding system of linear equations is solved with iterative methods such as generalized minimal residuals (GMRES) or the stabilized bi-conjugate gradient method (BiCGstab). Numerical experiments are carried out to highlight the benefits and deficiencies of both approaches and to assess their overall usefulness in light of the scalability of the method.

Paper Structure (33 sections, 4 equations, 8 figures, 1 table)

Introduction
Related Work
Using SParse Approximate Inverse as Preconditioner (SPAI)
Monte Carlo Approach
Algorithm
Parallelization details and issues
MPI implementation
GPU implementation
Algorithmic Modifications
Matrix Reduction
Implementation Specifics - MPI
Implementation Specifics - GPU
Numerical Experiments
Execution Environment
Fitness of purpose
...and 18 more sections

Figures (8)

Figure 1: Total execution time for rdb2048_noL with $7.5\%$ of the value range of the entries removed.
Figure 2: Total execution time for nonsym_r3_a11 with $7.5\%$ of the value range of the entries removed.
Figure 3: Execution time of the preconditioner computation.
Figure 4: Total execution time for nonsym_r3_a11 when using GMRES with a termination condition $\tfrac{\Vert r\Vert_2}{\Vert b\Vert_2}\leq 10^{-6}$, a precision of $\epsilon = 0.0625$ for the computation of the preconditioner, and removal of the $2.5\%$ of the entries smallest in magnitude.
Figure 5: Execution time of the preconditioner computation for different matrices. Transition probabilities were computed by the master process and broadcast to the workers.
...and 3 more figures
