Gaining Cross-Platform Parallelism for HAL's Molecular Dynamics Package using SYCL

Viktor Skoblin, Felix Höfling, Steffen Christgau

TL;DR

This case study presents the successful migration of HAL's MD package from CUDA to SYCL and describes the strategies followed in porting the code.

Abstract

Molecular dynamics simulations are among the methods in scientific computing that benefit from GPU acceleration. For such devices, SYCL is a promising API for writing portable code. In this paper, we present a case study of "HAL's MD package", which has been successfully migrated from CUDA to SYCL. We describe the strategies that we followed in porting the code. Following these strategies, we achieved code portability across major GPU vendors. Depending on the kernel, we observe both significant performance improvements and regressions. As a side effect of the migration, we also obtained impressive speedups for execution on CPUs.

Paper Structure (15 sections, 2 figures)

This paper contains 15 sections, 2 figures.

Figures (2)

  • Figure 1: Signal-slot connections and data dependencies in a minimal MD simulation. Each box represents a module in HAL's MD package, small rectangles represent signals, and ellipses refer to class methods, which may serve as slot functions.
  • Figure 2: Performance data for all-to-all interaction simulation. (a) Speedups for the SYCL version on Intel Cascade Lake CPU compared to the sequential baseline and the SYCL version using a single thread, and parallel efficiency for the latter. (b) Performance of the SYCL and native version on different GPUs (higher number is better) for all-to-all interaction. (c) Slowdown of the SYCL version vs. CUDA for different simulation temperatures with truncated pair interaction.