Tuning of Vectorization Parameters for Molecular Dynamics Simulations in AutoPas

Luis Gall, Samuel James Newcome, Fabio Alexander Gratl, Markus Mühlhäußer, Manish Kumar Mishra, Hans-Joachim Bungartz

TL;DR

This work studies how the order of loading particle data into SIMD registers affects force calculations in AutoPas and extends AutoPas' runtime auto-tuning to select the vectorization order that performs best, in terms of execution time or energy consumption, under changing simulation conditions. By examining multiple neighbor identification algorithms, traversals, and data layouts, the authors show that the optimal vectorization pattern can vary during a simulation and across workloads. Benchmark results show speedups and energy differences that depend on cutoffs, cluster sizes, and the use of Newton's third law, which motivates dynamic tuning for performance portability. The findings indicate that runtime vectorization tuning can improve time-to-solution and energy efficiency in MD simulations across diverse architectures and scenarios.

Abstract

Molecular Dynamics simulations help scientists gain valuable insights into physical processes on an atomic scale. This work explores various SIMD vectorization techniques to improve the pairwise force calculation between molecules within the particle simulation library AutoPas. The focus lies on the order in which particle values are loaded into vector registers to achieve the best performance in terms of execution time or energy consumption. Because previous work indicates that the optimal MD algorithm can change during runtime, this paper investigates simulation-specific parameters such as particle density, as well as the impact of the neighbor identification algorithms, which distinguishes this work from related projects. Furthermore, AutoPas' dynamic tuning mechanism is extended to choose the optimal vectorization order during runtime. The benchmarks show that considering different particle interaction orders during runtime can considerably improve the performance of the force calculation compared to AutoPas' previous approach.

Paper Structure

This paper contains 25 sections, 1 equation, 8 figures, 1 algorithm.

Figures (8)

  • Figure 1: Visualization of the chosen neighbor identification algorithms. The blue circle highlights the cutoff distance. The blue and orange areas highlight the search space in which neighboring particles, also shown in orange, are considered.
  • Figure 2: The orange cell is the cell currently processed by the domain traversal. The blue area and the black arrows indicate the pairwise interactions between particles in the base cell and particles in its neighbor cells [Gratl_Autopas_2022].
  • Figure 3: Force calculation control flow of AutoPas. The particle containers carry out the neighbor identification and decide on the data structures for storing the particles. The (parallel) domain traversals pass the concrete particle lists to the force functor that calculates the pairwise forces.
  • Figure 4: Vector register allocation for the different vectorization patterns for an 8-way SIMD machine. The upper blue registers represent the allocation of the i-particles obtained by taking particles from the outer loop. The lower orange registers visualize the allocation of the j-particles taken from the inner loop.
  • Figure 5: Comparisons of several parameters that affect the choice of the optimal vectorization pattern.
  • ...and 3 more figures
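
The register allocation described in the Figure 4 caption can be illustrated with a small sketch. It assumes that a vectorization pattern is a pair (num_i, num_j) with num_i * num_j equal to the SIMD width, that each i-particle is repeated across num_j adjacent lanes, and that a block of num_j j-particles is repeated num_i times. The function names and this lane layout are hypothetical and chosen for illustration only; they are not taken from AutoPas.

```python
from math import ceil


def register_lanes(num_i, num_j):
    """Per-lane particle offsets for one (i-register, j-register) pair.

    Assumed layout (illustrative, not the AutoPas implementation):
    lane k holds i-offset k // num_j and j-offset k % num_j.
    """
    width = num_i * num_j
    i_lanes = [lane // num_j for lane in range(width)]
    j_lanes = [lane % num_j for lane in range(width)]
    return i_lanes, j_lanes


def covered_pairs(n_i, n_j, num_i, num_j):
    """List every (i, j) pair touched when tiling n_i x n_j interactions."""
    i_lanes, j_lanes = register_lanes(num_i, num_j)
    pairs = []
    for i0 in range(0, n_i, num_i):          # outer loop over i-particles
        for j0 in range(0, n_j, num_j):      # inner loop over j-particles
            for li, lj in zip(i_lanes, j_lanes):
                i, j = i0 + li, j0 + lj
                if i < n_i and j < n_j:      # out-of-range lanes are masked
                    pairs.append((i, j))
    return pairs


def lane_utilization(n_i, n_j, num_i, num_j):
    """Fraction of SIMD lanes that do useful work for the given list sizes."""
    tiles = ceil(n_i / num_i) * ceil(n_j / num_j)
    return (n_i * n_j) / (tiles * num_i * num_j)


if __name__ == "__main__":
    # All four factorizations of an 8-way SIMD register.
    for num_i, num_j in [(1, 8), (2, 4), (4, 2), (8, 1)]:
        i_reg, j_reg = register_lanes(num_i, num_j)
        print(f"{num_i}x{num_j}: i={i_reg} j={j_reg} "
              f"utilization(3 x 5)={lane_utilization(3, 5, num_i, num_j):.3f}")
```

Under these assumptions every pattern covers each particle pair exactly once, but the share of masked lanes depends on how many i- and j-particles are available. This is one plausible reason why the best pattern could change with particle density or cluster size.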