Distributed Semi-Speculative Parallel Anisotropic Mesh Adaptation

Kevin Garner, Polykarpos Thomadakis, Nikos Chrisochoides

TL;DR

The presented method is shown to generate meshes with comparable quality and performance to existing state-of-the-art HPC meshing software.

Abstract

This paper presents a distributed memory method for anisotropic mesh adaptation that is designed to avoid collective communication and global synchronization. The method separates meshing functionality from performance aspects by using a separate entity for each: a multicore cc-NUMA-based (shared memory) mesh generation software and a parallel runtime system designed to help applications leverage the concurrency offered by emerging high-performance computing (HPC) architectures. First, an initial mesh is decomposed and its interface elements (subdomain boundaries) are adapted on a single multicore node (shared memory). Subdomains are then distributed among the nodes of an HPC cluster so that their interior elements are adapted while the interface elements (already adapted) remain frozen to maintain mesh conformity. Lessons are presented on redesigning parts of the shared memory software and on how its speculative execution model is utilized by the distributed memory method to achieve good performance. The presented method is shown to generate meshes (of up to approximately 1 billion elements) with comparable quality and performance to existing state-of-the-art HPC meshing software.
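
The two-phase workflow described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the function names (`interface_elements`, `adapt_interior`, `two_phase_adapt`), the dictionary-based mesh representation, and the use of a thread pool as a stand-in for cluster nodes are all assumptions made for illustration, and "adapting" an element is reduced to recording its id.

```python
from concurrent.futures import ThreadPoolExecutor


def interface_elements(owner, neighbors):
    """Elements with at least one neighbor owned by a different subdomain."""
    return {e for e, nbrs in neighbors.items()
            if any(owner[n] != owner[e] for n in nbrs)}


def adapt_interior(subdomain, owner, frozen):
    """Hypothetical stand-in for per-subdomain adaptation.

    Returns the elements of `subdomain` that would be adapted; frozen
    interface elements are skipped and therefore never modified.
    """
    return {e for e, s in owner.items() if s == subdomain and e not in frozen}


def two_phase_adapt(owner, neighbors, workers=2):
    # Phase 1: interface elements are adapted together, in one place
    # (the single shared-memory node in the abstract), then frozen.
    frozen = interface_elements(owner, neighbors)
    adapted = set(frozen)

    # Phase 2: each subdomain adapts its interior independently. Because
    # the interface is frozen, the workers never touch shared elements.
    subdomains = sorted(set(owner.values()))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(lambda s: adapt_interior(s, owner, frozen),
                               subdomains):
            adapted |= result
    return frozen, adapted


# Toy mesh: a chain of six elements 0-1-2-3-4-5 split into two subdomains.
owner = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
frozen, adapted = two_phase_adapt(owner, neighbors)
```

In this toy mesh, elements 2 and 3 straddle the subdomain boundary, so they form the frozen interface, and the remaining elements are handled per subdomain without any exchange between workers.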

Paper Structure (28 sections, 7 equations, 19 figures, 12 tables, 5 algorithms)

Figures (19)

  • Figure 1: The CDT3D pipeline of operations utilized within the presented method for anisotropic mesh adaptation is shown. By default, Vertex Smoothing (dotted) is not utilized during mesh adaptation but is available to potentially further improve mesh quality.
  • Figure 2: Shown is a visual example of a two-subdomain decomposed delta wing geometry as it is processed by the distributed memory CDT3D method.
  • Figure 3: Shown is a 2D example of constrained elements that cannot undergo point creation/insertion because they are adjacent to pseudo-inactive elements.
  • Figure 4: Shown is a 2D example of an edge and points that would not be allowed to undergo removal or smoothing due to preemptively locked vertices.
  • Figure 5: Shown is a 2D example of elements that cannot undergo flips if they are pseudo-inactive or defined by preemptively locked vertices.
  • ...and 14 more figures
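
The restrictions described in the captions of Figures 3 to 5 can be read as eligibility checks on mesh operations. The sketch below is one possible reading, not the CDT3D code: all function names and the set/dictionary data layout are hypothetical, and it assumes that a single locked vertex is enough to block an operation and that pseudo-inactive elements themselves are also excluded from point insertion.

```python
def can_insert_point(elem, neighbors, pseudo_inactive):
    """Figure 3: no point insertion in elements adjacent to pseudo-inactive ones.

    Assumption: pseudo-inactive elements themselves are excluded as well.
    """
    if elem in pseudo_inactive:
        return False
    return not any(n in pseudo_inactive for n in neighbors[elem])


def can_smooth_vertex(vertex, locked):
    """Figure 4: a preemptively locked vertex may not be smoothed."""
    return vertex not in locked


def can_remove_edge(edge, locked):
    """Figure 4: an edge touching a locked vertex may not be removed.

    Assumption: one locked endpoint is sufficient to block removal.
    """
    return not any(v in locked for v in edge)


def can_flip(elem, elem_vertices, pseudo_inactive, locked):
    """Figure 5: no flips on pseudo-inactive elements or on elements
    defined by locked vertices (assumed: any locked vertex blocks the flip)."""
    if elem in pseudo_inactive:
        return False
    return not any(v in locked for v in elem_vertices[elem])


# Toy 2D mesh: three triangles in a strip; t2 lies on the frozen interface.
elem_vertices = {"t0": (0, 1, 2), "t1": (1, 2, 3), "t2": (2, 3, 4)}
neighbors = {"t0": ["t1"], "t1": ["t0", "t2"], "t2": ["t1"]}
pseudo_inactive = {"t2"}
locked = {3, 4}
```

With this data, `t0` remains free for every operation, while `t1` is blocked from point insertion (it neighbors `t2`) and from flips (it uses locked vertex 3).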