Accelerating Garfield++ with CUDA

T. Neep, K. Nikolopoulos, M. Slater

TL;DR

This paper describes the acceleration of AvalancheMicroscopic, one of Garfield++'s most demanding algorithms, by porting it to graphics processing units (GPUs) using NVIDIA's CUDA framework. The port enables more efficient and detailed detector simulations.

Abstract

Garfield++ is extensively used within the gaseous detector community for comprehensive detector simulations, supporting the full experimental life cycle from design to operation and calibration. The emergence of micro-pattern gaseous detectors has necessitated computationally intensive microscopic avalanche simulations. This paper describes the acceleration of AvalancheMicroscopic, one of Garfield++'s most demanding algorithms, by porting it to graphics processing units (GPUs) using NVIDIA's CUDA framework. The modifications are integrated into the Garfield++ codebase and are accessible to end users with only minor adjustments to their existing code. Benchmark results demonstrate a substantial speed-up, especially for high-gain avalanches involving thousands of electrons, thereby enabling more efficient and detailed detector simulations.

Paper Structure

This paper contains 12 sections and 6 figures.

Figures (6)

  • Figure 1: An illustration of GPU thread occupancy in (a) the ideal case and (b) an example AvalancheMicroscopic case. Rows indicate the iteration number and each column represents a GPU thread. Squares are filled when work is being done and empty when the thread is idle. Solid arrows represent the continuation of work on a GPU thread and dashed arrows represent new electrons being created in the avalanche process. The filled octagon represents the termination of an electron's tracking.
  • Figure 2: A simplified outline of the AvalancheMicroscopic algorithm. The electron transport and stack processing stages of the algorithm are represented by the dashed and dotted boxes, respectively. The entire algorithm is repeated until there are no active electrons remaining.
  • Figure 3: An example of the flow of data from the CPU to the GPU. Data is loaded from disk into the Component class in CPU-accessible memory. The data is transferred to the ComponentGPU class when the TransportElectrons method of AvalancheMicroscopic is first run. At this point the sizes of the data structures are known.
  • Figure 4: (a) Number of electron endpoints for 5,000 single-electron events. (b) Time to run the single GEM example for different numbers of starting electrons (averaged over several runs). The small scatter in the points is due to fluctuations in the avalanche process.
  • Figure 5: Time to run the single GEM example.
  • ...and 1 more figure
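
The loop structure named in the Figure 1 and Figure 2 captions (an electron transport stage, then a stack processing stage, repeated until no active electrons remain) can be sketched roughly as follows. This is a toy illustration only, not Garfield++ code. The `Electron` fields, the `run_avalanche` function, and the rule that each secondary electron lives for two steps are all hypothetical, and Python is used for brevity rather than CUDA. The per-iteration occupancy list mimics how the number of busy GPU threads would vary as electrons are created and terminated.

```python
from dataclasses import dataclass


@dataclass
class Electron:
    steps_left: int        # transport steps before this electron stops (toy rule)
    ionisations_left: int  # secondaries it will still create (toy rule)


def run_avalanche(initial):
    """Toy loop: transport every active electron, then process the stack."""
    # Copy the inputs so the caller's objects are not modified.
    active = [Electron(e.steps_left, e.ionisations_left) for e in initial]
    total_tracked = len(active)
    iterations = 0
    occupancy = []  # active electrons (busy "threads") at each iteration

    while active:  # repeat until no active electrons remain
        occupancy.append(len(active))
        created = []

        # Electron transport stage: one step per active electron.
        # On a GPU, each active electron would map to one thread.
        for e in active:
            e.steps_left -= 1
            if e.ionisations_left > 0:
                e.ionisations_left -= 1
                # Hypothetical rule: each secondary lives for two steps.
                created.append(Electron(steps_left=2, ionisations_left=0))

        # Stack processing stage: drop finished electrons, append new ones.
        active = [e for e in active if e.steps_left > 0] + created
        total_tracked += len(created)
        iterations += 1

    return total_tracked, iterations, occupancy
```

Calling `run_avalanche([Electron(steps_left=3, ionisations_left=2)])` tracks three electrons over four iterations. The occupancy rises and then falls, which is the uneven thread usage that the Figure 1 caption contrasts with the ideal case.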