Using Exascale Computing to Explain the Delicate Balance of Nuclear Forces in the Universe

M. A. Clark, A. Hanlon, D. Howarth, B. Joo, S. Krieg, D. McDougall, A. Meyer, H. Monge-Camacho, C. Morningstar, S. Park, F. Romero-López, P. M. Vranas, A. Walker-Loud

TL;DR

This work addresses how the delicate balance of nuclear forces emerges from QCD by targeting the deuteron binding energy and its cosmological implications for Big Bang Nucleosynthesis. It introduces a GPU-accelerated lattice QCD workflow built on Laplacian-Heaviside (LapH) smearing and the QUDA library, augmented with a batched Lanczos eigensolver and batched linear solves to enable two-nucleon calculations near physical quark masses on exascale hardware. The key contributions are the first GPU-enabled LapH implementation for two-nucleon studies, the quda_laph code for end-to-end workflow management, and a scalable, memory-efficient eigenvector and projection strategy that yields speedups of up to approximately 240× over CPU baselines. The results demonstrate near-ideal weak scaling on multiple exascale-class systems and establish a practical pathway to connect QCD predictions with nuclear phenomenology, improving precision tests of the Standard Model and informing cosmological questions about the universe's hydrogen abundance.

Abstract

The vast majority of visible matter in our universe comes from protons and neutrons (the nucleons). Nucleon interactions are fundamental to how the universe developed after the Big Bang and govern all nuclear phenomena. The subtle balance in how two nucleons interact shapes the universe's hydrogen content that is central to our existence. Our objective is to compute the interaction strength while varying the parameters of nature to understand how delicate this balance is. We developed a new code using sophisticated physics algorithms and a highly optimized library for simulations on CPU-GPU parallel architectures. It has excellent weak scaling and impressive linear scaling for a fixed problem size with increasing number of nodes up to El Capitan's full $\sim$11,000 nodes. On Alps, El Capitan, Frontier, Jupiter, and Perlmutter supercomputers we achieve a maximum disruptive speed-up of $\sim$240 times the previous state-of-the-art, signaling a new era of supercomputing.

Paper Structure

This paper contains 14 sections, 3 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: We present the wall-clock time of full production calculations in a weak scaling study, including I/O, on El Capitan. The black dashed vertical line indicates 10,752 nodes.
  • Figure 2: We present the wall-clock time of full production calculations in a weak scaling study, including I/O, on Alps utilizing the batched linear solver optimizations.
  • Figure 3: "Resource scaling" of the global problem on El Capitan. This includes the full resource cost, such as I/O and job scheduler overhead. The black dashed vertical line indicates 10,752 nodes.
  • Figure 4: Strong scaling analysis (including I/O), indicating that our choice of a sub-job size of 8 nodes is optimal.
  • Figure 5: Comparison of the previous state-of-the-art (red) with some of today's exascale-class supercomputers (blue/green). For our code we measure up to a factor of $\sim$240 speedup. Note that the node-hour measurements are not suitable as a direct comparative benchmark between the new supercomputers due to variations in the job parameters from one machine to another, such as the number of nodes and/or ranks used.
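
The scaling terms used in the captions above (weak scaling, strong scaling, node-hour speedup) can be made concrete with a small sketch. The definitions below are the standard textbook ones and are assumed here rather than taken from the paper; the function names are hypothetical, and every timing value is an illustrative placeholder, not a measurement from any of the machines listed.

```python
def weak_scaling_efficiency(t_base, t_n):
    """Work per node is held fixed, so the ideal wall-clock time is constant.

    Returns t_base / t_n; 1.0 means ideal weak scaling.
    """
    return t_base / t_n


def strong_scaling_efficiency(n_base, t_base, n, t_n):
    """Total work is held fixed, so the ideal wall-clock time falls as 1/n.

    Returns (n_base * t_base) / (n * t_n); 1.0 means ideal strong scaling.
    """
    return (n_base * t_base) / (n * t_n)


def node_hour_speedup(node_hours_old, node_hours_new):
    """Ratio of resource cost (nodes x hours) for the same amount of work."""
    return node_hours_old / node_hours_new


# Placeholder wall-clock times in seconds, keyed by node count.
# These numbers are invented for illustration only.
weak_times = {8: 100.0, 64: 102.0, 512: 105.0}
strong_times = {4: 210.0, 8: 108.0, 16: 70.0}

weak_base = weak_times[8]
for nodes, seconds in sorted(weak_times.items()):
    eff = weak_scaling_efficiency(weak_base, seconds)
    print(f"weak   {nodes:4d} nodes: efficiency {eff:.2f}")

strong_base_nodes = 4
strong_base_time = strong_times[strong_base_nodes]
for nodes, seconds in sorted(strong_times.items()):
    eff = strong_scaling_efficiency(strong_base_nodes, strong_base_time, nodes, seconds)
    print(f"strong {nodes:4d} nodes: efficiency {eff:.2f}")
```

Under these definitions, a flat wall-clock curve in a weak scaling plot corresponds to an efficiency near 1.0, and a strong scaling efficiency that drops as nodes are added is the usual sign that a smaller sub-job size makes better use of the machine. As the Figure 5 caption notes, node-hour ratios are only comparable when the job parameters match.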