Nyx-RT: Adaptive Ray Tracing in the Nyx Hydrodynamical Code

Nathan X. Marshak, Kathlynn Simotas, Zarija Lukić, Hyunbae Park, James Ahrens, Chris R. Johnson

TL;DR

This work addresses the challenge of accurately modeling inhomogeneous reionization within cosmological simulations by integrating a GPU-accelerated, adaptive ray tracing radiative transfer (RT) scheme into the Nyx hydrodynamics code via AMReX abstractions. The method combines adaptive ray tracing with a novel forward source-merging algorithm and a photon-conserving geometric overlap correction, enabling self-consistent radiation-hydrodynamic evolution at exascale. Key contributions include a portable, filter-based RT implementation, robust handling of low-density neighbor cells, and production-scale demonstrations (up to 4096 GPUs on a $4096^3$ grid) that show convergence in reionization history and Ly-$\alpha$ forest flux. The significance lies in delivering physically faithful RT within large cosmological volumes at high resolution, facilitating systematic studies of reionization with realistic source populations and feedback in a scalable, architecture-agnostic framework.

Abstract

Numerical methods for radiative transfer play a key role in modern-day astrophysics and cosmology, including the study of the inhomogeneous reionization process. In this context, ray tracing methods are well-regarded for accuracy but notorious for high computational cost. In this work, we extend the capabilities of the Nyx N-body / hydrodynamics code, coupling radiation to gravitational and gas dynamics. We formulate adaptive ray tracing as a novel series of filters and transformations that can be used with AMReX particle abstractions, simplifying implementation and enabling portability across Exascale GPU architectures. To address computational cost, we present a new algorithm for merging sources, which significantly accelerates computation once reionization is well underway. Furthermore, we develop a novel prescription for geometric overlap correction with low-density neighbor cells. We perform verification and validation against standard analytic and numerical test problems. Finally, we demonstrate scaling up to 1024 nodes and 4096 GPUs running multiphysics cosmological simulations, with $4096^3$ Eulerian gas cells, $4096^3$ dark matter particles, and ray tracing on a $1024^3$ coarse grid. For these full cosmological simulations, we demonstrate convergence in terms of reionization history and post-ionization Lyman-alpha forest flux.

Paper Structure

This paper contains 31 sections, 17 equations, 12 figures, 4 tables, 3 algorithms.

Figures (12)

  • Figure 1: Schematic diagram of our adaptive ray tracing implementation (Sec. \ref{sec:flux_comp}, Alg. \ref{alg:trace_rays}), ignoring domain boundaries, box boundaries and early ray termination. (a) At the source position, initialize 12 particles, each corresponding to a pixel on the HEALPix sphere at the 0th refinement level. (b) March and deposit flux until at least one stopping condition is satisfied. In this example, marching stops when rays reach the splitting radius $r_{\textrm{split}}$. (c) Filter particles in order to find the subset that is marked for splitting. Split each parent ray into four children, each representing a pixel at the next HEALPix refinement level on a sphere of radius $r_{\textrm{split}}$.
  • Figure 2: Example ray traversal (Sec. \ref{sec:flux_comp}, Alg. \ref{alg:trace_rays}) that illustrates handling of subvolume boundaries and periodic BCs. (a) Rays are initialized as in Fig. \ref{fig:raysplit}. (b) Raymarch until either a boundary is reached or the splitting criterion is satisfied. In the latter case, mark the ray for splitting. (c) Split the particles that are marked for splitting. (d) Call Redistribute() in order to send rays that have crossed a subvolume boundary to the correct MPI rank. (e) Raymarch again, as in (b). (f) Split rays again. In this example, one of the newly created rays starts inside the subvolume owned by Rank 3, even though its parent was on Rank 1. (g) Call Redistribute() again. This time, in addition to handling rays that have crossed a subvolume boundary, rays that have crossed a domain boundary wrap around as a result of periodic BCs.
  • Figure 3: Comparison of "Test 4" results (Sec. \ref{sec:realisticIC}) given different solutions to the low $n_H$ edge case described in Sec. \ref{sec:noise_fix_method}.
  • Figure 4: Different approaches to handling the low $n_H$ edge case (Sec. \ref{sec:noise_fix_method}), illustrated with a Stromgren sphere test.
  • Figure 5: Illustration of a single timestep in our coupled simulation pipeline (Sec. \ref{sec:coupled_method}).
  • ...and 7 more figures
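
The initialize / march / filter / split cycle described in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration, not the Nyx-RT implementation: the `Ray` class and all function names are hypothetical, HEALPix NESTED ordering is assumed (so that the children of pixel `p` are `4p .. 4p+3` at the next level), and the splitting radius uses an assumed area-based criterion (split once a pixel's area on the sphere exceeds the cell face area divided by a hypothetical `rays_per_cell` parameter), which the text above does not confirm. Marching is replaced by a stand-in that simply advances every ray to its splitting radius.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Ray:
    """Hypothetical ray particle; not an AMReX or Nyx type."""
    level: int           # HEALPix refinement level (0 = the 12 base pixels)
    pixel: int           # pixel index at this level, NESTED ordering assumed
    radius: float        # distance travelled from the source
    split: bool = False  # True once the ray is marked for splitting


def npix(level):
    """Number of HEALPix pixels at a refinement level: 12 * 4**level."""
    return 12 * 4 ** level


def init_rays():
    """Step (a): one ray per level-0 HEALPix pixel at the source."""
    return [Ray(level=0, pixel=p, radius=0.0) for p in range(npix(0))]


def split_radius(level, dx, rays_per_cell):
    """Assumed criterion: radius at which the pixel area 4*pi*r**2 / npix
    equals dx**2 / rays_per_cell (dx is the cell width)."""
    return dx * math.sqrt(npix(level) / (4.0 * math.pi * rays_per_cell))


def march(rays, dx, rays_per_cell):
    """Step (b), stand-in only: advance each ray to its splitting radius
    and mark it. Flux deposition and other stopping conditions are omitted."""
    return [
        Ray(r.level, r.pixel, split_radius(r.level, dx, rays_per_cell), True)
        for r in rays
    ]


def split_marked(rays):
    """Step (c): filter the marked rays and replace each with four children
    at the next refinement level, starting from the parent's radius."""
    unmarked = [r for r in rays if not r.split]
    children = [
        Ray(r.level + 1, 4 * r.pixel + k, r.radius)
        for r in rays if r.split
        for k in range(4)
    ]
    return unmarked + children


rays = init_rays()
rays = march(rays, dx=1.0, rays_per_cell=3.0)
rays = split_marked(rays)
print(len(rays), {r.level for r in rays})
```

Under this assumed criterion the splitting radius doubles with each refinement level, because the pixel count grows by a factor of four while the pixel area grows as $r^2$.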