GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations

Fabian Paischer, Gianluca Galletti, William Hornsby, Paul Setinek, Lorenzo Zanisi, Naomi Carey, Stanislas Pamela, Johannes Brandstetter

TL;DR

The paper tackles the computational bottleneck of simulating nonlinear gyrokinetic turbulence in fusion plasmas by introducing GyroSwin, a scalable 5D neural surrogate built on a multitask 5D Swin Transformer. It features 5D shifted window attention, latent 5D$\leftrightarrow$3D mixing via cross-attention and integrator modules, and a channelwise zonal-flow bias to capture essential nonlinear physics. The model is trained to predict the full 5D distribution function $f$ as well as the downstream quantities $\phi$ (electrostatic potential) and $Q$ (heat flux). GyroSwin outperforms reduced-order models on heat-flux and turbulence diagnostics, reproduces the energy cascade, and reduces computation by about three orders of magnitude. It scales to around $10^9$ parameters, and its zonal-flow dynamics can be verified. Limitations include modeling turbulence as a distributional process with potential error accumulation, excluding the linear phase, and restricting to the adiabatic-electron approximation; future work points to distributional modeling, linear-phase extension, and higher-fidelity electron models via transfer learning.

Abstract

Nuclear fusion plays a pivotal role in the quest for reliable and sustainable energy production. A major roadblock to viable fusion power is understanding plasma turbulence, which significantly impairs plasma confinement, and is vital for next-generation reactor design. Plasma turbulence is governed by the nonlinear gyrokinetic equation, which evolves a 5D distribution function over time. Due to its high computational cost, reduced-order models are often employed in practice to approximate turbulent transport of energy. However, they omit nonlinear effects unique to the full 5D dynamics. To tackle this, we introduce GyroSwin, the first scalable 5D neural surrogate that can model 5D nonlinear gyrokinetic simulations, thereby capturing the physical phenomena neglected by reduced models, while providing accurate estimates of turbulent heat transport. GyroSwin (i) extends hierarchical Vision Transformers to 5D, (ii) introduces cross-attention and integration modules for latent 3D$\leftrightarrow$5D interactions between electrostatic potential fields and the distribution function, and (iii) performs channelwise mode separation inspired by nonlinear physics. We demonstrate that GyroSwin outperforms widely used reduced numerics on heat flux prediction, captures the turbulent energy cascade, and reduces the cost of fully resolved nonlinear gyrokinetics by three orders of magnitude while remaining physically verifiable. GyroSwin shows promising scaling laws, tested up to one billion parameters, paving the way for scalable neural surrogates for gyrokinetic simulations of plasma turbulence.
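
To make item (i) concrete, the sketch below shows how the window partitioning and cyclic shift of Swin-style attention could carry over to five dimensions. It is a minimal NumPy illustration, not the authors' implementation: the function names, the channels-last layout, the window sizes, and the half-window shift are all assumptions, and the attention computation itself (including masking of wrapped-around windows) is left out.

```python
import numpy as np


def window_partition_5d(x, window):
    """Split a (D1, D2, D3, D4, D5, C) array into non-overlapping 5D windows.

    Returns an array of shape (num_windows, tokens_per_window, C).
    Layout and names are illustrative assumptions, not GyroSwin's API.
    """
    *dims, channels = x.shape
    if len(dims) != 5 or len(window) != 5:
        raise ValueError("expected five grid axes plus a channel axis")
    if any(d % w != 0 for d, w in zip(dims, window)):
        raise ValueError("each grid axis must be divisible by its window size")

    # Reshape every axis D_i into (D_i // w_i, w_i).
    split_shape = []
    for d, w in zip(dims, window):
        split_shape += [d // w, w]
    x = x.reshape(*split_shape, channels)

    # Move the five "which window" axes to the front, the five
    # "position inside window" axes after them, channels last.
    x = x.transpose(0, 2, 4, 6, 8, 1, 3, 5, 7, 9, 10)
    return x.reshape(-1, int(np.prod(window)), channels)


def cyclic_shift_5d(x, window):
    """Roll all five grid axes by half a window (assumed shift size)."""
    shifts = tuple(-(w // 2) for w in window)
    return np.roll(x, shift=shifts, axis=(0, 1, 2, 3, 4))


# Toy example: a 4^5 grid with 2 channels and 2^5 windows.
x = np.arange(4**5 * 2).reshape(4, 4, 4, 4, 4, 2)
windows = window_partition_5d(x, (2, 2, 2, 2, 2))
shifted_windows = window_partition_5d(cyclic_shift_5d(x, (2, 2, 2, 2, 2)), (2, 2, 2, 2, 2))
print(windows.shape)  # (32, 32, 2)
```

Attention would then run independently inside each window, with alternating layers using the shifted partition so that information can flow across window boundaries.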

Paper Structure

This paper contains 16 sections, 21 equations, 11 figures, 5 tables.

Figures (11)

  • Figure 1: Left: GyroSwin models the 5D distribution function of nonlinear gyrokinetics and incorporates integration blocks to predict 3D electrostatic potential fields and scalar heat flux. Right: ROMs (quasilinear) solve a Cartesian product of 2D modes in spectral space and 3D fields. Furthermore, they rely on saturation rules to approximate the nonlinear flux spectrum.
  • Figure 2: Left: GyroSwin receives the 5D distribution function as input and predicts the evolved 5D distribution function, as well as the respective 3D potential and heat flux at the next timestep. Right: Essential building blocks and integrator layers that enable multitask training. The latent 5D space is integrated over velocity space to obtain a latent 3D field for potential prediction via cross-attention.
  • Figure 3: Scaling GyroSwin to $\sim$1B parameters, trained on 241 simulations amounting to approximately 6 TB of data. We show train/validation error for predicting the 5D distribution function (left) and the 3D electrostatic potential field (right).
  • Figure 4: Left: $W(k_y)$ averaged over time and OOD simulations for different 5D neural surrogates. Competitors tend to underestimate the spectrum, while GyroSwin matches it well with a slight discrepancy at higher frequencies. Right: Time-averaged zonal flow profile for a slice along $s$ across radial coordinates $x$ for a selected OOD simulation. GyroSwin captures the zonal flow profile.
  • Figure 5: Distribution of input parameters $\hat{s}$, $q$, $R/L_n$, and $R/L_t$ along with average heat flux $\bar{Q}$. The sampled parameter space is evenly distributed.
  • ...and 6 more figures
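
The Figure 2 caption describes integrating the latent 5D space over velocity space to obtain a latent 3D field. A rough way to picture that reduction is a weighted sum over the two velocity axes, as sketched below. This is only an assumption-laden illustration: the axis order (two velocity axes first, then three spatial axes, then channels), the fixed quadrature weights, and the name `integrate_velocity` are hypothetical, and the actual integrator layers may be learned rather than fixed.

```python
import numpy as np


def integrate_velocity(latent, w_v1, w_v2):
    """Reduce a latent 5D field to 3D by a weighted sum over velocity axes.

    latent: (n_v1, n_v2, n_s, n_x, n_y, C), assumed axis order
    w_v1:   (n_v1,) quadrature weights for the first velocity axis
    w_v2:   (n_v2,) quadrature weights for the second velocity axis
    Returns an array of shape (n_s, n_x, n_y, C).
    """
    return np.einsum("a,b,absxyc->sxyc", w_v1, w_v2, latent)


# Toy example with uniform weights that sum to one on each velocity axis.
latent = np.ones((3, 2, 4, 5, 6, 7))
w_v1 = np.full(3, 1.0 / 3.0)
w_v2 = np.full(2, 1.0 / 2.0)
latent_3d = integrate_velocity(latent, w_v1, w_v2)
print(latent_3d.shape)  # (4, 5, 6, 7)
```

In the architecture described by the caption, a 3D latent of this kind is what feeds the potential prediction via cross-attention.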