GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations
Fabian Paischer, Gianluca Galletti, William Hornsby, Paul Setinek, Lorenzo Zanisi, Naomi Carey, Stanislas Pamela, Johannes Brandstetter
TL;DR
The paper tackles the computational bottleneck of simulating nonlinear gyrokinetic turbulence in fusion plasmas by introducing GyroSwin, a scalable 5D neural surrogate built on a multitask 5D Swin Transformer. It features 5D shifted-window attention, latent 3D$\leftrightarrow$5D mixing via cross-attention and integrator modules, and a channelwise zonal-flow bias to capture essential nonlinear physics. It is trained to predict the full 5D distribution function $f$ as well as the downstream quantities $\phi$ (electrostatic potential) and $Q$ (heat flux). GyroSwin outperforms reduced-order models on heat-flux and turbulence diagnostics, reproduces the energy cascade, and reduces computational cost by about three orders of magnitude. Its scaling behaviour was tested up to around $10^9$ parameters, and its zonal-flow dynamics were verified. Limitations include potential error accumulation because turbulence is not modelled as a distributional process, the exclusion of the linear phase, and the restriction to the adiabatic-electron approximation; future work points to distributional modelling, an extension to the linear phase, and higher-fidelity electron models via transfer learning.
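To make the 5D-to-3D direction of the integrator modules concrete, the following is a minimal NumPy sketch of reducing a 5D array to a 3D field by weighted integration over two velocity-space axes. It is an illustration only, not GyroSwin's module, which per the summary operates on latent features. The axis ordering `(v_par, mu, s, x, y)`, the function name `integrate_velocity_space`, and the quadrature weights are assumptions made for this example, and the physical prefactors and gyro-averaging of the real field equations are omitted.

```python
import numpy as np


def integrate_velocity_space(f, w_vpar, w_mu):
    """Reduce f[v_par, mu, s, x, y] to a 3D field over (s, x, y).

    Hypothetical helper: w_vpar and w_mu are 1D quadrature weights for the
    two velocity-space axes (an assumed layout, not taken from the paper).
    """
    f = np.asarray(f)
    if f.ndim != 5:
        raise ValueError("expected a 5D array")
    if f.shape[0] != len(w_vpar) or f.shape[1] != len(w_mu):
        raise ValueError("weights must match the two velocity-space axes")
    # Weighted sum over the first two axes; the three spatial axes remain.
    return np.einsum("v,m,vmsxy->sxy", w_vpar, w_mu, f)


# Usage: a random 5D array with uniform weights on each velocity axis.
rng = np.random.default_rng(0)
f = rng.normal(size=(8, 4, 6, 5, 3))
w_vpar = np.full(8, 1.0 / 8)
w_mu = np.full(4, 1.0 / 4)
field_3d = integrate_velocity_space(f, w_vpar, w_mu)
```

With uniform weights that sum to one, the result is simply the mean over the two velocity axes, which gives a quick sanity check for the reduction.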
Abstract
Nuclear fusion plays a pivotal role in the quest for reliable and sustainable energy production. A major roadblock to viable fusion power is understanding plasma turbulence, which significantly impairs plasma confinement, and is vital for next-generation reactor design. Plasma turbulence is governed by the nonlinear gyrokinetic equation, which evolves a 5D distribution function over time. Due to its high computational cost, reduced-order models are often employed in practice to approximate turbulent transport of energy. However, they omit nonlinear effects unique to the full 5D dynamics. To tackle this, we introduce GyroSwin, the first scalable 5D neural surrogate that can model 5D nonlinear gyrokinetic simulations, thereby capturing the physical phenomena neglected by reduced models, while providing accurate estimates of turbulent heat transport. GyroSwin (i) extends hierarchical Vision Transformers to 5D, (ii) introduces cross-attention and integration modules for latent 3D$\leftrightarrow$5D interactions between electrostatic potential fields and the distribution function, and (iii) performs channelwise mode separation inspired by nonlinear physics. We demonstrate that GyroSwin outperforms widely used reduced numerics on heat flux prediction, captures the turbulent energy cascade, and reduces the cost of fully resolved nonlinear gyrokinetics by three orders of magnitude while remaining physically verifiable. GyroSwin shows promising scaling laws, tested up to one billion parameters, paving the way for scalable neural surrogates for gyrokinetic simulations of plasma turbulence.
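As an illustration of what extending hierarchical Vision Transformers to 5D involves, the sketch below partitions a 5D grid of tokens into non-overlapping local windows, with an optional cyclic shift as used in shifted-window (Swin) attention. It is a generic NumPy sketch under stated assumptions, not the authors' implementation: the channels-last layout `(D1, D2, D3, D4, D5, C)`, the function name `window_partition_5d`, and the window and shift sizes are all hypothetical. Attention would then be computed independently within each returned window.

```python
import numpy as np


def window_partition_5d(x, window, shift=None):
    """Split a (D1, D2, D3, D4, D5, C) array into non-overlapping 5D windows.

    Returns an array of shape (num_windows, tokens_per_window, C).
    """
    dims, channels = x.shape[:5], x.shape[5]
    if any(d % w for d, w in zip(dims, window)):
        raise ValueError("each axis must be divisible by its window size")
    if shift is not None:
        # Cyclic shift so that alternating layers can mix tokens across
        # the window boundaries of the unshifted partition.
        x = np.roll(x, shift=[-s for s in shift], axis=(0, 1, 2, 3, 4))
    # Split every axis into (number of windows, window size).
    split = []
    for d, w in zip(dims, window):
        split += [d // w, w]
    x = x.reshape(*split, channels)
    # Move the five window-count axes first, then the five in-window axes.
    x = x.transpose(0, 2, 4, 6, 8, 1, 3, 5, 7, 9, 10)
    return x.reshape(-1, int(np.prod(window)), channels)


# Usage: a small 5D grid with one channel and windows of size 2 per axis.
grid = np.arange(4 * 4 * 2 * 2 * 2, dtype=float).reshape(4, 4, 2, 2, 2, 1)
windows = window_partition_5d(grid, window=(2, 2, 2, 2, 2))
shifted = window_partition_5d(grid, window=(2, 2, 2, 2, 2), shift=(1, 1, 0, 0, 0))
```

Restricting attention to windows keeps the cost linear in the number of windows rather than quadratic in the total number of 5D tokens, which is the usual motivation for this design in lower-dimensional Swin models.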
