COmoving Computer Acceleration (COCA): $N$-body simulations in an emulated frame of reference

Deaglan J. Bartlett, Marco Chiarenza, Ludvig Doeser, Florent Leclercq

TL;DR

This work addresses the high computational cost of $N$-body cosmological simulations by introducing COmoving Computer Acceleration (COCA), a hybrid framework that solves the dynamics in an emulated frame of reference. The frame is predicted by adding machine-learned corrections to LPT, and the residual trajectory is then solved for using a small number of force evaluations. COCA therefore retains the correct equations of motion and is guaranteed to converge to the true solution as the number of force evaluations $n_f$ grows, combining speed with physical fidelity. In tests using a particle-mesh (PM) solver, COCA achieves percent-level accuracy in density and velocity statistics with far fewer force evaluations than COLA, and remains robust when the cosmological parameters are misspecified (ML-safety). The approach is general and extensible to other gravity solvers, can serve as a differentiable forward model for Bayesian inference, and offers practical paths to scaling via tiling and on-the-fly emulation.

Abstract

$N$-body simulations are computationally expensive, so machine-learning (ML)-based emulation techniques have emerged as a way to increase their speed. Although fast, surrogate models have limited trustworthiness due to potentially substantial emulation errors that current approaches cannot correct for. To alleviate this problem, we introduce COmoving Computer Acceleration (COCA), a hybrid framework interfacing ML with an $N$-body simulator. The correct physical equations of motion are solved in an emulated frame of reference, so that any emulation error is corrected by design. This approach corresponds to solving for the perturbation of particle trajectories around the machine-learnt solution, which is computationally cheaper than obtaining the full solution, yet is guaranteed to converge to the truth as one increases the number of force evaluations. Although applicable to any ML algorithm and $N$-body simulator, this approach is assessed in the particular case of particle-mesh cosmological simulations in a frame of reference predicted by a convolutional neural network, where the time dependence is encoded as an additional input parameter to the network. COCA efficiently reduces emulation errors in particle trajectories, requiring far fewer force evaluations than running the corresponding simulation without ML. We obtain accurate final density and velocity fields for a reduced computational budget. We demonstrate that this method shows robustness when applied to examples outside the range of the training data. When compared to the direct emulation of the Lagrangian displacement field using the same training resources, COCA's ability to correct emulation errors results in more accurate predictions. COCA makes $N$-body simulations cheaper by skipping unnecessary force evaluations, while still solving the correct equations of motion and correcting for emulation errors made by ML.
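
The decomposition described in the abstract can be written schematically as follows. This is only a sketch in a generic time variable $t$ with a generic force per unit mass $\textbf{F}$, not the paper's cosmological equations of motion. The symbols $\textbf{x}_\mathrm{em}$ and $\textbf{x}_\mathrm{res}$ are notation assumed here, chosen by analogy with $\textbf{p}_\mathrm{res} = \textbf{p}-\textbf{p}_\mathrm{LPT}-\textbf{p}_\mathrm{ML}$ in the Figure 5 caption.

```latex
% Schematic only: generic time t and generic force F (assumptions, not the paper's notation)
\begin{gather}
\textbf{x}(t) = \textbf{x}_\mathrm{em}(t) + \textbf{x}_\mathrm{res}(t),
\qquad
\textbf{x}_\mathrm{em} \equiv \textbf{x}_\mathrm{LPT} + \textbf{x}_\mathrm{ML}, \\
\frac{\mathrm{d}^2 \textbf{x}}{\mathrm{d}t^2} = \textbf{F}(\textbf{x})
\quad \Longrightarrow \quad
\frac{\mathrm{d}^2 \textbf{x}_\mathrm{res}}{\mathrm{d}t^2}
= \textbf{F}\left(\textbf{x}_\mathrm{em} + \textbf{x}_\mathrm{res}\right)
- \frac{\mathrm{d}^2 \textbf{x}_\mathrm{em}}{\mathrm{d}t^2}.
\end{gather}
```

If the emulated frame were exact, the right-hand side would vanish and the residual would stay at zero. Any emulation error instead appears as a source term for $\textbf{x}_\mathrm{res}$, so evaluating the true force corrects the error rather than inheriting it.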

Paper Structure

This paper contains 24 sections, 39 equations, and 13 figures.

Figures (13)

  • Figure 1: Schematic illustration of the (a) COLA and (b) COCA formalisms for cosmological simulations. In COLA, one solves the equations of motion in the frame of reference given by LPT, so one computes the residual ("res") between the LPT trajectory and the true position $\textbf{x}$ of particles. In COCA, one emulates a frame of reference closer to the true trajectory by adding an ML contribution to LPT, so one solves for the (smaller) residuals between $\textbf{x}$ and the emulated frame.
  • Figure 2: Schematic illustration of the kick-drift-kick integration scheme employed in this study. The initial positions $\textbf{x}_\mathrm{i}$ and momenta $\textbf{p}_\mathrm{i}$ are evolved to their final values $\textbf{x}_\mathrm{f}$ and $\textbf{p}_\mathrm{f}$, with updates to these quantities occurring at different times. Unlike traditional kick-drift-kick integration, we do not evaluate the forces that appear in the equations of motion at every time step, but only at a subset of steps (steps 8, 9 and 10 in this example). At all other kick steps, the momenta are updated according to the emulated frame of reference only.
  • Figure 3: Slice of the difference between the true momenta of particles $\textbf{p}$ and the LPT prediction $\textbf{p}_{\rm LPT}$, for a test simulation, as a function of scale factor $a$. We plot the component orthogonal to the chosen slice. At late times, the spatial structure of the field $\textbf{p} - \textbf{p}_{\rm LPT}$ remains relatively constant, with most of the time dependence being a simple multiplicative scaling.
  • Figure 4: Scaling of the residual momentum as a function of scale factor, as defined in Eq. \ref{eq:momentum_scaling_definition}. The points and error bars represent the mean and standard deviation, respectively, across the 100 training simulations. The curve represents the best fit, as given by Eq. \ref{eq:momentum_scaling_fit}.
  • Figure 5: Slices of the input, target, output, and error of the frame of reference emulator at the final time step (i.e., with style parameter $a=1$). The input is the (scalar) linear density field (first column). The emulator aims to predict the three components (one per row) of its target $\textbf{p} - \textbf{p}_{\rm LPT}$ (second column). The emulator's predictions are shown in the third column, and the emulation error $\textbf{p}_\mathrm{res} = \textbf{p}-\textbf{p}_\mathrm{LPT}-\textbf{p}_\mathrm{ML}$ is shown in the final column.
  • ...and 8 more figures
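
The kick-drift-kick scheme with skipped force evaluations described in the Figure 2 caption can be illustrated with a minimal one-dimensional toy. This is a sketch under stated assumptions, not the paper's implementation: the dynamics are a harmonic oscillator in ordinary time rather than a particle-mesh simulation in scale factor, the "emulated frame" is a hand-made trajectory with a deliberate error `eps * t**2`, and the names `simulate_in_emulated_frame`, `force_steps`, `x_em` and `a_em` are hypothetical. Steps are zero-indexed, so `{7, 8, 9}` corresponds to steps 8, 9 and 10 of a ten-step run.

```python
import math


def simulate_in_emulated_frame(eps, n_steps, force_steps, t_end=1.0):
    """Toy kick-drift-kick integration of the residual about an emulated frame.

    Assumed true dynamics: d2x/dt2 = -x with x(0) = 1, p(0) = 0, so x(t) = cos(t).
    Assumed emulated frame: x_em(t) = cos(t) + eps * t**2 (wrong by eps * t**2).
    Forces are evaluated only at the step indices listed in `force_steps`.
    """
    dt = t_end / n_steps

    def force(x):
        # True force per unit mass (harmonic oscillator, illustration only).
        return -x

    def x_em(t):
        # Emulated position: truth plus a deliberate emulation error.
        return math.cos(t) + eps * t**2

    def a_em(t):
        # Second time derivative of the emulated position.
        return -math.cos(t) + 2.0 * eps

    # The emulated frame matches the initial conditions, so residuals start at zero.
    x_res, p_res = 0.0, 0.0
    for i in range(n_steps):
        t0, t1 = i * dt, (i + 1) * dt
        evaluate = i in force_steps
        if evaluate:
            # Half kick: true force minus the acceleration of the frame.
            p_res += 0.5 * dt * (force(x_em(t0) + x_res) - a_em(t0))
        # Drift of the residual position.
        x_res += dt * p_res
        if evaluate:
            # Second half kick at the end of the step.
            p_res += 0.5 * dt * (force(x_em(t1) + x_res) - a_em(t1))
        # Without a force evaluation the residual momentum is left unchanged,
        # i.e. the particle's momentum follows the emulated frame only.
    return x_em(t_end) + x_res


truth = math.cos(1.0)
err_none = abs(simulate_in_emulated_frame(0.1, 10, set()) - truth)
err_late = abs(simulate_in_emulated_frame(0.1, 10, {7, 8, 9}) - truth)
err_all = abs(simulate_in_emulated_frame(0.1, 10, set(range(10))) - truth)
print(err_none, err_late, err_all)
```

With no force evaluations the result is simply the emulated trajectory, error included. Evaluating forces at every step removes the emulation error in this toy, and evaluating them only at the last three steps removes part of it. How many evaluations are needed, and where to place them, depends on the problem; the toy makes no claim about the paper's choices.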