
Differentiable Programming for Computational Plasma Physics

Nick McGreivy

TL;DR

This thesis introduces a stellarator coil design code (FOCUSADD) that uses gradient-based optimization to produce stellarator coils with finite build and introduces error-correcting algorithms that preserve invariants of time-dependent PDEs.

Abstract

Differentiable programming allows for derivatives of functions implemented via computer code to be calculated automatically. These derivatives are calculated using automatic differentiation (AD). This thesis explores two applications of differentiable programming to computational plasma physics. First, we consider how differentiable programming can be used to simplify and improve stellarator optimization. We introduce a stellarator coil design code (FOCUSADD) that uses gradient-based optimization to produce stellarator coils with finite build. Because we use reverse mode AD, which can compute gradients of scalar functions with the same computational complexity as the function, FOCUSADD is simple, flexible, and efficient. We then discuss two additional applications of AD in stellarator optimization. Second, we explore how machine learning (ML) can be used to improve or replace the numerical methods used to solve partial differential equations (PDEs), focusing on time-dependent PDEs in fluid mechanics relevant to plasma physics. Differentiable programming allows neural networks and other techniques from ML to be embedded within numerical methods. This is a promising, but relatively new, research area. We focus on two basic questions. First, can we design ML-based PDE solvers that have the same guarantees of conservation, stability, and positivity that standard numerical methods do? The answer is yes; we introduce error-correcting algorithms that preserve invariants of time-dependent PDEs. Second, which types of ML-based solvers work best at solving PDEs? We perform a systematic review of the scientific literature on solving PDEs with ML. Unfortunately, we discover two issues, weak baselines and reporting biases, that affect the interpretation and reproducibility of a significant majority of published research. We conclude that using ML to solve PDEs is not as promising as we initially believed.
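The abstract's remark about reverse mode AD can be illustrated with a minimal sketch. The code below is not taken from FOCUSADD or the thesis; the names `Var`, `sin`, and `backward` are hypothetical helpers, and the test function $y = x_1 x_2 + \sin(x_1)$ is an assumption chosen only for illustration. It records local partial derivatives during the forward evaluation and then sweeps once backwards through the graph, so the full gradient of a scalar output is obtained in a single pass whose cost is proportional to the cost of evaluating the function.

```python
import math


class Var:
    """A scalar node in a computational graph (hypothetical helper, not from the thesis)."""

    def __init__(self, value, parents=()):
        self.value = value
        # Each entry is (parent_node, local partial derivative of this node w.r.t. parent).
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def sin(x):
    """Primitive operation with local derivative cos(x)."""
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


def backward(output):
    """Reverse sweep: accumulate d(output)/d(node) into node.grad for every node."""
    order, seen = [], set()

    def visit(node):
        # Depth-first search producing a topological order (inputs first).
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Visit nodes from output to inputs, pushing adjoints to parents.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


# Assumed example function: y = x1 * x2 + sin(x1).
x1, x2 = Var(2.0), Var(3.0)
y = x1 * x2 + sin(x1)
backward(y)

# Analytically, dy/dx1 = x2 + cos(x1) and dy/dx2 = x1.
print(x1.grad, x2.grad)
```

Both partial derivatives come out of the same backward sweep, which is why reverse mode suits optimization problems with many parameters and one scalar objective.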

Paper Structure

This paper contains 166 sections, 311 equations, 35 figures, 2 tables, 1 algorithm.

Figures (35)

  • Figure 1: (a) Cross sections of fusion reactions as a function of the center of mass (COM) energy. (b) Reactivities $\langle \sigma v \rangle$ of fusion reactions, assuming that the plasma is in thermodynamic equilibrium. (c) The fusion reaction rate $R_{ij}$ divided by the squared plasma pressure $p^2$, assuming an optimal fuel ratio. (d) The fusion power density (in megawatts per cubic meter) per unit of squared plasma pressure for various fusion reactions. Based on ENDF data from the IAEA.
  • Figure 2: The number of papers published per month in the arXiv categories of AI and ML is growing exponentially. From krenn2023forecasting. Obtained via CC BY license.
  • Figure 3: (a) The computational graph for $f(\bm x) = a(\bm x) + b(\bm x)$. Here $u_1 = a(\bm x)$ and $u_2 = b(\bm x)$ are intermediate variables corresponding to the output of the primitive operations $a$ and $b$, while $y = u_1 + u_2$ is an output variable corresponding to the output of the primitive operation add. (b) The linearized computational graph for $f$. Here the edge weight on an edge from node $v_i$ to $v_j$ represents the value of the partial derivative $\frac{\partial v_j}{\partial v_i}$.
  • Figure 4: Illustration of the basic concepts of the linearized computational graph (LCG) and Bauer's formula. (a) pseudocode for a simple function with intermediate variables; (b) the primal computational graph, a DAG with variables as vertices and flow moving upwards to the output; (c) the linearized computational graph (LCG) in which the edges are labeled with the values of the local derivatives; (d) illustration of the four paths that must be evaluated to compute the Jacobian. (Example from Paul D. Hovland.)
  • Figure 5: (a) The underlying computational graph for \ref{eq:ch2-dp-ex}. A comparison with \ref{eq:ch2-dp-ex-cr} reveals that the derivative $\frac{\partial y}{\partial x}$ equals the sum over paths of the products of the edge weights in the linearized computational graph. (b) A vectorized representation of the computational graph, simplifying the presentation.
  • ...and 30 more figures
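
The captions of Figures 3 to 5 describe the linearized computational graph and the rule that a derivative equals the sum over paths of the products of edge weights. A small sketch of that rule follows, using the graph of Figure 3 for $f(x) = a(x) + b(x)$. The concrete choices $a = \sin$ and $b(x) = x^2$, the scalar input, and the function name `path_sum` are assumptions made for illustration; they do not appear in the source.

```python
import math


def path_sum(edges, source, target):
    """Sum over all source->target paths of the product of edge weights.

    `edges` maps a node name to a list of (successor, weight) pairs and is
    assumed to describe a directed acyclic graph.
    """
    if source == target:
        return 1.0  # the empty path contributes a product of 1
    return sum(weight * path_sum(edges, successor, target)
               for successor, weight in edges.get(source, []))


x = 0.5

# Linearized computational graph for y = u1 + u2 with the assumed
# primitives u1 = sin(x) and u2 = x**2. Each edge weight is the local
# partial derivative of the successor with respect to the predecessor.
lcg = {
    "x": [("u1", math.cos(x)), ("u2", 2.0 * x)],
    "u1": [("y", 1.0)],
    "u2": [("y", 1.0)],
}

dy_dx = path_sum(lcg, "x", "y")

# The two paths x -> u1 -> y and x -> u2 -> y give cos(x) + 2x.
print(dy_dx, math.cos(x) + 2.0 * x)
```

Enumerating paths explicitly like this is exponential in the worst case. Forward and reverse mode AD compute the same sum by factoring the shared sub-paths, which is what makes them efficient.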