Plasma State Monitoring and Disruption Characterization using Multimodal VAEs

Yoeri Poels, Alessandro Pau, Christian Donner, Giulio Romanelli, Olivier Sauter, Cristina Venturini, Vlado Menkovski, the TCV team, the WPTE team

TL;DR

This work tackles disruption monitoring in tokamaks by learning an interpretable, low-dimensional representation of plasma states through a sequential multimodal VAE. The model maps high-dimensional discharge time series into a 2D latent trajectory, organized by a Gaussian-mixture prior to reveal distinct operating regimes, and introduces a disruption-risk map $D_{\text{risk}}(\boldsymbol{z})$ that correlates with disruption likelihood and disruptivity. Trained on ~1600 TCV discharges during flat-top operation, the approach yields a latent space that aligns with known operational limits, distinguishes disruption types, and enables counterfactual analyses to identify disruption-related parameters. The work demonstrates potential for improved understanding and analysis of disruptions, with implications for diagnostics and control strategies in future devices; it also outlines avenues for extending the method to faster timescales, higher-dimensional latents, and multi-machine deployments.

Abstract

When a plasma disrupts in a tokamak, significant heat and electromagnetic loads are deposited onto the surrounding device components. These forces scale with plasma current and magnetic field strength, making disruptions one of the key challenges for future devices. Unfortunately, disruptions are not fully understood, with many different underlying causes that are difficult to anticipate. Data-driven models have shown success in predicting them, but they only provide limited interpretability. On the other hand, large-scale statistical analyses have been a great asset to understanding disruptive patterns. In this paper, we leverage data-driven methods to find an interpretable representation of the plasma state for disruption characterization. Specifically, we use a latent variable model to represent diagnostic measurements as a low-dimensional, latent representation. We build upon the Variational Autoencoder (VAE) framework, and extend it for (1) continuous projections of plasma trajectories; (2) a multimodal structure to separate operating regimes; and (3) separation with respect to disruptive regimes. Subsequently, we can identify continuous indicators for the disruption rate and the disruptivity based on statistical properties of measurement data. The proposed method is demonstrated using a dataset of approximately 1600 TCV discharges, selecting for flat-top disruptions or regular terminations. We evaluate the method with respect to (1) the identified disruption risk and its correlation with other plasma properties; (2) the ability to distinguish different types of disruptions; and (3) downstream analyses. For the latter, we conduct a demonstrative study on identifying parameters connected to disruptions using counterfactual-like analysis. Overall, the method can adequately identify distinct operating regimes characterized by varying proximity to disruptions in an interpretable manner.

Paper Structure

This paper contains 19 sections, 32 equations, 23 figures, 8 tables.

Figures (23)

  • Figure 1: The distribution of the discharges' dates, binned per quartile. Discharges range from TCV #51325 to #81751, with the majority taking place between 2016 and 2019.
  • Figure 2: Distributions of key plasma parameters in the dataset. The plotted values are averages over phases of 20 ms (roughly the TCV energy confinement time) to exclude transient states.
  • Figure 3: Schematic overview of the model structure. The model consists of the encoder distribution $q_\phi(\mathbf{z}^{t_i}|\boldsymbol{\mu}^{t_{i-s}}, \mathbf{x}^{t_{i-w}:t_i})$ (blue), the decoder distribution $p_\theta(\mathbf{x}^{t_i}|\mathbf{z}^{t_i})$ (red), and the disruption risk map $D_{\textit{risk}}(\mathbf{z}^{t_i})$ (green). Signals are encoded using time windows of input signals, with the location in the latent space computed as an update w.r.t. the previous location. Conversely, the mappings from latent space to data space and to the disruption risk are static in time.
  • Figure 4: Depiction of the training and inference procedure of the proposed method. Data $\mathbf{x}$, time series of signals corresponding to TCV experiments, are projected to the latent variable $\mathbf{z}$ using the approximate posterior (the encoder) $q_\phi(\mathbf{z}^{t_m}|\boldsymbol{\mu}^{t_{m-s}},\mathbf{x}^{t_{m-w}:t_m})$ (or $q_\phi(\mathbf{z}|\mathbf{x})$ when aggregating over time). The generative distribution (the decoder) $p_{\theta}(\mathbf{x}^{t_m}|\mathbf{z}^{t_m})$ (or $p_\theta(\mathbf{x}|\mathbf{z})$ when aggregating over time) provides the map back to data space. A time slice in data space corresponds to a time slice in latent space, so discharges are projected as latent trajectories. Simultaneously, we learn the disruption risk map $D_{\textit{risk}}$ as a function of $\mathbf{z}$, using proxy labels $y$. The latent space is optimized to maximize the data likelihood ($\mathcal{L}_\textit{rec}$), match disruption information ($\mathcal{L}_{\textit{risk}}$), and minimize divergence to prior modes ($\mathcal{L}_{\textit{KL}}$) while covering all modes of said prior ($\mathcal{L}_\textit{uniform}$).
  • Figure 5: Precomputed deformation of a uniform space, used to ease the task of placing appropriate probability density on the prior modes during model training.
  • ...and 18 more figures
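
The structure described in the Figure 3 caption (a windowed encoder that updates the previous latent location, plus a risk map that is static in time) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear-plus-tanh encoder, the logistic form of the risk map, and the names `encode_trajectory`, `risk_map`, `W_enc`, `a`, and `b` are all assumptions made for illustration, and the sketch tracks only the latent means $\boldsymbol{\mu}$ rather than sampling $\mathbf{z}$.

```python
import numpy as np


def encode_trajectory(x, w, s, W_enc, mu0):
    """Map a discharge x of shape (T, d) to a 2D latent trajectory of means.

    At each step i, the window x[i-w : i+1] is read and the previous mean
    (from step i-s) is shifted by a learned update. The linear + tanh
    encoder used here is a hypothetical stand-in for the real network.
    """
    T = x.shape[0]
    mu_prev = np.asarray(mu0, dtype=float)
    means = []
    for i in range(w, T, s):
        window = x[i - w : i + 1].ravel()   # time window t_{i-w} : t_i
        delta = np.tanh(window @ W_enc)     # update w.r.t. previous location
        mu_prev = mu_prev + delta
        means.append(mu_prev)
    return np.stack(means)


def risk_map(z, a, b):
    """Static map from latent location to a risk value in (0, 1).

    A logistic function of z is an assumed form, chosen only for brevity.
    """
    return 1.0 / (1.0 + np.exp(-(z @ a + b)))


# Synthetic stand-in for diagnostic signals: T=50 time slices, d=4 signals.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 4))
w, s = 5, 1
W_enc = 0.01 * rng.normal(size=((w + 1) * 4, 2))

traj = encode_trajectory(x, w, s, W_enc, mu0=[0.0, 0.0])
risk = risk_map(traj, a=np.array([1.0, -1.0]), b=0.0)
```

Because the encoder only outputs an increment, consecutive latent points stay close to each other, which is what makes each discharge appear as a continuous trajectory; the risk map, by contrast, depends only on where a point lies and not on when it was reached.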