Full Shot Predictions for the DIII-D Tokamak via Deep Recurrent Networks
Ian Char, Youngseog Chung, Joseph Abbate, Egemen Kolemen, Jeff Schneider
TL;DR
This work tackles the problem of forecasting the full time evolution of tokamak plasmas by learning a data-driven dynamics model from 7,884 DIII-D shots. It uses an encoder–GRU–decoder architecture that predicts state changes at 25 ms intervals, with dual Gaussian heads that output a mean and a variance and are trained with a negative log-likelihood objective, enabling predictive uncertainty quantification. Through extensive ablations, the authors demonstrate the benefits of ensemble-based distributional predictions, compare GRU, LSTM, and MLP recurrent units, and show that distributional outputs improve long-horizon accuracy and calibration relative to point predictions. The results show calibrated, long-horizon forecasts across multiple plasma diagnostics, highlighting potential for data-driven control and actuator optimization in fusion devices.
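To make the two ideas in this summary concrete, the following is a minimal sketch of a Gaussian negative log-likelihood over predicted state changes and of an autoregressive rollout that adds each predicted change to the current state. It is not the authors' implementation: the names `gaussian_nll`, `rollout`, and `toy_step` are hypothetical, the model is assumed to output a log-variance rather than a variance, the recurrent hidden state is omitted for brevity, and `toy_step` is a made-up stand-in for the trained network.

```python
import math


def gaussian_nll(target, mean, log_var):
    """Gaussian negative log-likelihood, averaged over state dimensions.

    Assumes a diagonal Gaussian parameterized by a mean and a log-variance
    per dimension (the log-variance parameterization is an assumption).
    """
    total = 0.0
    for y, mu, lv in zip(target, mean, log_var):
        total += 0.5 * (math.log(2.0 * math.pi) + lv + (y - mu) ** 2 / math.exp(lv))
    return total / len(target)


def rollout(step_fn, state, actuators):
    """Roll a delta-predicting model forward, one actuator setting per step.

    step_fn(state, action) -> (delta_mean, delta_log_var) is a hypothetical
    interface; each call stands for one 25 ms step. Only the predicted mean
    change is propagated here; the variance is ignored.
    """
    trajectory = [list(state)]
    for action in actuators:
        delta_mean, _ = step_fn(state, action)
        state = [s + d for s, d in zip(state, delta_mean)]
        trajectory.append(list(state))
    return trajectory


def toy_step(state, action):
    """Made-up linear dynamics standing in for the trained network."""
    delta_mean = [0.1 * action - 0.05 * s for s in state]
    delta_log_var = [0.0 for _ in state]
    return delta_mean, delta_log_var


if __name__ == "__main__":
    print(gaussian_nll([0.0], [0.0], [0.0]))
    print(rollout(toy_step, [1.0], [1.0, 1.0]))
```

Predicting the change in state rather than the next state directly means the model only has to learn a small correction at each step, and the rollout loop is where prediction errors can compound over a long horizon.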
Abstract
Although tokamaks are one of the most promising devices for realizing nuclear fusion as an energy source, key obstacles remain in understanding and controlling the dynamics of the plasma. It is therefore crucial to develop high-quality models that can help overcome these obstacles. In this work, we take an entirely data-driven approach to learning such a model. In particular, we use historical data from the DIII-D tokamak to train a deep recurrent network that is able to predict the full time evolution of plasma discharges (or "shots"). Following this, we investigate how different training and inference procedures affect the quality and calibration of the shot predictions.
