Design Optimization of Nuclear Fusion Reactor through Deep Reinforcement Learning

Jinsu Kim, Jaemin Seo

TL;DR

This work addresses the design of a steady-state tokamak reactor that must satisfy several operational constraints (density, beta, kink instability, bootstrap fraction) while minimizing cost. It proposes a Deep Reinforcement Learning (DRL) framework based on Proximal Policy Optimization (PPO) that scalarizes the multiple objectives into a single reward signal and trains against a custom design computation environment. The results show that DRL can find cost-reduced designs that satisfy all steady-state constraints, that it searches more efficiently than grid search, and that several viable operating regimes exist. For example, DRL found a design with $Q\approx6.03$, while the reference and grid-search designs reached higher $Q$ values, indicating a trade-off between cost and confinement. Overall, the framework offers a scalable, parallelizable method for multi-objective tokamak design optimization that reduces the computational cost of conceptual reactor design and could be extended to material and profile-shape considerations.
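To make the idea of scalarizing multiple objectives into one reward concrete, the following is a minimal sketch and not the paper's actual reward function. The normalized ratios, the piecewise-linear penalties, the weights, and the names `DesignMetrics`, `scalarized_reward`, and `cost_ref` are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class DesignMetrics:
    """Hypothetical normalized outputs of a design computation."""
    density_ratio: float    # average density / density limit (should be <= 1)
    beta_ratio: float       # plasma beta / beta limit (should be <= 1)
    kink_ratio: float       # safety factor / minimum stable value (should be >= 1)
    bootstrap_ratio: float  # bootstrap fraction / required fraction (should be >= 1)
    cost: float             # cost parameter, arbitrary units (> 0)


def upper_bound_reward(ratio: float) -> float:
    """1.0 if the upper limit is respected, decaying linearly to 0 at ratio = 2."""
    return 1.0 if ratio <= 1.0 else max(0.0, 2.0 - ratio)


def lower_bound_reward(ratio: float) -> float:
    """1.0 if the lower limit is respected, decaying linearly to 0 at ratio = 0."""
    return 1.0 if ratio >= 1.0 else max(0.0, ratio)


def scalarized_reward(m: DesignMetrics, weights: dict, cost_ref: float = 1.0) -> float:
    """Weighted sum of per-objective rewards (assumed form, not from the paper)."""
    parts = {
        "density": upper_bound_reward(m.density_ratio),
        "beta": upper_bound_reward(m.beta_ratio),
        "kink": lower_bound_reward(m.kink_ratio),
        "bootstrap": lower_bound_reward(m.bootstrap_ratio),
        # Cheaper designs score closer to 1; cost_ref sets the scale.
        "cost": cost_ref / (cost_ref + m.cost),
    }
    return sum(weights[name] * value for name, value in parts.items())


weights = {"density": 1.0, "beta": 1.0, "kink": 1.0, "bootstrap": 1.0, "cost": 1.0}
feasible = DesignMetrics(0.9, 0.8, 1.2, 1.1, cost=1.0)
violating = DesignMetrics(1.5, 0.8, 1.2, 1.1, cost=1.0)
print(scalarized_reward(feasible, weights))
print(scalarized_reward(violating, weights))
```

In a sketch like this, an agent such as PPO would receive the scalar as its reward after each proposed design, so the relative weights decide how strongly a constraint violation is penalized compared with a cost saving.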

Abstract

This research explores the application of Deep Reinforcement Learning (DRL) to optimize the design of a nuclear fusion reactor. DRL can efficiently handle the difficulties that arise from the multiple physics and engineering constraints on steady-state operation. A fusion reactor design computation code and an optimization code suited to parallelization with DRL are developed. The proposed framework finds reactor designs that satisfy the operational requirements while reducing construction cost. DRL thereby simplifies multi-objective design optimization for a fusion reactor, indicating the high potential of the proposed framework for advancing the efficient and sustainable design of future reactors.

Paper Structure

This paper contains 11 sections, 20 equations, 8 figures, 8 tables, 2 algorithms.

Figures (8)

  • Figure 1: A diagram for a tokamak reactor design computation based on nuclear physics and engineering constraints. A design is determined considering the maximum allowable current density and stress on materials in addition to neutron radiation wall load.
  • Figure 2: The proposed design optimization process, based on deep reinforcement learning, for finding the optimal design configuration of a tokamak reactor.
  • Figure 3: The policy loss curve (left) and total reward change (right) during the optimization process.
  • Figure 4: The partial reward change for each plasma parameter during optimization. (a) Plasma beta $\beta$, (b) cost parameter, (c) average density $\bar{n}_e$, (d) bootstrap current ratio $f_{bs}$, (e) safety factor $q$, (f) energy confinement time $\tau$.
  • Figure 5: Comparison of the total reward change during optimization with grid search and with DRL.
  • ...and 3 more figures