Numerical solutions of fixed points in two-dimensional Kuramoto-Sivashinsky equation expedited by reinforcement learning

Juncheng Jiang, Dongdong Wan, Mengqi Zhang

TL;DR

The paper tackles the challenge of locating fixed points in the high-dimensional, chaotic two-dimensional Kuramoto-Sivashinsky equation (KSE). It proposes a hybrid method that couples Jacobian-Free Newton-Krylov (JFNK) iterations with deep reinforcement learning (DRL): DRL generates high-quality initial guesses for JFNK and also drives control tasks that steer trajectories between fixed points. The approach discovers over 300 fixed points and substantially reduces the number of JFNK iterations when starting from DRL-informed initial guesses, a practical efficiency gain for PDE-constrained fixed-point problems. The methodology, built on a spectral discretization and a carefully designed reward and exploration-noise scheme within the Deep Deterministic Policy Gradient (DDPG) algorithm, may transfer to fixed-point and stability problems in other high-dimensional dynamical systems.

Abstract

This paper presents a combined approach that uses deep reinforcement learning (DRL) to enhance the effectiveness of the Jacobian-Free Newton-Krylov (JFNK) method in identifying fixed points of the 2D Kuramoto-Sivashinsky equation (KSE). The JFNK method needs a good initial guess to converge well when searching for fixed points. With a properly defined reward function, we use DRL as a preliminary step to improve the initial guess before the JFNK iterations. We report fixed points of the 2D KSE that have not previously been reported in the literature. Additionally, we explore control optimization for the 2D KSE, using parallel reinforcement learning techniques to navigate system trajectories between known fixed points. The combined method shows how a DRL-improved JFNK approach can find new fixed-point solutions of the 2D KSE, which may be instructive for other high-dimensional dynamical systems.
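To make the role of the initial guess concrete, here is a minimal sketch of a Jacobian-free Newton-Krylov iteration. It is not the authors' solver: the residual function `toy_residual`, the tolerances, and the finite-difference step `eps` are illustrative assumptions, and a two-variable toy system stands in for the spectrally discretized 2D KSE right-hand side. The point it shows is that each Newton step only needs products of the Jacobian with a vector, which are approximated by differencing the residual, so no Jacobian matrix is ever assembled.

```python
import numpy as np


def gmres(matvec, b, m, tol=1e-12):
    """Minimal unrestarted GMRES for A x = b with x0 = 0 (A given only via matvec)."""
    n = b.size
    m = min(m, n)
    beta = np.linalg.norm(b)
    Q = np.zeros((n, m + 1))   # orthonormal Krylov basis
    H = np.zeros((m + 1, m))   # upper Hessenberg matrix from Arnoldi
    Q[:, 0] = b / beta
    for k in range(m):
        w = matvec(Q[:, k])
        for j in range(k + 1):             # modified Gram-Schmidt
            H[j, k] = Q[:, j] @ w
            w = w - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(w)
        # Small least-squares problem: minimise || beta * e1 - H y ||.
        e1 = np.zeros(k + 2)
        e1[0] = beta
        Hk = H[:k + 2, :k + 1]
        y, *_ = np.linalg.lstsq(Hk, e1, rcond=None)
        if np.linalg.norm(Hk @ y - e1) <= tol * beta or H[k + 1, k] < 1e-14:
            break
        Q[:, k + 1] = w / H[k + 1, k]
    return Q[:, :k + 1] @ y


def jfnk(F, u0, tol=1e-10, max_newton=50, max_krylov=20, eps=1e-7):
    """Solve F(u) = 0 by Newton's method with matrix-free GMRES inner solves."""
    u = np.asarray(u0, dtype=float).copy()
    for it in range(max_newton):
        Fu = F(u)
        if np.linalg.norm(Fu) < tol:
            return u, it

        def jv(v, u=u, Fu=Fu):
            # Jacobian-vector product from a forward finite difference.
            return (F(u + eps * v) - Fu) / eps

        u = u + gmres(jv, -Fu, max_krylov)
    raise RuntimeError("Newton iteration did not converge")


def toy_residual(u):
    # Hypothetical stand-in for the discretised steady-state equations.
    return np.array([u[0] ** 2 + u[1] ** 2 - 4.0, u[0] - u[1]])


u_star, n_newton = jfnk(toy_residual, [1.0, 2.0])
print(u_star, n_newton)
```

In this picture, the DRL stage described in the abstract would supply the argument passed as `u0`; a guess closer to a fixed point generally means fewer Newton steps and a smaller risk of divergence.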

Paper Structure (18 sections, 11 equations, 14 figures, 2 tables)

Figures (14)

  • Figure 1: Sensor distribution (a) and force distribution (b) in the domain of the 2D KSE. In panel (a), the color shows the dependent variable $\phi$ and the red points mark the sensors. In panel (b), the color shows the external force term of the 2D KSE. For clearer visualization, we set $\sigma=0.6$ and fix all action amplitudes at $u_{ij}=1$.
  • Figure 2: The steady state of the 2D KSE. (a) The result generated by our numerical code. (b) The result reported by Kalogirou et al. In both panels, the $x$-axis and $y$-axis span a range of $2\pi$, while the $z$-axis represents the dependent variable, which is denoted as $v$ in the reference.
  • Figure 3: DRL schematic diagram. In the current DRL method, the environment is simulated by the 2D KSE. The state and action input are used to generate the subsequent state in the next time-step. The actor and target actor networks generate an action based on the current state. The critic and target critic networks evaluate the action's quality by estimating the Q-value based on the current state and action. This evaluation is then used to update the actor network. The replay buffer stores transitions of experiences that allow for efficient, batched updates of the actor and critic networks.
  • Figure 4: Convergence/divergence of RL+JFNK and JFNK methods. (a) blue point $E_i$: the initial point; red points: the DRL-assisted initial guesses obtained through exploration starting from the initial point by employing the DRL algorithms; green points: fixed points obtained through JFNK method starting from the DRL-based initial guesses. (b) trajectories of the JFNK method from $E_i$ without the assistance of DRL. (c) log relative residual of the JFNK method.
  • Figure 5: (a) How the DRL agent finds a candidate DRL initial guess. Blue trajectory: the DRL agent's exploration of the 2D KSE environment over the course of an episode. Red point: the state with the highest reward found during this exploration. (b) The evolution of the DRL state in Fourier space (blue trajectory). Red point $E_1$: random initial guess; red point $E_2$: the state with the maximum reward.
  • ...and 9 more figures
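
The Figure 3 caption mentions two standard DDPG ingredients: a replay buffer that stores transitions for batched updates, and target copies of the actor and critic networks. The sketch below illustrates both in plain Python. It is a generic illustration, not the paper's code: the class `ReplayBuffer`, the function `soft_update`, the capacity, and the mixing rate `tau` are hypothetical names and values, and network parameters are represented as flat lists of floats rather than real neural networks.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)   # oldest transitions are dropped first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def soft_update(target_params, online_params, tau):
    """Polyak averaging: move target parameters a fraction tau toward the online ones."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


# Usage with made-up numbers (assumed values, for illustration only).
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push(state=step, action=0.1 * step, reward=-float(step), next_state=step + 1)
batch = buffer.sample(2)

target = soft_update([0.0, 0.0], [1.0, 2.0], tau=0.1)
print(len(buffer), batch, target)
```

Slowly tracking target parameters keep the Q-value targets from shifting abruptly between updates, which is the usual motivation for keeping separate target networks in DDPG.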