Numerical solutions of fixed points in two-dimensional Kuramoto-Sivashinsky equation expedited by reinforcement learning

Juncheng Jiang; Dongdong Wan; Mengqi Zhang

TL;DR

The paper addresses the difficulty of locating fixed points of the high-dimensional, chaotic two-dimensional Kuramoto-Sivashinsky equation (KSE). It couples Jacobian-Free Newton-Krylov (JFNK) iterations with deep reinforcement learning (DRL): DRL supplies improved initial guesses for JFNK and also drives control tasks that steer trajectories between fixed points. The approach finds over 300 fixed points and substantially reduces the number of JFNK iterations when started from DRL-generated initial guesses. The method rests on a spectral discretization and on a reward and exploration-noise design within the Deep Deterministic Policy Gradient (DDPG) framework, and it may transfer to fixed-point and stability problems in other high-dimensional dynamical systems.

Abstract

This paper presents a combined approach that uses deep reinforcement learning (DRL) to enhance the effectiveness of the Jacobian-Free Newton-Krylov (JFNK) method in identifying fixed points of the 2D Kuramoto-Sivashinsky equation (KSE). The JFNK method requires a good initial guess to converge when searching for fixed points. With a properly defined reward function, we use DRL as a preliminary step to improve the initial guess before the JFNK iteration. We report fixed points of the 2D KSE that have not previously been reported in the literature. Additionally, we explore control optimization for the 2D KSE, using parallel reinforcement learning techniques to navigate the system trajectories between known fixed points. The combined method shows how DRL can improve the JFNK search for new fixed-point solutions of the 2D KSE, and it may be instructive for other high-dimensional dynamical systems.
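To make the role of the initial guess concrete, the following is a minimal sketch of a Jacobian-free Newton-Krylov iteration, not the authors' implementation. It assumes a generic residual function `F` whose zeros are the fixed points, approximates Jacobian-vector products by finite differences, and solves each Newton step with a small hand-written GMRES. All names (`jvp_fd`, `gmres_small`, `jfnk`, `toy_residual`) and all tolerances are hypothetical, and the toy residual is a stand-in on a periodic grid, not the 2D KSE.

```python
import numpy as np

def jvp_fd(F, x, v, Fx, eps=1e-7):
    """Approximate J(x) @ v by a forward difference; the Jacobian is never formed."""
    nv = np.linalg.norm(v)
    if nv == 0.0:
        return np.zeros_like(v)
    h = eps * (1.0 + np.linalg.norm(x)) / nv
    return (F(x + h * v) - Fx) / h

def gmres_small(matvec, b, m=20, tol=1e-10):
    """Minimal unrestarted GMRES (Arnoldi + least squares) with zero initial guess."""
    n = b.size
    m = min(m, n)
    beta = np.linalg.norm(b)
    Q = np.zeros((n, m + 1))   # orthonormal Krylov basis
    H = np.zeros((m + 1, m))   # upper Hessenberg matrix
    Q[:, 0] = b / beta
    y = np.zeros(0)
    for k in range(m):
        w = matvec(Q[:, k])
        for j in range(k + 1):               # modified Gram-Schmidt
            H[j, k] = Q[:, j] @ w
            w = w - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(w)
        e1 = np.zeros(k + 2)
        e1[0] = beta
        Hk = H[:k + 2, :k + 1]
        y = np.linalg.lstsq(Hk, e1, rcond=None)[0]
        if np.linalg.norm(Hk @ y - e1) <= tol * beta or H[k + 1, k] < 1e-14:
            break
        Q[:, k + 1] = w / H[k + 1, k]
    return Q[:, :y.size] @ y

def jfnk(F, x0, tol=1e-10, max_newton=50):
    """Newton iteration on F(x) = 0; returns (x, newton_steps, converged)."""
    x = np.asarray(x0, dtype=float).copy()
    for it in range(max_newton):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            return x, it, True
        dx = gmres_small(lambda v: jvp_fd(F, x, v, Fx), -Fx)
        x = x + dx
    return x, max_newton, bool(np.linalg.norm(F(x)) < tol)

def toy_residual(x):
    """Toy stand-in (NOT the KSE): x - x^3 plus weak periodic diffusion."""
    lap = np.roll(x, 1) - 2.0 * x + np.roll(x, -1)
    return x - x**3 + 0.1 * lap
```

Calling `jfnk(toy_residual, x0)` with `x0` close to the uniform state `x = 1` returns that state in a few Newton steps. A guess far from any root may converge elsewhere or not at all, which is the sensitivity that a DRL-generated initial guess is meant to reduce.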

Paper Structure (18 sections, 11 equations, 14 figures, 2 tables)

Introduction
Problem formulation
Two-dimensional Kuramoto-Sivashinsky equation
Numerical methods
Numerical simulations and control setup for 2D KSE
Methodology and implementation of the DRL method
Reward design in DRL
Exploration noise in DRL
Results and discussion
Application of DRL-assisted JFNK iteration for the 2D KSE
DRL-enhanced initial conditions
A comparative test
Fixed points in 2D KSE
DRL-based navigation between fixed points
Conclusion
...and 3 more sections

Figures (14)

Figure 1: Sensor distribution (a) and force distribution (b) in the domain of the 2D KSE. The color in panel (a) shows the dependent variable $\phi$, and red points denote sensors. In panel (b), the color represents the external force term of the 2D KSE. For enhanced visualization, we have set $\sigma=0.6$ and set all action vector amplitudes to $u_{ij}=1$.
Figure 2: The steady state of the 2D KSE. (a) The result generated by our numerical code. (b) The result reported in Kalogirou et al. In both panels, the $x$-axis and $y$-axis span a range of $2\pi$, while the $z$-axis represents the dependent variable, which is denoted as $v$ in the reference.
Figure 3: DRL schematic diagram. In the current DRL method, the environment is simulated by the 2D KSE. The state and action input are used to generate the subsequent state in the next time-step. The actor and target actor networks generate an action based on the current state. The critic and target critic networks evaluate the action's quality by estimating the Q-value based on the current state and action. This evaluation is then used to update the actor network. The replay buffer stores transitions of experiences that allow for efficient, batched updates of the actor and critic networks.
Figure 4: Convergence/divergence of the RL+JFNK and JFNK methods. (a) Blue point $E_i$: the initial point; red points: the DRL-assisted initial guesses obtained by DRL exploration starting from the initial point; green points: fixed points obtained by the JFNK method starting from the DRL-based initial guesses. (b) Trajectories of the JFNK method from $E_i$ without the assistance of DRL. (c) Log relative residual of the JFNK method.
Figure 5: (a) The process by which the DRL agent finds a candidate initial guess. Blue trajectory: the DRL agent's exploration of the 2D KSE environment over the course of an episode. Red point: the state with the highest reward found during this exploration. (b) The evolution of the DRL state in Fourier space (blue trajectory). Red point $E_1$: random initial guess; red point $E_2$: the state with the maximum reward.
...and 9 more figures
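
The Figure 3 caption mentions a replay buffer and target networks. As a rough illustration of those two pieces of a DDPG-style loop, here is a minimal sketch using only the Python standard library. The class and function names, the buffer capacity, and the blending factor `tau` are assumptions made for illustration; the source does not state the values or data layout the authors used.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=0):
        self.buf = deque(maxlen=capacity)   # oldest transitions are dropped first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state):
        self.buf.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement for a batched update.
        return self.rng.sample(list(self.buf), batch_size)

    def __len__(self):
        return len(self.buf)

def soft_update(target, online, tau=0.005):
    """Blend online weights into target weights: (1 - tau) * target + tau * online.

    `tau` is a hypothetical value; weights are plain lists of floats here.
    """
    return [(1.0 - tau) * t + tau * w for t, w in zip(target, online)]
```

In a full agent, the actor and critic would be updated from `sample(...)` batches, and `soft_update` would be applied to the target actor and target critic after each update so that the targets change slowly.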
