Suppressing Modulation Instability with Reinforcement Learning

Nikolay Kalmykov, Rishat Zagidullin, Oleg Rogov, Sergey Rykovanov, Dmitry V. Dylov

TL;DR

This work proposes an approach based on reinforcement learning to suppress the unstable modes by optimizing the parameters for the time modulation of the potential in the nonlinear system.

Abstract

Modulation instability is a phenomenon of spontaneous pattern formation in nonlinear media that often leads to unpredictable behaviour and degradation of the signal of interest. We propose a reinforcement-learning approach that suppresses the unstable modes by optimizing the parameters of the time modulation of the potential in the nonlinear system. We test the approach in 1D and 2D cases and propose a new class of physically meaningful reward functions to guarantee tamed instability.

Paper Structure (1 section, 12 equations, 4 figures)

Figures (4)

  • Figure 1: Numerical solution of the unmodulated complex Ginzburg-Landau equation (CGLE) with $c=0.5$, $d=0$ in 1D. Inset: spatial spectrum. The graphs are logarithmically scaled along the $t$-axis. Unstable modes are generated by the nonlinearity: noise in the initial conditions seeds small perturbations at every mode, and the nonlinearity amplifies some of them, so they can draw energy from the propagating signal.
  • Figure 2: Training results of the Q-learning algorithm. Left: a snapshot in time of a 2D simulation without modulation; unstable patterns are formed. Right: a snapshot of a 2D simulation at the end of training. The RL algorithm learns to modulate in time ($m(t)$) the spatial modulation of the potential (the colormaps). Insets: the snapshots in the Fourier domain. Middle: a schematic representation of the RL control sequence. The algorithm learns to change in time the amplitude of the spatially modulated potential. If the metric $L_n$ decreases, the RL agent is rewarded according to $R_n$.
  • Figure 3: Schematic of the Q-learning algorithm.
  • Figure 4: a) training performance of the RL algorithm depending on parameters $q$ and $T$, b) Lyapunov exponents of the time modulation provided by RL for the 1D case with $c=0.5$, $d=0.0$, c) reward (eq. \ref{R}) dependence on simulation time for different equation parameters, d) metric (eq. \ref{L}) dependence on simulation time for the same equation parameters, e) snapshot of the 2D simulation with $c=0.7$, $d=0.01$, f) the same snapshot with $c=0.5$, $d=0.01$, g) example of a failed MI suppression attempt by the trained model ($c=0.6$, $d=0.2$) in 1D simulation, h) example of a partially successful MI suppression attempt ($c=0.6$, $d=0.2$).
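
The control loop described in the captions for Figures 2 and 3 (the agent picks a modulation amplitude and is rewarded when the metric $L_n$ decreases) can be illustrated with a generic tabular Q-learning update. This is only a minimal sketch under stated assumptions, not the authors' implementation: this page does not say whether the paper uses tabular or deep Q-learning, the state and action discretization is hypothetical, and `reward_from_metric` is a placeholder that stands in for the paper's $R_n$, whose actual form is not given here.

```python
import random


def make_q_table(n_states, n_actions):
    """Zero-initialized Q-table; the state/action discretization is hypothetical."""
    return [[0.0] * n_actions for _ in range(n_states)]


def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard Q-learning step: move Q[s][a] toward r + gamma * max_a' Q[s_next][a']."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def epsilon_greedy(Q, s, epsilon, rng):
    """Explore with probability epsilon, otherwise take the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(Q[s]))
    return max(range(len(Q[s])), key=lambda a: Q[s][a])


def reward_from_metric(L_prev, L_curr):
    """Placeholder reward (an assumption, not the paper's R_n):
    +1 if the instability metric decreased, -1 otherwise."""
    return 1.0 if L_curr < L_prev else -1.0
```

In such a loop, each action would correspond to one candidate amplitude of the potential modulation, and the metric would be recomputed from the simulated field after every control step.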