Reinforcement learning-based adaptive time-integration for nonsmooth dynamics

David Michael Riley; Alexandros Stathas; Diego Gutiérrez-Oribio; Ioannis Stefanou

TL;DR

The paper addresses the computational cost of integrating nonsmooth dynamical systems by learning adaptive time-stepping policies with reinforcement learning (RL). It uses Truncated Quantile Critics (TQC), an RL algorithm for continuous actions, to select time steps; constraints are enforced through a variational-inequality framework, and Bathe time-stepping is used to preserve stability. Three case studies (sliding-mode control, a Chua-like electrical circuit, and a frictional seismic fault) demonstrate substantial speedups, up to an order of magnitude in some scenarios, while maintaining acceptable accuracy and generalizing across discretizations; transfer learning improves performance further. The work establishes a general, data-driven alternative to heuristic and PI-based time stepping for complex nonsmooth systems, enabling faster parametric studies and potentially real-time simulations.
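For context, the PI-based time stepping that the RL policy is compared against can be sketched as follows. This is a minimal illustration, not the paper's implementation: it uses the textbook PI step-size law $h_{n+1} = h_n\,(\mathrm{tol}/e_n)^{k_I}\,(e_{n-1}/e_n)^{k_P}$ with an embedded Euler/Heun error estimate on a smooth scalar test problem. The function names (`pi_step_size`, `integrate`), the gains `k_i` and `k_p`, and the growth limits are all assumptions chosen for the example.

```python
import math


def pi_step_size(h, err, err_prev, tol, k_i=0.3, k_p=0.4,
                 h_min=1e-8, h_max=1.0):
    """Propose the next step size from the current and previous error estimates.

    Textbook PI law; the gains and clamps are illustrative assumptions.
    """
    err = max(err, 1e-16)            # guard against division by zero
    err_prev = max(err_prev, 1e-16)
    factor = (tol / err) ** k_i * (err_prev / err) ** k_p
    factor = min(5.0, max(0.2, factor))   # limit growth and shrinkage per step
    return min(h_max, max(h_min, h * factor))


def integrate(f, y0, t_end, tol=1e-4, h0=1e-2):
    """Integrate y' = f(t, y) from t = 0 with an embedded Euler/Heun pair."""
    t, y, h = 0.0, y0, h0
    err_prev = tol
    steps = 0
    while t < t_end:
        h = min(h, t_end - t)                 # do not step past t_end
        k1 = f(t, y)
        k2 = f(t + h, y + h * k1)
        y_low = y + h * k1                    # explicit Euler (order 1)
        y_high = y + 0.5 * h * (k1 + k2)      # Heun (order 2)
        err = abs(y_high - y_low)             # local error estimate
        if err <= tol:
            # Accept the step, then let the PI law propose the next size.
            t, y = t + h, y_high
            steps += 1
            h = pi_step_size(h, err, err_prev, tol)
            err_prev = err
        else:
            # Reject: passing err as err_prev disables the P term,
            # so the step shrinks by the integral term alone.
            h = pi_step_size(h, err, err, tol)
    return y, steps


if __name__ == "__main__":
    y_end, n_steps = integrate(lambda t, y: -y, 1.0, 1.0)
    print(f"y(1) = {y_end:.6f}, exact = {math.exp(-1.0):.6f}, steps = {n_steps}")
```

A learned policy would take the place of `pi_step_size`: rather than applying a fixed formula to error estimates, it maps an observation of the solver state to the next time step.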

Abstract

Numerical time integration is fundamental to the simulation of initial and boundary value problems. Traditionally, time integration schemes require adaptive time-stepping to ensure computational speed and sufficient accuracy. Although these methods are based on mathematical derivations related to the order of accuracy for the chosen integrator, they also rely on heuristic development to determine optimal time steps. In this work, we use an alternative approach based on Reinforcement Learning (RL) to select the optimal time step for any time integrator method, balancing computational speed and accuracy. To explore the potential of our RL-based adaptive time-stepping approach, we choose a challenging model problem involving set-valued frictional instabilities at various spatiotemporal scales. This problem demonstrates the robustness of our strategy in handling nonsmooth problems, which present a demanding scenario for numerical integration. Specifically, we apply RL to the simulation of a seismic fault with Coulomb friction. Our findings indicate that RL can learn an optimal strategy for time integration, achieving up to a fourfold speed-up. Our RL-based adaptive integrator offers a new approach for time integration in various other problems in mechanics.

Paper Structure (26 sections, 24 equations, 10 figures, 5 tables)

Introduction
Problem description
Numerical constraint enforcement and integration
Heuristic-based Adaptive and PI time-stepping
Reinforcement Learning
Algorithm
Architecture
Reward function
Observational space
General training procedure for TQC
RL adaptive time-stepping performance
Case Study 1: First-order sliding mode controller
RL training for Sliding mode controller
Result
Case Study 2: Electrical circuit
...and 11 more sections

Figures (10)

Figure 1: Illustration of the logic applied within the Reinforcement Learning algorithm framework.
Figure 2: (a) State $w$ against time and (b) controller input $F_r$ against time for a scenario that the TQC network was not trained on.
Figure 3: (a-c) States $V_1$, $V_2$, $i$, and $F_r$ against time for constant, PI, heuristic, and RL-based integrators (TQC and TD3). Note that this evaluation is a scenario that the TQC network was not trained on.
Figure 4: Illustration of a strike-slip fault of dimensions $L_x$ and $L_z$ with a far-field velocity $v_\infty$.
Figure 5: Ratio of the runtime of the heuristic-based method to the runtime of the RL-based method for (a) an $L_x=L_z=3$ km fault and (b) an $L_x=L_z=5$ km fault. The x-axis shows the speed-accuracy trade-off parameter $\alpha$ (see the reward function, Eq. \ref{eq: reward}), and the y-axis shows the fault discretization ($N_x$ by $N_z$). White boxes labeled "N/A" mark scenarios in which the RL-based integrator did not converge.
...and 5 more figures
