Reservoir Predictive Path Integral Control for Unknown Nonlinear Dynamics
Daisuke Inoue, Tadayoshi Matsumori, Gouhei Tanaka, Yuji Ito
TL;DR
This work addresses fast online control of unknown nonlinear dynamics by integrating Echo State Networks (ESNs) with Model Predictive Path Integral (MPPI) control to form Reservoir Predictive Path Integral Control (RPPI). It further adds an uncertainty-aware extension (URPPI) that samples perturbed ESN output weights to minimize the expected cost under model uncertainty, enabling robust stochastic control without linearization. The approach is validated on a Duffing oscillator and a four-tank system, showing that URPPI can reduce control costs by up to about 60% compared with traditional quadratic programming-based MPC and outperforms MPPI, with a small increase in computation time. The contributions offer a practical framework for online learning and control of unknown nonlinear systems, with potential impact in robotics, process control, and aerospace applications.
Abstract
Neural networks have found extensive application in data-driven control of nonlinear dynamical systems, yet fast online identification and control of unknown dynamics remain central challenges. To meet these challenges, this paper integrates echo-state networks (ESNs), which are reservoir computing models implemented with recurrent neural networks, with model predictive path integral (MPPI) control, a sampling-based variant of model predictive control. The proposed reservoir predictive path integral (RPPI) control enables fast learning of nonlinear dynamics with ESNs and exploits the learned nonlinearities directly in the MPPI control computation without linearization approximations. This framework is further extended to uncertainty-aware RPPI (URPPI), which achieves robust stochastic control by treating ESN output weights as random variables and minimizing an expected cost over their distribution to account for identification errors. Experiments on controlling a Duffing oscillator and a four-tank system demonstrate that URPPI improves control performance, reducing control costs by up to 60% compared to traditional quadratic programming-based model predictive control methods.
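The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a tanh reservoir, a ridge-regression readout, a scalar control input, isotropic Gaussian perturbations of the output weights, and a simplified MPPI weighting that omits the usual control-cost correction term. All function names and parameters (`make_esn`, `fit_readout`, `urppi_step`, `sigma_u`, `sigma_w`, `lam`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_esn(n_in, n_res, rng, spectral_radius=0.9):
    """Random, fixed input and recurrent weights of the reservoir."""
    W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    W = rng.normal(size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def esn_step(r, z, W_in, W):
    """One reservoir update; z stacks the current state and control."""
    return np.tanh(W @ r + W_in @ z)

def fit_readout(R, Y, ridge=1e-6):
    """Ridge regression for the output weights (only these are trained).
    R: (T, n_res) reservoir states, Y: (T, n_out) targets."""
    A = R.T @ R + ridge * np.eye(R.shape[1])
    return np.linalg.solve(A, R.T @ Y).T  # shape (n_out, n_res)

def rollout_cost(u_seq, r0, x0, W_in, W, W_out, cost_fn):
    """Accumulated cost of one control sequence under one readout."""
    r, x, total = r0.copy(), x0.copy(), 0.0
    for u in u_seq:
        r = esn_step(r, np.append(x, u), W_in, W)
        x = W_out @ r  # predicted next state, nonlinear in (x, u)
        total += cost_fn(x, u)
    return total

def urppi_step(u_nom, r0, x0, W_in, W, W_out, cost_fn, rng,
               n_samples=64, n_weights=8, sigma_u=0.5, sigma_w=0.0, lam=1.0):
    """One sampling-based update of the nominal control sequence.
    With sigma_w = 0 every readout sample equals W_out (RPPI-style);
    with sigma_w > 0 the cost is averaged over perturbed readouts,
    a Monte Carlo estimate of the expected cost (URPPI-style, assumed)."""
    horizon = len(u_nom)
    noise = rng.normal(scale=sigma_u, size=(n_samples, horizon))
    # Assumption: isotropic Gaussian perturbation, shared by all rollouts.
    readouts = [W_out + sigma_w * rng.normal(size=W_out.shape)
                for _ in range(n_weights)]
    costs = np.array([
        np.mean([rollout_cost(u_nom + noise[k], r0, x0, W_in, W, Wk, cost_fn)
                 for Wk in readouts])
        for k in range(n_samples)
    ])
    # Path-integral weights; subtracting the minimum avoids underflow.
    w = np.exp(-(costs - costs.min()) / lam)
    w /= w.sum()
    return u_nom + w @ noise, w
```

In a receding-horizon loop one would apply the first element of the returned sequence, shift the sequence by one step, and refit the readout as new data arrive. How the paper actually parameterizes the weight distribution and the cost is not specified in the abstract, so those choices here are placeholders.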
