Active Learning of Discrete-Time Dynamics for Uncertainty-Aware Model Predictive Control

Alessandro Saviolo, Jonathan Frey, Abhishek Rathod, Moritz Diehl, Giuseppe Loianno

TL;DR

The paper addresses modeling and control of nonlinear robotic systems under varying operating conditions by learning discrete-time neural dynamics offline and refining them online. It introduces an uncertainty-aware MPC whose cost is conditioned on the aleatoric uncertainty of the learned dynamics, estimated via an Unscented Transform, and it updates only the last layer of the network online to keep adaptation stable and efficient. Key contributions include a practical discrete-time NN dynamics model, an online last-layer adaptation mechanism, UT-based uncertainty estimation on manifolds, and an MPC framework that uses this uncertainty to improve learning convergence and control performance. The approach is demonstrated on a quadrotor under payload, mixed-propeller, and wind disturbances. The results show improved predictive accuracy and robustness over state-of-the-art continuous-time baselines and conventional adaptive controllers, and the paper discusses the computational trade-offs for real-time embedded deployment.
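
To make the Unscented Transform mentioned above concrete, the following is a minimal sketch of the standard transform in plain Euclidean space. It is not the paper's manifold-aware formulation; the function name `unscented_transform`, the `kappa` scaling parameter, and the example map are assumptions chosen for illustration.

```python
import numpy as np

def unscented_transform(f, mean, cov, kappa=1.0):
    """Propagate a Gaussian (mean, cov) through f using 2n+1 sigma points.

    Illustrative Euclidean version only; states that live on manifolds
    (e.g. rotations) need extra care that is not shown here.
    """
    n = mean.shape[0]
    # Lower-triangular factor L with L @ L.T == (n + kappa) * cov.
    L = np.linalg.cholesky((n + kappa) * cov)
    # Sigma points: the mean, then the mean plus/minus each column of L.
    sigma = np.vstack([mean, mean + L.T, mean - L.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    # Push every sigma point through the (possibly nonlinear) map.
    ys = np.array([f(s) for s in sigma])
    y_mean = weights @ ys
    diff = ys - y_mean
    y_cov = (weights[:, None] * diff).T @ diff
    return y_mean, y_cov

# Example: a linear map, for which the transform is exact.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
b = np.array([0.5, -0.2])
m = np.array([1.0, 2.0])
P = np.array([[0.2, 0.05], [0.05, 0.1]])
y_mean, y_cov = unscented_transform(lambda x: A @ x + b, m, P)
```

For a linear map the propagated mean and covariance equal `A @ m + b` and `A @ P @ A.T`, which makes a convenient sanity check before substituting a learned dynamics model for `f`.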

Abstract

Model-based control requires an accurate model of the system dynamics for precisely and safely controlling the robot in complex and dynamic environments. Moreover, in the presence of variations in the operating conditions, the model should be continuously refined to compensate for dynamics changes. In this paper, we present a self-supervised learning approach that actively models the dynamics of nonlinear robotic systems. We combine offline learning from past experience and online learning from current robot interaction with the unknown environment. These two ingredients enable a highly sample-efficient and adaptive learning process, capable of accurately inferring model dynamics in real-time even in operating regimes that greatly differ from the training distribution. Moreover, we design an uncertainty-aware model predictive controller that is heuristically conditioned to the aleatoric (data) uncertainty of the learned dynamics. This controller actively chooses the optimal control actions that (i) optimize the control performance and (ii) improve the efficiency of online learning sample collection. We demonstrate the effectiveness of our method through a series of challenging real-world experiments using a quadrotor system. Our approach showcases high resilience and generalization capabilities by consistently adapting to unseen flight conditions, while it significantly outperforms classical and adaptive control baselines.

Paper Structure

This paper contains 21 sections, 20 equations, 10 figures, and 6 tables.

Figures (10)

  • Figure 1: Online learning of the system dynamics. At each control iteration, the MPC uses the neural dynamics to compute the next control action to apply to the system, and the same dynamics are forward-simulated to predict the next state. After the controller's action is applied, the state estimation algorithm provides the actual state reached. Finally, the weights of the neural dynamics are updated using the mismatch between the forward-simulated state and the actual state reached by the robot.
  • Figure 2: Quadrotor system configurations used in this work. Payload (b) extends Default (a) with a cable-suspended payload that introduces a static model mismatch due to the mass increase and stochastic effects due to the unknown payload swinging motion. Mixed Propellers (c) critically changes Default (a) by using $4$ different propellers with various blade compositions.
  • Figure 3: Testing trajectories considered in this work.
  • Figure 4: Benefits of combining online learning with uncertainty estimation when continuously tracking WarpedEllipse (left) and Lemniscate (right) trajectories in the real world. The RMSE is accumulated over multiple loops to validate the increased stability introduced by the uncertainty estimation and online optimization. (Inset) Average RMSE over the entire flight performance.
  • Figure 5: Role of dynamics adaptation in multiple flight operating conditions. (Inset) Average RMSE.
  • ...and 5 more figures
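
The online learning loop described in the Figure 1 caption, combined with the last-layer-only update mentioned in the TL;DR, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: `features` is a hypothetical stand-in for the frozen hidden layers, the update is a single plain gradient step on the squared prediction error, and the learning rate `lr` is arbitrary.

```python
import numpy as np

def features(x, u):
    # Hypothetical stand-in for the frozen (offline-trained) hidden layers.
    return np.tanh(np.concatenate([x, u, [1.0]]))

def predict(W, x, u):
    # Discrete-time prediction of the next state from the last layer W.
    return W @ features(x, u)

def update_last_layer(W, x, u, x_next_actual, lr=0.1):
    # One gradient step on 0.5 * ||W @ phi - x_next_actual||^2 w.r.t. W only.
    phi = features(x, u)
    error = W @ phi - x_next_actual
    return W - lr * np.outer(error, phi)

# Toy usage: repeat the predict / observe / update cycle on one sample.
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))          # last layer: 2 states, 4 features
x = np.array([0.3, -0.1])            # current state
u = np.array([0.5])                  # applied control action
x_next_actual = np.array([0.35, -0.05])  # state reported by the estimator

err_before = np.linalg.norm(predict(W, x, u) - x_next_actual)
for _ in range(200):
    W = update_last_layer(W, x, u, x_next_actual)
err_after = np.linalg.norm(predict(W, x, u) - x_next_actual)
```

Because only the last layer is adapted, the prediction is linear in the trainable weights, so each step shrinks the error on the current sample whenever `lr * ||phi||^2 < 2`. This is one way to see why restricting online updates to the last layer keeps adaptation cheap and well-behaved.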