Safe Imitation Learning of Nonlinear Model Predictive Control for Flexible Robots

Shamil Mamedov, Rudolf Reiter, Seyed Mahdi Basiri Azad, Ruan Viljoen, Joschka Boedecker, Moritz Diehl, Jan Swevers

TL;DR

This work targets real-time, safe control of flexible robots, whose nonlinear, high-dimensional dynamics make traditional NMPC too slow for many real-time applications. It introduces a framework that learns an NMPC policy through imitation learning (DAgger) and enforces safety with a predictive safety filter, achieving substantial speedups over NMPC while maintaining safety. Simulation results on a three-dimensional flexible robot arm show a more than eightfold reduction in action computation time and better performance than a strong RL baseline (SAC), with robustness to model-plant mismatch. The approach has practical implications for industrial adoption of flexible robots and suggests directions for extending to trajectory tracking and soft-robot applications.

Abstract

Flexible robots may overcome some of the industry's major challenges, such as enabling intrinsically safe human-robot collaboration and achieving a higher payload-to-mass ratio. However, controlling flexible robots is complicated due to their complex dynamics, which include oscillatory behavior and a high-dimensional state space. Nonlinear model predictive control (NMPC) offers an effective means to control such robots, but its significant computational demand often limits its application in real-time scenarios. To enable fast control of flexible robots, we propose a framework for a safe approximation of NMPC using imitation learning and a predictive safety filter. Our framework significantly reduces computation time while incurring a slight loss in performance. Compared to NMPC, our framework shows more than an eightfold improvement in computation time when controlling a three-dimensional flexible robot arm in simulation, all while guaranteeing safety constraints. Notably, our approach outperforms state-of-the-art reinforcement learning methods. The development of fast and safe approximate NMPC holds the potential to accelerate the adoption of flexible robots in industry. The project code is available at: tinyurl.com/anmpc4fr
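
The two ingredients named above, imitation learning of an expert controller with DAgger and a safety filter applied to the learned action, can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a scalar integrator as the plant, a saturated proportional law as a stand-in for the NMPC expert, a one-parameter linear policy instead of a neural network, and a one-step projection instead of a full predictive safety filter. All names and constants (`DT`, `U_MAX`, `X_WALL`, `expert`, `safety_filter`, `dagger`, `rollout`) are hypothetical.

```python
import random

# Hypothetical toy parameters: time step, actuator limit, wall position.
DT, U_MAX, X_WALL = 0.1, 1.0, 1.0


def clip(v, lo, hi):
    return max(lo, min(hi, v))


def expert(x, x_ref):
    """Stand-in for the NMPC expert: a saturated proportional law."""
    return clip(2.0 * (x_ref - x), -U_MAX, U_MAX)


def step(x, u):
    """Toy plant: scalar integrator x_next = x + DT * u."""
    return x + DT * u


def fit(data):
    """Least-squares fit of u ~ w * e through the origin, with e = x_ref - x."""
    num = sum(e * u for e, u in data)
    den = sum(e * e for e, _ in data)
    return num / den if den > 0 else 0.0


def safety_filter(x, u):
    """Closest action to u that respects the actuator limit and keeps the
    next state at or below the wall (one-step projection, an assumption)."""
    u_hi = min(U_MAX, (X_WALL - x) / DT)
    return clip(u, -U_MAX, u_hi)


def dagger(n_iters=5, n_steps=30, seed=0):
    """DAgger loop: roll out the current policy, label every visited state
    with the expert action, aggregate the data, and refit."""
    rng = random.Random(seed)
    data, w = [], 0.0
    for i in range(n_iters):
        use_expert = i == 0  # the expert drives only the first rollout
        x, x_ref = rng.uniform(-1.0, 0.5), rng.uniform(-1.0, 1.5)
        for _ in range(n_steps):
            e = x_ref - x
            u_exp = expert(x, x_ref)
            data.append((e, u_exp))  # expert label for the visited state
            x = step(x, u_exp if use_expert else w * e)
        w = fit(data)
    return w


def rollout(w, x0, x_ref, n_steps=100, filtered=True):
    """Deploy the learned policy, optionally passing actions through the filter."""
    x, xs = x0, [x0]
    for _ in range(n_steps):
        u = w * (x_ref - x)
        x = step(x, safety_filter(x, u) if filtered else u)
        xs.append(x)
    return xs


if __name__ == "__main__":
    w = dagger()
    safe = rollout(w, 0.0, 1.5)                   # goal lies behind the wall
    unsafe = rollout(w, 0.0, 1.5, filtered=False)
    print("learned gain:", w)
    print("max position with filter:", max(safe))
    print("max position without filter:", max(unsafe))
```

In this toy setting the filter changes the learned action only when that action would violate a constraint, which mirrors the role the abstract assigns to the safety filter: the fast learned policy does the work, and the filter intervenes only to keep the system within its constraints.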

Paper Structure (20 sections, 9 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: An illustration of the problem addressed in this paper. The controller has to move the flexible robot's end-effector $p_\mathrm{ee}$ to the goal position $p_\mathrm{ee}^\mathrm{ref}$ while guaranteeing safety in terms of preventing the robot from colliding with the wall and ground obstacles.
  • Figure 2: Proposed framework for safely approximating NMPC policy by combining imitation learning of NMPC and a safety filter.
  • Figure 3: Illustration of two different levels of spatial discretization: one-segment (middle) and three-segment (right) discretizations.
  • Figure 4: The distribution of final distance-to-goal ($\|z(T) - z^\mathrm{ref} \|_2$ with $T$ being the episode duration) of the considered algorithms on 100 test tasks with and without safety filter.
  • Figure 5: Performance of the considered controllers with and without model-plant mismatch. This mismatch is implemented by reducing Young's modulus of the simulation model by 10%. The marker colors are defined by the inference time (policy evaluation time).
  • ...and 1 more figure