Learning Based NMPC Adaptation for Autonomous Driving using Parallelized Digital Twin

Jean Pierre Allamaa, Panagiotis Patrinos, Herman Van der Auweraer, Tong Duy Son

TL;DR

This work tackles the Sim2Real transfer problem for autonomous driving by introducing a data-efficient online adaptation framework that calibrates a parametrizable NMPC using executable digital twins (xDTs). It combines domain randomization with a derivative-free optimization loop driven by an adaptive UKF-SPSA fusion, so that NMPC parameters are explored in parallel on the xDTs while real-world data is used for the final refinement. The approach includes safety checks and an adaptive covariance to handle noise and the Sim2Real gap. It is validated in simulation and in real-road experiments, where it achieves a 75% improvement in tracking performance and reduces the Sim2Real gap over the target performance from a factor of 876 to 1.033, with nine NMPC parameters tuned in under 10 minutes. The results point to rapid, safe, and data-efficient end-of-line tuning that replaces hours of manual calibration and improves generalization across paths and conditions.

Abstract

In this work, we focus on the challenge of transferring an autonomous driving controller from simulation to the real world (i.e., Sim2Real). We propose a data-efficient method for online, on-the-fly adaptation of parametrizable control architectures such that the target closed-loop performance is optimized while accounting for uncertainties such as model mismatches, changes in the environment, and task variations. The novelty of the approach lies in leveraging black-box optimization enabled by Executable Digital Twins (xDTs) for data-driven parameter calibration through derivative-free methods, so that the controller is adapted directly in real time. The xDTs are augmented with Domain Randomization for robustness and allow for safe parameter exploration. The proposed method requires minimal interaction with the real world because it pushes the exploration towards the xDTs. We validate our approach through real-world experiments, demonstrating its effectiveness in transferring and fine-tuning an NMPC with 9 parameters in under 10 minutes. This eliminates the need for hours-long manual tuning and for lengthy machine learning training and data collection phases. Our results show that the online adapted NMPC directly compensates for the Sim2Real gap and avoids overtuning in simulation. Importantly, a 75% improvement in tracking performance is achieved, and the Sim2Real gap over the target performance is reduced from a factor of 876 to 1.033.
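
The derivative-free calibration described above can be illustrated with plain SPSA (simultaneous perturbation stochastic approximation), one ingredient named in the TL;DR. The sketch below is not the authors' adaptive UKF-SPSA fusion. It is a minimal, generic SPSA loop under stated assumptions: the function names, the gain constants `a`, `c`, and `A`, and the toy quadratic `tracking_cost` (a stand-in for a closed-loop performance value returned by an xDT rollout) are all hypothetical. The exponents 0.602 and 0.101 are the commonly recommended SPSA gain-decay values.

```python
import random


def spsa_minimize(cost, theta0, iterations=200, a=0.2, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimize a black-box cost with plain SPSA (two cost calls per iteration)."""
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        # Perturb every parameter at once with a random +/-1 direction.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = cost(plus) - cost(minus)
        # Gradient estimate per coordinate: diff / (2 * c_k * delta_i).
        theta = [t - a_k * diff / (2.0 * c_k * d)
                 for t, d in zip(theta, delta)]
    return theta


# Hypothetical stand-in for a closed-loop cost evaluated on a digital twin:
# a quadratic bowl whose minimum plays the role of the "well-tuned" parameters.
TARGET = [1.0, -2.0, 0.5]


def tracking_cost(params):
    return sum((p - t) ** 2 for p, t in zip(params, TARGET))


initial = [0.0, 0.0, 0.0]
tuned = spsa_minimize(tracking_cost, initial)
print("initial cost:", tracking_cost(initial))
print("tuned cost:  ", tracking_cost(tuned))
```

Each iteration needs only two cost evaluations regardless of the number of parameters, which is why perturbation-based schemes of this kind suit settings where each evaluation is a full closed-loop rollout. In the paper's setting the two perturbed evaluations would plausibly be run on xDTs rather than on the real vehicle, but that mapping is an assumption of this sketch.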

Paper Structure (22 sections, 12 equations, 10 figures, 3 tables, 1 algorithm)

Figures (10)

  • Figure 1: Vehicle-in-the-loop performance. Left: a control policy overtrained in simulation with only the nominal identified vehicle model, to achieve an accuracy better than 15 cm, fails to transfer to the real world and results in an unsafe driving style. Right: enhancing the Sim2Real transfer through data-driven controller adaptation based on domain randomization, domain adaptation, and high-fidelity simulation, with fine-tuning from real-world data.
  • Figure 2: ${\textrm{D}^2\textrm{C}^2\textrm{-AUKS}}$: on-the-go parametrizable controller calibration that offloads the burden of parameter search and sensitivity analysis towards multiple Digital Twins, and efficiently closes the Sim2Real gap through real-world data flow
  • Figure 3: Adaptation framework: optimize the target domain (real-world) closed-loop performance $V$ safely and iteratively by sampling the different xDTs running in parallel with each other and with the real car. Framework combining Domain Randomization, Adaptation and High-fidelity simulation, and driven by real-world performance data.
  • Figure 4: MiL training on dynamic path: Top: Iteration 0: the ego vehicle follows the path with sub-optimal performance. The xDTs have a high variance in their performance; Bottom: Iteration 1: after one single training iteration, the ego performance has improved and the xDTs are overlapped
  • Figure 5: Adaptation process: evolution of path tracking and NMPC cost RMS on path 1, alongside the path tracking parameter $Q_w$. The learning rate on cost improvement is large, and the performance is enhanced within a few iterations
  • ...and 5 more figures

Theorems & Definitions (4)

  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4