Self-Supervised Meta-Learning for All-Layer DNN-Based Adaptive Control with Stability Guarantees

Guanqi He, Yogita Choudhary, Guanya Shi

TL;DR

A novel learning-based adaptive control framework that pretrains a DNN for adaptation via self-supervised meta-learning (SSML) from offline trajectories and online adapts the full DNN via composite adaptation, which significantly outperforms various classic and learning-based adaptive control baselines.

Abstract

A critical goal of adaptive control is enabling robots to rapidly adapt in dynamic environments. Recent studies have developed a meta-learning-based adaptive control scheme, which uses meta-learning to extract nonlinear features, represented by Deep Neural Networks (DNNs), from offline data, and uses adaptive control to update linear coefficients online. However, such a scheme is fundamentally limited by the linear parameterization of uncertainties and does not fully exploit the capability of DNNs. This paper introduces a novel learning-based adaptive control framework that pretrains a DNN via self-supervised meta-learning (SSML) from offline trajectories and adapts the full DNN online via composite adaptation. In particular, the offline SSML stage leverages the time consistency in trajectory data to train the DNN to predict future disturbances from history, in a self-supervised manner without environment condition labels. The online stage designs a control law and an adaptation law that update the full DNN with stability guarantees. Empirically, the proposed framework outperforms various classic and learning-based adaptive control baselines by 19-39% in challenging real-world quadrotor tracking problems under large dynamic wind disturbances.

Paper Structure

This paper contains 19 sections, 2 theorems, 24 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Suppose the network $f$ and disturbance $d$ satisfy Assumptions \ref{assu:bounded} and \ref{assu:dist}. The states $s$, $q$, $\dot q$ of the closed-loop system \eqref{eq:simplified-dynamics} are bounded. Furthermore, there exists a constant $\bar{J}$ as the uniform upper bound of the Jacobian matrix $\|J\|$ for the net

Figures (3)

  • Figure 2: Self-supervised meta-learning (SSML) alternates between the adaptation stage and the training stage, pretraining the DNN initial parameter $\theta_0$ for rapid online adaptation.
  • Figure 3: Experimental setup for quadrotor trajectory tracking under large dynamic wind conditions.
  • Figure 4: Disturbance prediction and tracking performance of each controller. The PID controller struggles with handling unknown wind dynamics, while the INDI controller is affected by acceleration measurement noise and delays in disturbance prediction. The Vanilla-NN model only learns the average disturbance, showing limited flexibility in parameter adaptation. Both SSML-AC and SSML-AC-LL perform well in trajectory tracking; however, SSML-AC achieves more accurate disturbance prediction and lower tracking error (Table \ref{table:comparison}), highlighting the benefits of full-network adaptation.
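
The alternation described in the Figure 2 caption (adapt on a history window, then train the initial parameter $\theta_0$ on the following window) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the scalar two-layer network, the synthetic disturbance `c * sin(x)` with a hidden per-trajectory condition `c`, the hyperparameters, and the first-order meta-gradient, which stands in for whatever update the authors actually use. The function names (`adapt`, `ssml_pretrain`, and so on) are hypothetical.

```python
import math
import random

def loss_and_grad(theta, samples):
    """Mean squared prediction error over (x, d) samples and its gradient
    with respect to ALL parameters of f(x) = w2 * tanh(w1 * x + b1) + b2."""
    w1, b1, w2, b2 = theta
    loss, grad = 0.0, [0.0, 0.0, 0.0, 0.0]
    for x, d in samples:
        h = math.tanh(w1 * x + b1)
        err = w2 * h + b2 - d          # prediction error on the disturbance
        dh = w2 * (1.0 - h * h)        # chain rule through tanh
        loss += 0.5 * err * err
        for i, g in enumerate((err * dh * x, err * dh, err * h, err)):
            grad[i] += g
    n = len(samples)
    return loss / n, [g / n for g in grad]

def adapt(theta0, history, steps=3, lr=0.05):
    """Adaptation stage: a few full-network gradient steps on the history window."""
    theta = list(theta0)
    for _ in range(steps):
        _, g = loss_and_grad(theta, history)
        theta = [p - lr * gi for p, gi in zip(theta, g)]
    return theta

def sample_trajectory(rng, n=20):
    """Toy trajectory: the condition c is hidden from the learner (no labels)."""
    c = rng.uniform(0.5, 2.5)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(2 * n)]
    points = [(x, c * math.sin(x)) for x in xs]
    return points[:n], points[n:]      # (history window, future window)

def meta_objective(theta0, trajectories):
    """Average future-window loss after adapting on each history window."""
    total = 0.0
    for history, future in trajectories:
        total += loss_and_grad(adapt(theta0, history), future)[0]
    return total / len(trajectories)

def ssml_pretrain(theta0, rng, iterations=500, batch=4, meta_lr=0.1):
    """Training stage: move theta0 along the future-window gradient evaluated
    at the adapted parameters (a first-order approximation, assumed here)."""
    theta0 = list(theta0)
    for _ in range(iterations):
        meta_grad = [0.0, 0.0, 0.0, 0.0]
        for _ in range(batch):
            history, future = sample_trajectory(rng)
            adapted = adapt(theta0, history)
            _, g = loss_and_grad(adapted, future)
            meta_grad = [m + gi / batch for m, gi in zip(meta_grad, g)]
        theta0 = [p - meta_lr * m for p, m in zip(theta0, meta_grad)]
    return theta0

if __name__ == "__main__":
    rng = random.Random(0)
    held_out = [sample_trajectory(rng) for _ in range(50)]
    init = [0.5, 0.0, 0.5, 0.0]
    print("post-adaptation loss, initial theta0:   ", meta_objective(init, held_out))
    print("post-adaptation loss, pretrained theta0:", meta_objective(ssml_pretrain(init, rng), held_out))
```

The supervision signal comes only from the trajectory itself: the future window plays the role of the label for the parameters adapted on the history window, so no environment condition is ever provided to the learner. The sketch covers the offline stage only; it does not model the online control law or the stability analysis.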

Theorems & Definitions (4)

  • Theorem 1
  • proof
  • Theorem 2
  • proof