MetaTune: Adjoint-based Meta-tuning via Robotic Differentiable Dynamics

Xiexin Peng, Bingheng Wang, Tao Zhang, Ying Zheng

Abstract

Disturbance observer-based control has shown promise in robustifying robotic systems against uncertainties. However, tuning such systems remains challenging due to the strong coupling between controller gains and observer parameters. In this work, we propose MetaTune, a unified framework for joint auto-tuning of feedback controllers and disturbance observers through differentiable closed-loop meta-learning. MetaTune integrates a portable neural policy with physics-informed gradients derived from differentiable system dynamics, enabling adaptive gains across tasks and operating conditions. We develop an adjoint method that efficiently computes the meta-gradients with respect to adaptive gains backward in time to directly minimize the cost-to-go. Compared to existing forward methods, our approach reduces the computational complexity to linear in the data horizon. Experimental results on quadrotor control show that MetaTune achieves consistent improvements over state-of-the-art differentiable tuning methods while reducing gradient computation time by more than 50 percent. In high-fidelity PX4-Gazebo hardware-in-the-loop simulation, the learned adaptive policy yields a 15-20 percent average tracking error reduction at aggressive flight speeds and up to 40 percent improvement under strong disturbances, while demonstrating zero-shot sim-to-sim transfer without fine-tuning.

Paper Structure

This paper contains 21 sections, 15 equations, 6 figures, 2 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Adjoint-based meta-learning framework for end-to-end tuning of controller--observer parameters. Gradients of a task-level loss are propagated backward through the closed-loop system dynamics via adjoint sensitivity analysis and subsequently transformed into meta-policy gradients using the chain rule, enabling automatic gain adaptation.
  • Figure 2: Computational efficiency comparison between forward and backward sensitivity analysis. (Top) The forward method requires a separate forward propagation rollout for each parameter perturbation to capture its effect on the final loss $L$, leading to expensive computations. (Bottom) The proposed backward method computes the cost-to-go function in a single backward sweep. Gradients for all parameters along the horizon are retrieved simultaneously, significantly reducing computational complexity from $O(N^2)$ to $O(N)$.
  • Figure 3: Trajectory evolution at representative training stages (Trials 50, 100, 150, and 200). (a) The DiffTune baseline exhibits persistent steady-state error, indicating limited adaptability when optimization is restricted to controller gains. (b) Adjoint substantially improves tracking performance, bringing the trajectory closer to the reference. (c) The proposed MetaTune method achieves the tightest tracking accuracy, suggesting that network-based parameterization can effectively exploit joint gradient information for precise control.
  • Figure 4: RMSE over training iterations.
  • Figure 5: Disturbance estimation and position tracking performance during hovering. (a) Estimated disturbance forces in the horizontal (XY) and vertical (Z) directions. (b) Position tracking along the X, Y, and Z axes.
  • ...and 1 more figure
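The forward-versus-backward contrast described in the Figure 2 caption can be sketched on a toy scalar closed-loop system. The sketch below is illustrative only: it uses an assumed proportional-gain dynamics x_{k+1} = (1 - theta_k) x_k with a quadratic cost-to-go, not the paper's quadrotor dynamics, and all function names are hypothetical. The forward method perturbs each of the N per-step gains and re-rolls the horizon, costing O(N^2); the adjoint method recovers all N gradients in a single backward sweep, costing O(N).

```python
# Illustrative sketch (assumed toy system, not the paper's dynamics):
# forward perturbation sensitivity vs. a single backward adjoint sweep.

def rollout(x0, gains):
    """Closed-loop rollout x_{k+1} = (1 - theta_k) * x_k; returns states x_0..x_N."""
    xs = [x0]
    for th in gains:
        xs.append((1.0 - th) * xs[-1])
    return xs

def loss(xs):
    """Cost-to-go: sum of squared tracking errors over the horizon."""
    return sum(x * x for x in xs[1:])

def forward_gradients(x0, gains, eps=1e-6):
    """Forward method: one perturbed rollout per gain -> O(N^2) work total."""
    base = loss(rollout(x0, gains))
    grads = []
    for k in range(len(gains)):
        pert = list(gains)
        pert[k] += eps
        grads.append((loss(rollout(x0, pert)) - base) / eps)
    return grads

def adjoint_gradients(x0, gains):
    """Backward method: one adjoint sweep recovers all gradients -> O(N) work."""
    xs = rollout(x0, gains)
    N = len(gains)
    grads = [0.0] * N
    lam = 2.0 * xs[N]                                  # lambda_N = dL/dx_N
    for k in range(N - 1, -1, -1):
        grads[k] = -xs[k] * lam                        # dL/dtheta_k via chain rule
        lam = 2.0 * xs[k] + (1.0 - gains[k]) * lam     # adjoint recursion for lambda_k
    return grads
```

Both routines return the same gradients (up to finite-difference error), but the adjoint version touches each time step once, which is the linear-in-horizon property the abstract claims for the backward sweep.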

Theorems & Definitions (1)

  • Remark 1